Hello,
I am writing an app which requests data from a REST API multiple times a second. At the moment I am testing in a loop which collects data, sleeps until the next tenth of a second and then continues collecting data.
Below is some slightly modified, abbreviated and cleaned up code which does what I want.
It works fine most of the time. I want to make the code robust against network failures.
I am not sure how to do that. To test, I run it on my laptop and disconnect my wifi. While the wifi is disconnected, the program produces no output. Getting no output from the request is to be expected, but I also get no errors. It just sits idle. I don't like that. I imagine there is some code built into HttpClient that also attempts to be robust. However, I do not know how or when it will fail.
If there is a network failure, I want to be able to resume as soon as the network is back up.
The app sometimes seems to just go comatose. Even after the network is back up, it does not resume, nor does it raise any kind of exception. That is why I wrapped this proc in a try/except and put in many echo statements. How do I debug this when disconnected? How do I make this robust, so that I can resume immediately after any sort of network failure once the network returns?
Any help and wisdom greatly appreciated.
Thanks.
import httpclient, times
from os import sleep

let deciSecondDuration = initDuration(milliseconds = 100)
var client = newHttpClient()

proc secondLoop() =
  try:
    let dt = getTime().utc  # start of this iteration
    let pathQuery = "my path and query string"
    echo("before secondLoop1: ", getTime().utc)
    let response = client.get(pathQuery)
    echo("in secondLoop2: ", getTime().utc)
    echo("print some of the result")
    let diff = getTime().utc - dt
    if diff < deciSecondDuration:
      # Sleep for the remainder of the 100 ms slot.
      sleep(int((deciSecondDuration - diff).inMilliseconds))
    echo("after secondLoop3: ", getTime().utc)
  except:
    echo("exception message: ", getCurrentExceptionMsg())
As far as I know, the get proc raises TimeoutError; if it does not, that should be a bug. If you can provide a reproducible case, it would help the team a lot.
Also, when a request is just hanging there, it is possible that the server is not returning a response because of a logic error or bug on its side.
I have a plain except:. It should catch everything. But nothing is being caught.
It isn't a server problem. I am testing what happens to the program if there are network problems, like when a network goes down briefly. In this instance, I am simply turning off my wifi while it is running.
I was hoping to find what Errors get raised that I could catch and simply have a loop which sleeps a little while, then tries again, until I have success.
I can work on writing a sample app. It shouldn't be difficult. But it can't be the one I am using at the moment because it is for a secured api which requires an account and credentials.
But again the key part is nothing is happening when I disconnect the network. It just goes comatose. No errors. The server side is fine. When my network is up, the app works flawlessly.
If the outage is short enough (2 minutes and under, from what I have briefly experimented with), it simply resumes. But my app has not been informed of any issues and I have missed ??? amount of data during the outage.
I am at a loss on how to make this fault tolerant and how to debug.
Thanks.
I see two interrelated problems: a) you work on too high a level for what you want, and b) you do not really have a basis to decide what's in the "healthy" (desired) range and what's not.
Maybe this makes my point clearer: there are functions with timeouts available in Nim, but they are probably not available on the level you use (http.get etc). I don't know for sure and must guess, because I don't do http stuff on a convenience level.
That said it seems to me that the level you use is fine -but- it's meant for easy quick and dirty "get something from a web page" and not for the kind of control you need.
Maybe (sorry, didn't think a lot about it) one helpful approach would be to fire off an async sleep right before your http get call and upon the sleeper being finished you could check whether the http get already delivered a result or not (and hence there probably is a network problem).
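The "sleeper next to the request" idea can be sketched with asyncdispatch's withTimeout, which races a future against a timer. This is only a minimal sketch: slowFetch and finishedInTime are hypothetical names, and slowFetch merely simulates a request with sleepAsync so that it runs without a network. In a real program the future would presumably come from an AsyncHttpClient call.

```nim
import asyncdispatch

# Hypothetical stand-in for an async HTTP GET: it waits delayMs and then
# returns a fixed string, so no network is needed to try this out.
proc slowFetch(delayMs: int): Future[string] {.async.} =
  await sleepAsync(delayMs)
  return "payload"

# Reports whether the simulated request finished within deadlineMs.
proc finishedInTime(delayMs, deadlineMs: int): Future[bool] {.async.} =
  let fut = slowFetch(delayMs)
  # withTimeout completes with true if fut finished first,
  # or with false if the timer of deadlineMs fired first.
  result = await withTimeout(fut, deadlineMs)

when isMainModule:
  echo "fast request in time: ", waitFor finishedInTime(10, 500)
  echo "slow request in time: ", waitFor finishedInTime(500, 50)
```

Note that withTimeout does not cancel the underlying future; it only tells you that the deadline passed, so the caller still has to decide what to do with the stalled request.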
You are probably right. I probably do need to drop a level.
Since your and @dom96's replies I have experimented some more.
I have tried catching TimeoutError, IOError and OSError. I have reduced the timeout in the HttpClient to 500 ms.
Sometimes it catches the errors. Sometimes none show up in my try/except.
In most of my experience, a higher-level library does not completely swallow such errors, and the calling code gets a chance to handle them. That is what I was expecting here.
SIGINT: Interrupted by Ctrl-C.
SIGPIPE: Pipe closed.
Some sort of IO Error
I have a few parallel implementations of this app in various languages to see which I like the best.
I want to thank everyone for their help. I think at this present time and the schedule I have to implement this app, that Nim is not the best match for me. This is not anything against Nim. Simply that Nim and I are not an ideal team at the moment.
Thanks.
SIGINT: Interrupted by Ctrl-C.
SIGPIPE: Pipe closed.
Some sort of IO Error
Can you please paste the full error stack trace that you got?
Hello,
My apologies, that terminal session is gone.
Here is what I am attempting. Here is some code that I would like to see run indefinitely.
I would think that if I turn off my network while it is running, at some point I would see an Error raised. I almost never do.
My naive understanding of what I wrote is that if any Errors worked their way up to my code, it could loop, keep checking the connection and, once the network is restored, continue getting Google.
I am not understanding why this isn't working and why I am not seeing the Errors.
Maybe this is enough information that you who are far, far more knowledgeable than I can see the error in my ways and put me on the right path. Wisdom and knowledge gratefully accepted.
Thanks.
import httpclient, httpcore, net, times, strutils
from os import sleep

let
  resetDuration = initDuration(seconds=2)
  deciSecondDuration* = initDuration(milliseconds = 100)
  qtrsecondDuration* = initDuration(milliseconds = 250)

var
  client = newHttpClient()
  lastConnection = getTime().utc

proc resetHttpClient() =
  if (getTime().utc - lastConnection) > resetDuration:
    # Probably a new timeout. We have not yet experienced a long outage.
    # We may however be entering an extended outage.
    # Creating the new clients seems to use up lots of CPU.
    # I want to do that as little as possible.
    try:
      client.close()
    except:
      echo("Attempted to close clients. Probably do not exist.")
      echo("Current exception: ", getCurrentExceptionMsg())
    client = newHttpClient(timeout=500)

proc getGoogle() =
  resetHttpClient()
  var dt = getTime().utc
  let enddate = dt + initDuration(days = 1)
  try:
    while dt <= enddate:
      echo(dt)
      echo(client.get("http://www.google.com").body[0..14], " ", getTime().utc)
      let diff = getTime().utc - dt
      if diff < deciSecondDuration:
        sleep(diff.milliseconds.int)
      dt = getTime().utc
  except TimeoutError, IOError, OSError:
    # I naively think I would see this thrown or the plain except below.
    # But I almost never see an Error raised.
    echo("Current Exception: ", getCurrentException().name)
    echo("Current Exception Msg: ", getCurrentExceptionMsg())
    echo("Sleeping for 1 seconds at: ", getTime().utc)
    sleep(1000)
    resetHttpClient()
  except:
    echo("Current Exception: ", getCurrentException().name)
    echo("Current Exception Msg: ", getCurrentExceptionMsg())
    echo("Sleeping for 1 seconds at: ", getTime().utc)

when isMainModule:
  echo("Executing network_test")
  getGoogle()
If I turn off my network and then press Ctrl-C after a few seconds, I get this:
--(Is there a style code for something like this terminal session output?)
<!doctype html> 2018-11-30T22:35:27Z
2018-11-30T22:35:27Z
^CTraceback (most recent call last)
proxyexe.nim(62) proxyexe
proxyexe.nim(49) main
osproc.nim(1136) waitForExit
SIGINT: Interrupted by Ctrl-C.
Traceback (most recent call last)
network_test.nim(48) network_test
network_test.nim(33) getGoogle
httpclient.nim(1235) get
httpclient.nim(1227) request
httpclient.nim(1204) request
httpclient.nim(1189) requestAux
httpclient.nim(1024) parseResponse
net.nim(1312) recvLine
net.nim(1272) readLine
net.nim(1069) recv
net.nim(1056) readIntoBuf
net.nim(1052) uniRecv
SIGINT: Interrupted by Ctrl-C.
Error: execution of an external program failed: '/home/jimmie/Dev/Nim/network_test/network_test '
After almost 16 minutes I get this output.
2018-11-30T22:44:21Z
<!doctype html> 2018-11-30T22:44:21Z
2018-11-30T22:44:21Z
Current Exception: OSError
Current Exception Msg: Invalid argument
Sleeping for 1 seconds at: 2018-11-30T23:00:06Z
Change your 10th line

var
  client = newHttpClient()
  lastConnection = getTime().utc

to

var
  client = newHttpClient(timeout=500)
  lastConnection = getTime().utc
Thanks for the reply.
I don't know if that change makes any difference or not, but it is correct and I made it. Whether or not the change did anything, when I ran the code again I got an error right up front. And then I noticed an error in the code I posted: I had the "while" loop inside the try/except, so after the first exception it never had a chance to loop again. I moved the loop outside the try/except and the code now does what I want. It loops indefinitely.
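The corrected structure, with the try/except inside the loop rather than around it, might look roughly like the sketch below. The helper pollWithRetry and its parameters are hypothetical names chosen for illustration. It takes the request as a proc argument so that the retry logic can be tried without a network, and fetchGoogle shows one possible request built on a client with a 500 ms timeout.

```nim
import httpclient, net, os

# Client with a timeout, so a dead network surfaces as an exception
# instead of a request that blocks for a very long time.
var client = newHttpClient(timeout = 500)

# One possible request; any proc returning a string would do.
proc fetchGoogle(): string =
  client.getContent("http://www.google.com")

# Hypothetical helper: calls `fetch` `attempts` times. The try/except sits
# inside the loop, so one failed attempt does not end the polling.
proc pollWithRetry(fetch: proc (): string;
                   attempts, retryDelayMs: int): seq[string] =
  for _ in 1 .. attempts:
    try:
      result.add(fetch())
    except TimeoutError, IOError, OSError:
      echo "request failed: ", getCurrentExceptionMsg()
      sleep(retryDelayMs)  # wait a little, then try again

# Possible usage (needs a network connection):
# discard pollWithRetry(fetchGoogle, 10, 1000)
```

A failed attempt is reported and skipped, and only the successful responses end up in the returned seq.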
However, I went back and looked at my original code, and it did not have that problem. I did notice a different logical error on my part, though. I cannot test the fix right now, because the api is only open 22:00utc Sunday to 22:00utc Friday.
I will test tomorrow on Sunday.
Thanks.
Mark this one as Solved.
I want to thank you all for pushing me to persist and for providing me with some information to help solve the problems.
What helped was being shown how to use the timeout. Once I did that, the errors happened in a timely manner. Then I had to find a few places where I had forgotten the timeouts. Now it fails, I catch it, and I keep going.
The one thing I don't know is the best way to determine that the current connection has closed and that I need to create a new HttpClient. I see in the docs that the HttpClient.connected var is private.
If there is a good way to determine whether the connection has closed, I would appreciate it.
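Since connected is private, one possible workaround (an assumption on my part, not something confirmed in this thread) is to not query the connection state at all, and instead treat any network exception as the signal to close the client and create a new one. A minimal sketch, where getOrReset is a hypothetical helper name:

```nim
import httpclient, net

var client = newHttpClient(timeout = 500)

# Hypothetical helper: returns the response body on success. On a network
# error it closes the old client, replaces it with a fresh one, and
# returns an empty string so the caller can simply try again later.
proc getOrReset(url: string): string =
  try:
    result = client.getContent(url)
  except TimeoutError, IOError, OSError:
    echo "request failed, recreating client: ", getCurrentExceptionMsg()
    try:
      client.close()
    except CatchableError:
      discard  # closing a broken socket may itself fail
    client = newHttpClient(timeout = 500)
    result = ""
```

This recreates the client only after a failure, which keeps the cost of creating clients low while the network is healthy.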
Again thanks.