Unknown Exceptions in C# Client Library


#1

At 17:15 yesterday, my streaming application started receiving the following message:

{
"status": "failure",
"message": "This streaming API service node is currently unavailable, please reconnect immediately."
}

The DataSift client returned (Datasift\DatasiftStream\DatasiftStream.cs line 223): “Unkown exception:
Unable to read data from the transport connection: The connection was closed.” It then stopped the stream. As the streamer needs to run 24/7, the client needs to retry (using the reconnect interval rules) whatever error occurs, rather than terminate.

In my application, if the stream stops, it is restarted. This should never be necessary, as the DataSift client should handle reconnection itself (the restart was only put in place in case ‘MaxRetries’ was exceeded). Because the stream stopped, it was restarted, but the same error occurred immediately (the DataSift client does not apply the reconnect interval rules in this scenario).

The restart / unknown error cycle continued until an OutOfMemoryException was thrown. The server has plenty of RAM (24 GB), so this suggests that the DataSift client is not releasing resources.

Please can you look at implementing retries for all stream outcomes and check resources are being released.

Many thanks,

Gareth


#2

Thanks for spotting this issue. We have updated our C# Client Library to fix this reconnection issue.


#3

Thanks for the prompt update.

I’ve noticed a couple of potential problems that I thought I’d run by you.

Firstly, the DatasiftStream object has a field, connectCount, which is incremented on failure up to MaxRetries. I can’t see anywhere that it is reset on success, meaning that max retries will eventually be exceeded even if the failures are spread over an extended period.
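
To illustrate what I mean, here is a minimal sketch of the behaviour I would expect. The RetryCounter class and its methods are hypothetical and are not taken from the library; only the connectCount and MaxRetries names come from DatasiftStream.

```csharp
using System;

// Hypothetical helper, not part of the DataSift library: counts
// consecutive connection failures and clears the count on success.
public class RetryCounter
{
    private readonly int maxRetries;
    private int connectCount;

    public RetryCounter(int maxRetries)
    {
        this.maxRetries = maxRetries;
    }

    public int ConnectCount
    {
        get { return connectCount; }
    }

    // Call after each failed attempt. Returns false once the
    // number of consecutive failures exceeds maxRetries.
    public bool RegisterFailure()
    {
        connectCount++;
        return connectCount <= maxRetries;
    }

    // Call once a connection is established, so that earlier
    // failures do not count against a later, unrelated outage.
    public void RegisterSuccess()
    {
        connectCount = 0;
    }
}
```

With this shape, failures spread over days or weeks would never add up to MaxRetries as long as the stream connects successfully in between.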

The other issue is that the stream still appears to be stopped when either “Unable to resolve the Datasift domain name. A possible cause is the local connection to the internet” or “The connection to the DatasiftStream could not be established!” occurs. The client still needs to retry in these scenarios, as they are likely to be short-lived problems.


#4

Again, thank you for pointing this out - our C# Client Library has been updated to reflect these changes.


#5

Thanks for these changes. I’ve done some testing of exception handling (by turning network connections on/off). I have MaxRetries set to 20.

It looks as though the reconnect rules (http://dev.datasift.com/docs/streaming-api/reconnecting) have not been fully implemented. While an exponential delay has been implemented, the linear delay has not. (The field ‘linearConnectTimeoutLength’ has been defined but never used).

Unfortunately, there is a problem with the exponential delay implementation. First, ‘exponentialConnectTimeoutLength’ is not reset after a successful connection. This means that for subsequent failures, the delay continues from where the previous failures left off.

Also, the reconnect rules suggest a maximum delay of 320 seconds, but no maximum has been implemented. This means that by the 10th retry, the delay is 42 minutes, and by the 20th retry, it is 60 days!
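
As a rough sketch of what I have in mind (not the library's code), the delay could be doubled up to a cap and reset after a successful connection. The ExponentialBackoff class and its members are hypothetical names, and the starting delay is a constructor argument because I am only assuming what the initial value should be; the 320 second cap is the one from the reconnect rules.

```csharp
using System;

// Hypothetical helper, not part of the DataSift library.
public class ExponentialBackoff
{
    private readonly int initialSeconds; // assumed starting delay
    private readonly int maxSeconds;     // e.g. 320, per the reconnect rules
    private int nextSeconds;

    public ExponentialBackoff(int initialSeconds, int maxSeconds)
    {
        if (initialSeconds <= 0 || maxSeconds < initialSeconds)
        {
            throw new ArgumentOutOfRangeException("initialSeconds");
        }
        this.initialSeconds = initialSeconds;
        this.maxSeconds = maxSeconds;
        this.nextSeconds = initialSeconds;
    }

    // Returns the delay to wait before the next attempt, then
    // doubles the stored delay without ever exceeding the cap.
    public int NextDelaySeconds()
    {
        int delay = nextSeconds;
        nextSeconds = nextSeconds > maxSeconds / 2 ? maxSeconds : nextSeconds * 2;
        return delay;
    }

    // Call after a successful connection so that the next outage
    // starts again from the initial delay.
    public void Reset()
    {
        nextSeconds = initialSeconds;
    }
}
```

With an assumed starting delay of 10 seconds and the 320 second cap, successive calls give 10, 20, 40, 80, 160, 320, 320 and so on, and a call to Reset() brings the next delay back to 10.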

As stated previously, the streamer needs to be streaming or trying to stream 24/7. I cannot have the streamer stop because max retries has been exceeded. I could set retries to a huge value (e.g. int.MaxValue), but a ‘limitless retries’ option would be preferable.
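
One possible shape for a ‘limitless retries’ option, purely as a suggestion: treat an unset limit as "retry forever". The RetryPolicy class and the nullable maxRetries parameter below are hypothetical and do not exist in the library.

```csharp
using System;

// Hypothetical helper, not part of the DataSift library.
public static class RetryPolicy
{
    // Returns true if another attempt should be made.
    // A null maxRetries means there is no limit.
    public static bool ShouldRetry(int failedAttempts, int? maxRetries)
    {
        return !maxRetries.HasValue || failedAttempts <= maxRetries.Value;
    }
}
```

This would avoid having to pass a sentinel such as int.MaxValue just to keep the streamer running.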

PS. It would be easier to read this post if line breaks were preserved.


#6

The linear backoff has now been implemented in the C# client library - https://github.com/datasift/datasift-csharp


#7

Thanks for this. I’ve applied this update and retested, as before, by turning the network connection on/off to simulate service outages. What I’m now finding is that the streamer crashes a short time (6 minutes) after the outage.

The exception message is: “Thread is running or terminated; it cannot restart.”

Stack Trace: at System.Threading.Thread.StartInternal(IPrincipal principal, StackCrawlMark& stackMark)
at System.Threading.Thread.Start()
at Datasift.DatasiftStream.DatasiftStream.Consume()
at Datasift.DatasiftStream.DatasiftStream.StopOrRetry(String msg)
at Datasift.DatasiftStream.DatasiftStream.ImediateRetry(String msg)
at Datasift.DatasiftStream.DatasiftStream.StartStreaming()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart().

It looks to me as though multiple threads are running.
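
For reference, a .NET Thread object cannot be started a second time, whether it is still running or has already finished, which matches the exception message above. The sketch below is a small standalone demonstration and is not based on how Consume() actually manages its thread; ThreadRestartDemo and its methods are hypothetical names.

```csharp
using System;
using System.Threading;

// Standalone demonstration, not code from the DataSift library.
public static class ThreadRestartDemo
{
    // Starts the same Thread object twice and reports whether
    // the second Start() call threw ThreadStateException.
    public static bool SecondStartThrows()
    {
        var worker = new Thread(() => { });
        worker.Start();
        worker.Join();
        try
        {
            worker.Start(); // a started thread cannot be restarted
            return false;
        }
        catch (ThreadStateException)
        {
            return true;
        }
    }

    // Runs the work on a new Thread object each time, which is
    // what a retry would need to do instead of reusing the old one.
    public static Thread StartNew(ThreadStart work)
    {
        var worker = new Thread(work);
        worker.Start();
        return worker;
    }
}
```

If the retry path reuses one Thread instance, that alone would explain the crash; a new instance per attempt, with the previous one stopped first, should avoid it.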


#8

We will look into the exception shortly. 

Regarding the multiple threads - did you try stopping and restarting your stream at all before you ran your service outage tests? This may be a case of not disconnecting properly before trying to reconnect.


#9

Looks like the multiple threads may have been a bug too - apologies for this. A fix has been applied and pushed to GitHub.