Getting many 500 errors when calling /pull, what should I do?


#1

I have a process that polls a pull queue every ten seconds.

In that process, it gathers a few subscription IDs and calls /pull on each of them.

It then takes the interaction and performs some processing on it.

If it sees an X-DataSift-Cursor-Next header with a value, it waits 0.5 seconds (as per the documentation) and calls again.

It continues to do this until the queue is drained (it requests up to 20MB at a time, but the responses never get that large).
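
For reference, the loop described above can be sketched roughly like this. It is only an illustration, not DataSift client code: `fetch` is a hypothetical callable standing in for the HTTP request to /pull, assumed to return a (body, headers) pair and to accept the previous cursor value. Only the X-DataSift-Cursor-Next header name and the 0.5 second wait come from the description above.

```python
import time
from typing import Callable, Dict, Iterator, Optional, Tuple

CURSOR_HEADER = "X-DataSift-Cursor-Next"  # header name taken from the post

# Hypothetical signature: fetch(subscription_id, cursor) -> (body, headers).
# Assumes headers is a plain dict keyed with the exact header name above.
Fetch = Callable[[str, Optional[str]], Tuple[str, Dict[str, str]]]


def drain(subscription_id: str, fetch: Fetch,
          wait: float = 0.5,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield each response body until no next cursor is returned."""
    cursor: Optional[str] = None
    while True:
        body, headers = fetch(subscription_id, cursor)
        yield body
        cursor = headers.get(CURSOR_HEADER)
        if not cursor:  # header missing or empty: treat the queue as drained
            break
        sleep(wait)  # pause before following the cursor


if __name__ == "__main__":
    # Canned responses instead of real HTTP calls.
    pages = [("page 1", {CURSOR_HEADER: "c1"}), ("page 2", {})]
    for body in drain("sub-1", lambda sub, cur: pages.pop(0), sleep=lambda s: None):
        print(body)
```

Passing `fetch` and `sleep` in as arguments keeps the loop testable without touching the network or actually waiting.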

That said, in the past five hours, I’ve had 34 calls fail where it returns a 500 error.

My questions:

  • Is the general logic of cycling through the cursor until there is no next cursor correct? The assumption is that when the queue is drained, the next cursor header will return null.
  • Do I lose data when a 500 is returned?
  • The documentation for response codes (http://dev.datasift.com/docs/rest-api/response-codes) indicates that we should try again. Should we wait the same half second before retrying?
  • It also says to contact support. I have the times when these failures happened and can pass them along if need be.

#2

What you are currently doing sounds perfectly sane, and in line with our API guidelines. This 500 error is a known issue - it simply means we are temporarily unable to read data from your queue. We have a release planned to change this 500 to a 503, which should be a little more semantically correct. No data is lost in these cases - we are simply unable to read from the queue. We are putting some significant engineering effort into improving the reliability of our Push Delivery service, with a number of other releases planned for general platform resilience.
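
Since no data is lost, a client can simply retry the failed call. A minimal sketch of one way to do that follows. The names are hypothetical: `call` stands in for a single /pull request and is assumed to return a (status, body) pair. The doubling delay starting at 0.5 seconds and the limit of five attempts are assumptions for illustration, not documented DataSift guidance.

```python
import time
from typing import Callable, Tuple

# 500 is what /pull returns today; the reply above says it will become 503.
RETRYABLE = {500, 503}


def pull_with_retry(call: Callable[[], Tuple[int, str]],
                    attempts: int = 5,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Repeat `call` until it returns a non-retryable status or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = call()
        if status not in RETRYABLE:
            return status, body
        if attempt < attempts - 1:
            # Assumed policy: wait 0.5 s, 1 s, 2 s, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    return status, body  # still failing: hand the last response to the caller
```

If the final attempt still returns a 500, the caller gets that response back and can log the time for a support request.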


#3

When I try to create a Pull I get a 404 error with the message {"error":"Output type "pull" not found"}.

I created the request with curl:

curl -i -X POST 'https://api.datasift.com/v1/push/create' -d 'name=connectorpull' -d 'hash=41b4b025a3774239808b61a8773ecdb6' -d 'output_type=pull' -d 'output_params.format=json_new_line' -H 'Authorization: MyAuthoriziation'

I think the same request worked two days ago.