I would like to ask how the number of received tweets is calculated. We have an application that opens a streaming connection and then reads the response line by line (in effect, tweet by tweet). Everything works fine except for the DataSift stats shown on the account's dashboard, which are much higher than the total amount of data we have received from the stream. For example, today we read the stream for about 3 hours and in that time exceeded the limit of 500,000 tweets per day, although we actually read less than 10% of that number.
We use a very simple filter on the text of tweets and retweets. How does DataSift determine these stats? Or is our approach perhaps incorrect?
We hope for a quick response.
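For comparing a client-side total with the dashboard figure, a tally along the following lines may help. This is only a minimal sketch: it assumes the stream delivers one JSON object per line and may contain blank lines, which the thread does not confirm, and `tally_stream` is a hypothetical helper name, not part of any DataSift library.

```python
import json


def tally_stream(lines):
    """Count what was actually read from a line-delimited stream.

    `lines` is any iterable of text lines (assumed: one JSON object per line).
    """
    counts = {"interactions": 0, "blank": 0, "malformed": 0}
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank lines are not data, so they are tallied separately.
            counts["blank"] += 1
            continue
        try:
            json.loads(line)
        except ValueError:
            # A line that is not valid JSON (for example, a truncated read).
            counts["malformed"] += 1
            continue
        counts["interactions"] += 1
    return counts
```

Passing the open response (or any iterable of lines) to `tally_stream` gives a total that can be set against the dashboard number for the same time window.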
Could you share the CSDL you were using? We determine this value by the number of interactions that match your filter, so it is most likely that your CSDL is simply a very broad filter that picked up 500K interactions in this time.
In some cases where you receive a huge amount of data from DataSift in a short space of time, your server may not be able to keep up with the volume, and may start dropping interactions.
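One way to check whether the receiving side is falling behind is to separate reading from processing and count what gets dropped. The sketch below is illustrative only: `consume`, `handle`, and `max_pending` are hypothetical names, the bounded in-memory queue is an assumed design rather than anything DataSift prescribes, and `lines` stands in for whatever iterable yields the stream's lines.

```python
import queue
import threading


def consume(lines, handle, max_pending=1000):
    """Read `lines` as fast as possible and process them on a worker thread.

    Returns counts of lines received, processed, dropped, and failed.
    """
    pending = queue.Queue(maxsize=max_pending)
    stats = {"received": 0, "processed": 0, "dropped": 0, "failed": 0}
    done = object()  # sentinel telling the worker to stop

    def worker():
        while True:
            item = pending.get()
            if item is done:
                break
            try:
                handle(item)
                stats["processed"] += 1
            except Exception:
                # Keep the worker alive if one item cannot be handled.
                stats["failed"] += 1

    thread = threading.Thread(target=worker)
    thread.start()

    for line in lines:
        stats["received"] += 1
        try:
            # Never block the reader: if processing lags, record a drop.
            pending.put_nowait(line)
        except queue.Full:
            stats["dropped"] += 1

    pending.put(done)  # blocks until there is room, then stops the worker
    thread.join()
    return stats
```

If `dropped` is well above zero, the client is not keeping up with the volume, which would explain a dashboard total far higher than the number of interactions actually processed.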
Thanks for your response! You are right, it might be because the CSDL filter is too broad. Thank you once again.