Filter only retweets in real time


#1

Is there a way to use the query builder to build a stream that fetches only retweets? I want this to happen in real time.
Would this CSDL work?

twitter.mentions exists

OR

twitter.retweet.text exists


#2

That CSDL will match any retweets, or any Tweets which @mention a Twitter user.

We advise against using DataSift for this kind of activity. Bear in mind that this filter will return well over 1,000 Tweets per second. That volume will cost a considerable amount in license fees, will likely push you over your Twitter rate limit of 500K Tweets per day, and is unlikely to be something you can consume without dropping interactions.
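
To put those figures in perspective, here is a rough back-of-the-envelope calculation. It assumes a flat 1,000 Tweets per second, which is only the lower bound quoted above, so the real numbers would be higher.

```python
# Rough arithmetic only; 1,000/s is the lower bound mentioned above.
TWEETS_PER_SECOND = 1000
DAILY_LIMIT = 500_000            # basic Twitter rate limit, Tweets per day
SECONDS_PER_DAY = 24 * 60 * 60

# Volume the filter would produce over a full day at that rate.
daily_volume = TWEETS_PER_SECOND * SECONDS_PER_DAY

# Time until the daily limit is used up at that rate.
seconds_to_limit = DAILY_LIMIT / TWEETS_PER_SECOND

print(f"Tweets per day at this rate: {daily_volume:,}")
print(f"Daily limit reached after about {seconds_to_limit / 60:.1f} minutes")
```

At that rate the stream would produce roughly 86 million Tweets a day and hit the 500K limit in under ten minutes.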


#3

Thanks Jason for your reply.
So here is my situation:
I need real-time Twitter stream access for an academic project that will run for a few months. The project will need to collect millions of tweets each day over this two-to-three-month period. Based on your response, do you mean that with DataSift I cannot go over the 500K-tweet limit for real-time data collection? This is a big data project, and unfortunately a 500K limit is too restrictive. Does this limit apply to all DataSift APIs? I am wondering if there is a workaround to get near-real-time data, for example using the Push API with the push interval set to 1 minute.


#4

The basic Twitter rate limit is 500K Tweets per day. You can apply to have this limit raised if you are on our Professional or Enterprise pricing plan.

To receive your data closer to real time, we recommend setting your Push delivery_frequency to 0 for continuous delivery and your max_size to 20MB. This ensures that we try to send new data immediately after each successful delivery.
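
As a rough illustration, the settings above could be assembled like this before creating a Push subscription. This is only a sketch: the `output_params.` prefix, the `http` output type, and expressing max_size in bytes are assumptions about how the Push API expects these values, and the subscription name, stream hash, and endpoint URL are hypothetical placeholders. Please check the Push documentation for your connector before relying on it.

```python
def build_push_params(stream_hash: str, endpoint_url: str) -> dict:
    """Assemble parameters for a Push subscription (sketch, not an official client)."""
    return {
        "name": "retweets-near-real-time",        # hypothetical subscription name
        "hash": stream_hash,                      # hash of your compiled CSDL stream
        "output_type": "http",                    # assumed connector type
        "output_params.method": "post",           # assumed parameter name
        "output_params.url": endpoint_url,        # assumed parameter name
        "output_params.delivery_frequency": 0,    # 0 = continuous delivery
        "output_params.max_size": 20 * 1024 * 1024,  # 20MB, assumed to be in bytes
    }


# Placeholder values; substitute your own stream hash and endpoint.
params = build_push_params("<your-stream-hash>", "https://example.com/datasift")
for key, value in sorted(params.items()):
    print(f"{key}={value}")
```

The block only builds and prints the parameters; sending them to the API is left out on purpose, since the exact call depends on the client library you use.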