Is there a way to estimate the cost of a stream per month?


#1

How can I estimate the cost of a stream per month? For example…

http://datasift.com/stream/14631/democrats-vs-republicans#app1-preview


#2

We do not currently have any automated methods of predicting the cost of a stream.

With a stream such as the one in question, the only option is to take a sample of the stream's volume and do the math to see how expensive it will be.

The minimum operating cost for DataSift is $0.20 per hour (a stream of 1 DPU or less is currently charged at $0.20 per hour), plus the cost of Tweets received (which is currently $1 for 10,000 Tweets).

If you expect to receive 1,000,000 Tweets in total for this stream, the Tweets would cost $100 (1,000,000 / 10,000 x $1), and the hourly charge would come to $0.20 x 720 hours (30 days) = $144.

So it would cost $244 to run for 30 days uninterrupted if you were to receive 1,000,000 Tweets. (I have not taken a sample of the number of Tweets expected in this stream; I just picked 1,000,000 as an example.)
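
The arithmetic above can be sketched in a few lines of Python. This is only an illustration using the prices quoted in this post ($0.20 per hour at 1 DPU or less, $1 per 10,000 Tweets); the function name `estimate_stream_cost` and its parameters are made up for this example and are not part of any DataSift API.

```python
def estimate_stream_cost(tweets, hours=720, hourly_rate=0.20, price_per_10k_tweets=1.00):
    """Rough cost estimate in dollars for running a stream.

    Assumes the stream stays at 1 DPU or less, so the hourly rate is flat.
    All prices are the ones quoted in this thread and may change.
    """
    tweet_cost = tweets / 10_000 * price_per_10k_tweets  # data cost
    hourly_cost = hours * hourly_rate                     # DPU (running) cost
    return tweet_cost, hourly_cost, tweet_cost + hourly_cost


# The example from this post: 1,000,000 Tweets over 30 days (720 hours).
tweet_cost, hourly_cost, total = estimate_stream_cost(1_000_000)
print(f"Tweets: ${tweet_cost:.2f}, hours: ${hourly_cost:.2f}, total: ${total:.2f}")
```

A stream that needs more than 1 DPU would have a higher hourly rate, so `hourly_rate` would need adjusting in that case.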

It can be difficult to estimate the number of Tweets you will receive through a stream, as it is not easy to predict whether certain subjects or keywords will begin to trend and produce thousands more results than usual.
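
For the sampling step, one rough approach is to count the Tweets received over a short window and scale that up linearly. The sketch below assumes you have counted the Tweets yourself (for example, from the output of a consumer you run); the function name `extrapolate_tweets` and the sample figures are hypothetical. Because of the trending problem described above, treat the result as a ballpark figure only.

```python
def extrapolate_tweets(sample_count, sample_minutes, days=30):
    """Scale a Tweet count from a short sample window up to `days` days.

    Assumes volume stays constant over the whole period, which will not
    hold if a keyword starts trending.
    """
    if sample_minutes <= 0:
        raise ValueError("sample_minutes must be positive")
    tweets_per_minute = sample_count / sample_minutes
    return round(tweets_per_minute * 60 * 24 * days)


# Hypothetical sample: 250 Tweets seen in 10 minutes.
print(extrapolate_tweets(250, 10))
```

Taking several samples at different times of day and averaging them should give a steadier estimate than a single window.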


#3

Thanks Jason. Twitter activity is unpredictable. On the GUI console, however, when you're looking at a sampling of the stream of tweets, is it possible to see how many tweets are actually being processed? That would let us avoid getting throttled for an hour by immediately pausing high-throughput CSDL definitions, and would give some indication of the current volume as a basis for estimating the potential licensing costs.


#4

It is not currently possible to use the GUI to estimate how many Tweets are being processed. If the stream throughput is low enough, you can gain some insight into the number of interactions your stream will receive. However, on high-throughput streams, interactions are written to a buffer before they are pushed to your screen, so you may be receiving more interactions than it appears.