How would I configure this setup?


#1

Hi

I’m completely new to DataSift and the API. We are trying to track the audience sentiment for 3 speakers on stage, but I’m not sure how to architect the stream(s). This is our spec:

  1. We have a @mytwitteraccount and a hashtag for each of the speakers.
  2. The audience can tweet in the following way: @mytwitteraccount #speaker1 YES or @mytwitteraccount #speaker1 NO, depending on whether they agree or disagree with what the speaker is saying at that moment in time (it’s a round-table debate). We then compare each speaker’s “score” against the current “scores” of the other speakers to see what percentage of positive votes each of them has.

How would I formulate such a query, or is there some documentation I should focus on? Would I need 3 streams (1 for each speaker), or can this be consolidated into a single stream, with the JSON returning the current “score” for each speaker in some way?

Many thanks for any input!
Ben


#2

This is easily achievable with a single stream:

 

This stream looks for any Tweets to the named Twitter account which contain one of the listed hashtags, and tags each Tweet depending on which speaker it mentions, and whether the Tweeter said YES or NO. In your returned JSON interaction, you can take a look at the tags element, which will list the speaker, and the YES or NO reaction to them, for example:

"tags":["Contestant1","YES"]

Please feel free to duplicate and modify this stream as necessary.
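
As a rough illustration of reading those tags on the client side, here is a minimal Python sketch. It assumes each interaction arrives as a JSON object with a top-level "tags" list like the one above (the real payload may nest it differently), and the function names parse_vote, tally and percent_yes are made up for this example rather than part of any DataSift library.

```python
import json
from collections import defaultdict

# Assumption: each interaction is a JSON object with a top-level "tags"
# list such as ["Contestant1", "YES"]; the real payload may nest it.


def parse_vote(raw):
    """Return (speaker, vote) from one interaction, or None if it is not a vote."""
    tags = json.loads(raw).get("tags", [])
    votes = [t for t in tags if t in ("YES", "NO")]
    speakers = [t for t in tags if t not in ("YES", "NO")]
    if len(votes) != 1 or len(speakers) != 1:
        return None  # ambiguous or untagged interaction: skip it
    return speakers[0], votes[0]


def tally(raw_interactions):
    """Count YES and NO votes per speaker."""
    counts = defaultdict(lambda: {"YES": 0, "NO": 0})
    for raw in raw_interactions:
        vote = parse_vote(raw)
        if vote is not None:
            speaker, answer = vote
            counts[speaker][answer] += 1
    return dict(counts)


def percent_yes(counts):
    """Percentage of positive votes for each speaker that has at least one vote."""
    return {
        speaker: 100.0 * c["YES"] / (c["YES"] + c["NO"])
        for speaker, c in counts.items()
    }
```

Tweets that carry both YES and NO, or more than one speaker tag, are skipped here; whether to count or drop those is a choice for your own client.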


#3

Hi Jason

That’s great, thanks. Would this only return data from the point at which we invoke the call to get the stream? We need to reset the scores for each question.

Thanks
Ben


#4

The best way to reset the scores would most likely be to stop and restart the stream between questions. Alternatively, you could set up several different instances of this one stream, and run Instance1 for Question1, Instance2 for Question2, and so on, just resetting your display each time you ask a new question.


#5

Hi

Sorry to bang on with this thread! I’m still a little confused as to what to do with my requirement and whether I should sample the data (REST) or process a stream. We are expecting in the region of 200-300k people to be tweeting in their “votes”, and some people will vote multiple times (the broadcast is 1 hour long). But the in-studio conversation will be moving quickly and the sentiment may quickly shift. If I go down the stream path, I’m concerned I won’t be able to process the data in time to reliably visualise the current voting state (I’m not worried about accounting for every vote, but I want to broadly capture the current sentiment). Could I find myself processing queued votes whilst the conversation on stage has moved somewhere else? Our graphical display of “votes” may be completely out of sync with the conversation.

Is my thinking vaguely correct here? I thought I could possibly use your RESTful API instead, but am I right in thinking it’ll only return about 20 objects every 10 seconds?

Thanks
Ben


#6

Hi Ben,

Jason's out of the office so I'm stepping in to cover in the forums.

I recommend the streaming API which runs in real time. 

Keep the stream open for the duration of the show.

Write your client software so that you can reset your statistics whenever you need to. The streaming API is certainly fast enough to give you the performance you're looking for.
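
A minimal Python sketch of such a client-side tally is below. The VoteBoard class, its method names and the 60-second window are all illustrative assumptions rather than anything provided by the API. The rolling window is one possible way to keep the display close to the current sentiment, and it uses the time each vote reached the client, not the time of the Tweet.

```python
import time
from collections import deque


class VoteBoard:
    """Keeps only votes newer than window_seconds; reset() clears everything."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested
        self.votes = deque()  # (arrival_time, speaker, is_yes), oldest first

    def add(self, speaker, is_yes):
        """Record one vote, stamped with the time it reached the client."""
        self.votes.append((self.clock(), speaker, is_yes))

    def reset(self):
        """Call between questions to start again from zero."""
        self.votes.clear()

    def _expire(self):
        # Drop votes that have fallen out of the window.
        cutoff = self.clock() - self.window_seconds
        while self.votes and self.votes[0][0] < cutoff:
            self.votes.popleft()

    def percent_yes(self):
        """Percentage of YES votes per speaker within the current window."""
        self._expire()
        yes, total = {}, {}
        for _, speaker, is_yes in self.votes:
            total[speaker] = total.get(speaker, 0) + 1
            yes[speaker] = yes.get(speaker, 0) + (1 if is_yes else 0)
        return {s: 100.0 * yes[s] / total[s] for s in total}
```

The stream handler would call add() for each incoming vote, the display would poll percent_yes() on its own timer, and reset() would be called whenever a new question starts.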

I hope this helps but please get back to me if you need to.

Ed