Push/Create sends empty JSON objects to designated HTTP URL


#1

Hi all,

I’m using the Python library with Django to accomplish this. I have a view function that creates a stream using push/create. The code looks like this:

dsUser = datasift.User(settings.DATASIFT["username"], settings.DATASIFT["api_key"])
csdl = "twitter.user.id in [" + trackedUsersString + "]"
streamDef = dsUser.create_definition(csdl)

pushDef = dsUser.create_push_definition()
pushDef.set_output_type("http")
pushDef.set_output_param("delivery_frequency", 20)
pushDef.set_output_param("max_size", 1000000)
pushDef.set_output_param("format", "json")
pushDef.set_output_param("url", "http://ec2-***********.us-west-2.compute.amazonaws.com/ct/datasiftLog/")

sub = pushDef.subscribe_definition(streamDef, twitterHandle)

The request goes through successfully, but when I check the log at the output URL, it has only received empty JSON objects. I’ve used the same CSDL string in the dashboard GUI and got about 500 tweets in half an hour, whereas I got 0 tweets here. The CSDL string looks like this: 'twitter.user.id in [1123000728,1413858552,1413876422,312362248,1413716588,494192583]' but with a lot more IDs, and it compiles fine.
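A minimal sketch of building that filter from a list of IDs rather than a pre-joined string, which makes it easier to rule out a malformed list. The helper name `build_user_id_csdl` is hypothetical and not part of the DataSift library.

```python
def build_user_id_csdl(user_ids):
    """Return a CSDL filter matching tweets from any of the given Twitter user IDs."""
    # Coerce every ID to int so stray whitespace or non-numeric values fail loudly
    # here instead of producing CSDL that silently matches nothing.
    ids = [str(int(uid)) for uid in user_ids]
    if not ids:
        raise ValueError("at least one user ID is required")
    return "twitter.user.id in [" + ",".join(ids) + "]"


csdl = build_user_id_csdl([1123000728, "1413858552", 1413876422])
print(csdl)
```

The resulting string can be passed to create_definition() exactly as in the code above.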


#2

After creating your Push definition, can you try making calls to /push/get (see the get_push_subscription() method) and, after running the subscription for a few minutes, make a call to /push/log (see the get_push_subscription_log() method) to see if the Push subscription is experiencing any problems?


#3

Here is what my log file looks like:

[13/May/2013 18:49:38] DEBUG [ct:47]
[13/May/2013 18:49:58] DEBUG [ct:47]
[13/May/2013 18:50:18] DEBUG [ct:47]
[13/May/2013 18:50:37] DEBUG [ct:65] {u'count': 0, u'log_entries': [], u'success': True}
[13/May/2013 18:50:38] DEBUG [ct:47]
[13/May/2013 18:50:46] DEBUG [ct:65] {u'count': 0, u'log_entries': [], u'success': True}
[13/May/2013 18:50:58] DEBUG [ct:47]

As you can see, it’s empty - just like the QueryDict objects being received. I’m calling the method as follows:

logs = dsUser.get_push_subscription_log()
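For reference, a small sketch of how a response shaped like the ones in the log above could be interpreted. It relies only on the three keys visible there (success, count, log_entries); the helper name `summarize_push_log` is hypothetical, and the layout of individual log entries is not shown in this thread, so they are simply stringified.

```python
def summarize_push_log(response):
    """Return (ok, entry_count, messages) for a /push/log-style response dict."""
    ok = bool(response.get("success"))
    entries = response.get("log_entries") or []
    # Entry structure is an unknown here, so each entry is kept as plain text.
    messages = [str(entry) for entry in entries]
    return ok, len(entries), messages


response = {u"count": 0, u"log_entries": [], u"success": True}
ok, entry_count, messages = summarize_push_log(response)
if ok and entry_count == 0:
    print("push/log answered successfully but has recorded no delivery activity")
```

A successful response with zero entries means the API call itself worked, so the empty result is about the subscription rather than the request.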


#4

I’ve been playing around in the console as well, and the same problem is occurring. I’ve added my own account to the list of users I’m tracking, and no tweets are being logged. A log API call still returns the following, even after several minutes (the delivery frequency is set to ten minutes):

{
    "success": true,
    "count": 0,
    "log_entries": []
}

The URL I’ve set to receive the push notifications is receiving empty push calls, to which it’s responding with a { success : True } object.
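One thing worth ruling out on the receiving side: Django only fills request.POST (a QueryDict) for form-encoded submissions, so a JSON payload would leave it empty even when data arrived. Whether that is the cause here is not confirmed in this thread. The sketch below decodes a raw request body instead; `parse_push_body` is a hypothetical helper, and the example payload is made up purely to exercise it.

```python
import json


def parse_push_body(raw_body):
    """Decode a raw HTTP request body as JSON; return None if empty or invalid."""
    if not raw_body:
        return None
    if isinstance(raw_body, bytes):
        raw_body = raw_body.decode("utf-8")
    try:
        return json.loads(raw_body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# In a Django view this would be called as parse_push_body(request.body)
# instead of reading request.POST.
payload = parse_push_body(b'{"example_key": "example_value"}')
print(payload)
```

Logging the raw body length alongside the parsed result makes it clear whether the endpoint is receiving empty requests or simply reading them from the wrong place.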

Please help me. This is very irritating, and it’s getting in the way of a bunch of my deadlines. I’ve spent countless hours trying to resolve this one silly bug.


#5

Could you please raise this at support.datasift.com, including your DataSift username and any Push Subscription IDs you are using? There have been no issues with Push Delivery over the last few days, which suggests there might be an issue with your account or configuration.


#6

Did this issue get resolved? I’m trying to use your console https://console.datasift.com/datasift and have a similar issue where the data isn’t getting returned from the push/create call.

My CSDL is: interaction.content contains "Test tweet most recent 3", and my subscription ID is a17bf803bd34c07fe18953e0696eb4d2.


#7

I’m experiencing the same problem and I have a theory as to why.

As soon as you edit the code of a stream the hash of that stream changes. But a subscription (push/create) is pointed to a hash, not a static ID of a stream. This is the part of the design I don’t understand. Because this means that as soon as you edit the code of a stream, you need to delete the old subscription (to the old hash) and start a new one (to the new hash).

I might have misunderstood this, because it seems insanely messy :slight_smile:

And if you aggregate your streams into one stream (as shown in http://dev.datasift.com/docs/advanced/stream ), then it gets even messier because you have to:

  1. Edit the underlying stream
  2. Edit the aggregating stream
  3. Delete the old subscription
  4. Create a new subscription

I just want to be able to edit the code of the stream and let it flow to my server with the updated code.

Please advise if there’s an easier way to do this.


#8

This workflow is by design. In most production environments, DataSift's users will manage almost everything via the API, where there is no concept of 'editing a stream' as there is within the UI.

So if I compile 'twitter.text contains "coffee"' and 'twitter.text contains "coffees"', these should be treated as two totally separate and independent streams. Behind the scenes of the UI, the same process happens. If you create a stream, save it, then edit it, you are still compiling a new and unique piece of CSDL. These different 'versions' of a CSDL definition are only tied together in the UI so you can see how you have changed your CSDL over time.
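As a toy model of why two nearly identical filters end up as unrelated streams, the sketch below derives an identifier purely from the CSDL text. MD5 is only a stand-in: this thread does not say how DataSift actually computes stream hashes, and `stand_in_hash` and the `subscriptions` dict are hypothetical.

```python
import hashlib


def stand_in_hash(csdl):
    """Toy identifier derived only from the CSDL text (not DataSift's real algorithm)."""
    return hashlib.md5(csdl.encode("utf-8")).hexdigest()


original = stand_in_hash('twitter.text contains "coffee"')
edited = stand_in_hash('twitter.text contains "coffees"')

# A subscription records the hash it was created against, so "editing" the
# CSDL yields a new hash that the existing subscription knows nothing about.
subscriptions = {"example-subscription": original}
print(original != edited)
print(subscriptions["example-subscription"] == original)
```

Under this model the old subscription keeps delivering the old definition until it is replaced.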


#9

Hi Jason,

Thanks for the very fast response, I appreciate that a lot.

We’re a CRM. So, what we want to do is monitor several Twitter accounts’ mentions, outgoing tweets and retweets of their tweets. We will be adding and removing several Twitter accounts every day.

Two questions:

  1. To manage this programmatically, what would be the best practice for setting this up?
    (I’m right now trying out two different solutions, see examples below)
  2. Is there a way to update the stream without deleting the subscription and creating a new one?

A:

tag "twitterAccount1" {twitter.user.name in "twitterAccount1" OR twitter.mentions in "twitterAccount1" OR twitter.retweeted.user.name in "twitterAccount1"}
tag "twitterAccount2" {twitter.user.name in "twitterAccount2" OR twitter.mentions in "twitterAccount2" OR twitter.retweeted.user.name in "twitterAccount2"}
return {
twitter.user.name in "twitterAccount1" OR twitter.mentions in "twitterAccount1" OR twitter.retweeted.user.name in "twitterAccount1"
OR twitter.user.name in "twitterAccount2" OR twitter.mentions in "twitterAccount2" OR twitter.retweeted.user.name in "twitterAccount2"
}

B:

tag "twitterAccount1" {stream "hash1"}
tag "twitterAccount2" {stream "hash2"}
return {
stream "hash1" or
stream "hash2"
}

Where hash1 is:

twitter.user.name in "twitterAccount1" OR twitter.mentions in "twitterAccount1" OR twitter.retweeted.user.name in "twitterAccount1"
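Since accounts will be added and removed daily, the text for solution A could be generated from a list rather than edited by hand. A rough sketch follows; it reproduces the CSDL layout from example A as written, and the function names and the handle check are assumptions for illustration only.

```python
import re


def account_filter(account):
    """CSDL matching tweets by, mentioning, or retweeting the given account."""
    return (
        'twitter.user.name in "{0}" OR twitter.mentions in "{0}" '
        'OR twitter.retweeted.user.name in "{0}"'
    ).format(account)


def build_tagged_csdl(accounts):
    """Build one tag rule per account plus a return block covering all of them."""
    if not accounts:
        raise ValueError("at least one account is required")
    for account in accounts:
        # Reject anything that could break out of the quoted CSDL string.
        if not re.match(r"^[A-Za-z0-9_]+$", account):
            raise ValueError("unexpected characters in account: %r" % account)
    tag_lines = ['tag "{0}" {{{1}}}'.format(a, account_filter(a)) for a in accounts]
    return_body = "\nOR ".join(account_filter(a) for a in accounts)
    return "\n".join(tag_lines) + "\nreturn {\n" + return_body + "\n}"


print(build_tagged_csdl(["twitterAccount1", "twitterAccount2"]))
```

Each change to the account list produces new CSDL text, so the output still has to be compiled again and the subscription recreated.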


#10

1) To manage this programmatically, what would be the best practice for setting this up?
Both methods you described above would work; it is really down to personal preference. Personally, I would use method A, as it requires fewer /compile API calls, though if you want to reuse these sub-streams in other streams in the future, method B may be the better choice.

You should look at optimizing your CSDL in this example. Check out the "Overusing Operators" and "Don't Repeat Yourself" sections on our CSDL Optimization Techniques blog post for more info.

Note: you may want to have a look at some of our Interaction Targets - they allow you to filter on multiple targets at once, using just one normalized target (saving you DPUs).

2) Is there a way to update the stream without deleting the subscription and creating a new one?

Not at present, though we do have features on the way which will make this process a little simpler.
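In the meantime, the compile, create, and delete steps can be wrapped in a single helper. The sketch below takes the three API operations as plain callables, so it makes no assumptions about client library signatures; `rotate_subscription` and the in-memory fakes are hypothetical and exist only to show the order of operations.

```python
def rotate_subscription(csdl, old_subscription_id,
                        compile_csdl, create_subscription, delete_subscription):
    """Replace a subscription so delivery follows the updated CSDL.

    compile_csdl(csdl) -> stream hash
    create_subscription(stream_hash) -> new subscription ID
    delete_subscription(subscription_id) -> None
    """
    new_hash = compile_csdl(csdl)
    # Create the replacement before removing the old subscription, so the old
    # one is only dropped once the new one exists.
    new_id = create_subscription(new_hash)
    if old_subscription_id is not None:
        delete_subscription(old_subscription_id)
    return new_hash, new_id


# In-memory fakes standing in for the real API calls.
compiled = {}
active = set()


def fake_compile(csdl):
    stream_hash = "hash-%d" % (len(compiled) + 1)
    compiled[stream_hash] = csdl
    return stream_hash


def fake_create(stream_hash):
    sub_id = "sub-for-" + stream_hash
    active.add(sub_id)
    return sub_id


def fake_delete(sub_id):
    active.discard(sub_id)


_, first = rotate_subscription('twitter.text contains "coffee"', None,
                               fake_compile, fake_create, fake_delete)
_, second = rotate_subscription('twitter.text contains "coffees"', first,
                                fake_compile, fake_create, fake_delete)
print(sorted(active))
```

Creating before deleting can briefly leave both subscriptions live, so the receiver may see the same interaction twice during the switch; reversing the order trades that for a possible gap in delivery.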