Interaction.content CONTAINS_ANY matches Twitter handles in retweets


#1

I am getting back undesirable Twitter content because interaction.content Contains_Any is matching against Twitter handles.

My term is
interaction.content CONTAINS_ANY "Boov,Smekday"
What is also coming back is anything that is a retweet of the user @jacob_boov.

How do I prevent this guy's fan base from polluting my stream?

Thank you!


#2

Another similar case involving a username, but even more confusing.
My search is
interaction.type IN "facebook,twitter" AND
interaction.author.username IN "HowToTrainYourDragon,HTTYDragon" OR
interaction.mentions IN "HowToTrainYourDragon,HTTYDragon"

What is also coming back is anything that is a tweet of @httyd_fishlegs

In both cases, the undesired Twitter handle has an underscore.


#3

In the first case, this is a known issue. We should be stripping these @mentions out of the interaction.content field before filtering (though they will still be filterable in the interaction.raw_content field). We hope to have a fix for this released within a couple of weeks. 
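
Until that fix is released, one possible stopgap is to exclude retweets of that account explicitly. This is only a sketch, not an official workaround: it assumes that twitter.retweeted.user.screen_name holds the handle of the retweeted user, that the handle is stored as jacob_boov, and that losing retweets of this user which genuinely mention your terms is acceptable.

  // Hypothetical stopgap: drop retweets of @jacob_boov (assumed handle form)
  interaction.content CONTAINS_ANY "Boov,Smekday"
  AND NOT twitter.retweeted.user.screen_name IN "jacob_boov"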

For the second question, I ran an Historics query for the CSDL you provided over the last couple of days, and did not match any Tweets from @httyd_fishlegs. The CSDL you have provided should not be able to match this username in any case. If you were perhaps filtering for something like:

  interaction.author.username contains_any "httyd"

you could certainly have matched Tweets from this user due to how we tokenize text for filtering.


#4

I am using “IN” for the username, and “Contains_Any” for other elements. “httyd”, in some form or another, is in my profile a few times. Here are all of them. I don’t see why @httyd_fishlegs tweets get captured with this.

interaction.type IN "facebook,twitter" AND (
interaction.author.username IN "HowToTrainYourDragon,HTTYDragon" OR
interaction.mentions IN "HowToTrainYourDragon,HTTYDragon" OR
twitter.retweeted.user.screen_name IN "HowToTrainYourDragon,HTTYDragon" OR
twitter.hashtags CONTAINS_ANY "HowToTrainYourDragon2,httyd,httyd2,toothless" OR
twitter.retweet.hashtags CONTAINS_ANY "HowToTrainYourDragon2,httyd,httyd2,toothless" OR
interaction.link CONTAINS_ANY "howtotrainyourdragon,howtotrainyourdragon2,imdb.com/title/tt1646971,httyd" OR
links.url CONTAINS_ANY "howtotrainyourdragon,howtotrainyourdragon2,imdb.com/title/tt1646971,httyd")


#5

It looks like these interactions are matching on:

interaction.link contains_any "..., httyd"

In the case of Tweets, interaction.link is the URL to the Tweet. This URL contains a Twitter user's username: 

"interaction": {
    "link": "http://twitter.com/httyd_fishlegs/status/449728675131314176"
}
 

#6

I understand that now. Is there a way to look only at the links within a Tweet, and not at the URL of the Tweet itself, which contains the username?


#7

Use the Links Augmentation. This will only filter on links within the content of the post/Tweet. You could look at filtering on targets such as links.normalized_url.

Here is a full list of Links Augmentation Targets.
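
Applied to the filter posted earlier in this thread, that could look roughly like the sketch below. It removes the interaction.link clause and relies only on Links Augmentation targets. Using links.normalized_url with CONTAINS_ANY alongside links.url is an assumption here, so check it against the targets list before relying on it.

  interaction.type IN "facebook,twitter" AND (
  interaction.author.username IN "HowToTrainYourDragon,HTTYDragon" OR
  interaction.mentions IN "HowToTrainYourDragon,HTTYDragon" OR
  twitter.retweeted.user.screen_name IN "HowToTrainYourDragon,HTTYDragon" OR
  twitter.hashtags CONTAINS_ANY "HowToTrainYourDragon2,httyd,httyd2,toothless" OR
  twitter.retweet.hashtags CONTAINS_ANY "HowToTrainYourDragon2,httyd,httyd2,toothless" OR
  // interaction.link clause removed: for Tweets it is the Tweet's own URL,
  // which includes the author's handle (e.g. httyd_fishlegs)
  links.url CONTAINS_ANY "howtotrainyourdragon,howtotrainyourdragon2,imdb.com/title/tt1646971,httyd" OR
  // Assumed addition: the same terms against the normalized link target
  links.normalized_url CONTAINS_ANY "howtotrainyourdragon,howtotrainyourdragon2,imdb.com/title/tt1646971,httyd")

With interaction.link gone, a Tweet from @httyd_fishlegs should only match if it actually contains a link, hashtag, or mention covered by one of the remaining clauses.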


#8

Thanks, Jason. Updating my profiles now.