I have browsed the DataSift and Twitter developer sites and have not been able to find any documentation on which parts of a tweet are guaranteed to be present and which parts may sometimes be left out. I understand that parts such as geo and place may be left out, but what about parts such as twitter.id, twitter.user.id_str, twitter.user.name, and twitter.user.screen_name? Do you have such documentation anywhere? If not, do you know whether Twitter has such documentation anywhere? Is the only safe option to assume that nothing is guaranteed to be present?
Good question. Due to differences between Tweets and Retweets, no Twitter field can be guaranteed to be present in an interaction. For example:
['twitter']['links'] or ['twitter']['geo'] will only be present if links or geo data exist.
['twitter']['user']['id'] will be present if the interaction is an original Tweet; if it is a Retweet, it will be replaced with ['twitter']['retweet']['user']['id'].
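To make the point above concrete, here is a minimal sketch in Python of reading these fields defensively. It assumes the interaction has already been parsed from JSON into nested dicts. The helper names get_path and author_id and the sample payloads are made up for illustration; only the field paths come from the examples above.

```python
from typing import Any, Optional


def get_path(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts; return `default` as soon as a key is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def author_id(interaction: dict) -> Optional[int]:
    """Return the user id for a Tweet or a Retweet, or None if neither path exists."""
    user_id = get_path(interaction, "twitter", "user", "id")
    if user_id is None:
        # Retweets carry the user under twitter.retweet.user instead.
        user_id = get_path(interaction, "twitter", "retweet", "user", "id")
    return user_id


# Hypothetical, heavily trimmed payloads used only to exercise the helpers.
tweet = {"twitter": {"user": {"id": 123}}}
retweet = {"twitter": {"retweet": {"user": {"id": 456}}}}

tweet_author = author_id(tweet)
retweet_author = author_id(retweet)
links = get_path(tweet, "twitter", "links", default=[])
```

Passing an explicit default (an empty list for links, None for ids) keeps the calling code free of KeyError handling while still letting it tell "absent" apart from a real value.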
If we were to write code that only looks at tweets, would all fields under twitter.user.* then always be present? Or conversely, if we were to restrict ourselves to retweets only, would all fields under twitter.retweet.user.* and/or twitter.retweeted.user.* always be present? I am especially interested in the name, screen name, and id_str fields. Are these always set in the user part, wherever in the JSON tree the user part appears?
As I'm sure you already know, there are three types of Twitter user objects we return:
The following user fields will always exist in each interaction: