This document outlines functional changes in PYLON GA 1.6, slated for release on November 18th, 2015.
Super Public Text Samples
PYLON users need access to the text of stories in order to validate their CSDL filters and tag rules. To meet this need, DataSift will deliver to customers the text of Super Public stories which match their filtering rules. Super Public stories may be displayed in social tech applications subject to DataSift license terms (login required).
Data Collection & Caching
Any Super Public story which matches a CSDL condition in a running recording will be cached for you to retrieve. You do not need to take any action to ensure that Super Public stories matching your filtering criteria are returned for each recording. Super Public data has been cached from November 11th for all running recordings, and will be available for you to retrieve starting on November 18th. There will be no additional cost to access Super Public text samples.
“Super Public” Definition
Facebook defines stories as Super Public when they meet 3 criteria:
- Posted by someone who has “Who can see your future posts?” set to “Public” under Privacy Settings
- Posted by someone who has the Follow setting enabled, allowing non-friends to see their stories
- Story is not posted to someone else’s Timeline
The Super Public feed is not sampled or reduced upstream of DataSift’s filtering engine; it contains all stories meeting the Super Public definition.
Rate Limiting & Query Filtering
DataSift will deliver up to 100 stories per recording per hour. The rate limit is applied at the point of retrieval. Customers can use query filters and time ranges to restrict their samples to stories meeting certain criteria, to ensure the stories they retrieve are relevant to a filter-validation or classification use case. Each story will be delivered only once and is then removed from the delivery queue. The delivery mechanism is a new REST API endpoint: pylon/sample.
Demographic metadata is not available for filtering Super Public data, so primary interaction filtering rules which use targets in the fb.parent.author.* namespaces will not have Super Public stories indexed based on those specific filtering criteria. Sentiment is also not available for filtering or in the output.
All other metadata, including topics, links, hashtags and VEDO tags, is both used as filtering criteria for Super Public stories and returned in the output.
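As a rough illustration of the retrieval constraints described above, the following sketch assembles query parameters for a sample request. The parameter names (hash, count, start, end, filter), the Unix-timestamp format and the example filter are assumptions made for illustration and are not confirmed by this document; only the cap of 100 stories per recording per hour comes from the text.

```python
# Cap taken from the text: up to 100 stories per recording per hour.
MAX_STORIES_PER_HOUR = 100


def build_sample_params(recording_hash, start=None, end=None,
                        csdl_filter=None, count=MAX_STORIES_PER_HOUR):
    """Build a parameter dict for a pylon/sample request.

    All parameter names here are hypothetical; check the API reference
    before relying on them.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if start is not None and end is not None and start >= end:
        raise ValueError("start must be earlier than end")

    # Never ask for more than the hourly limit allows.
    params = {"hash": recording_hash, "count": min(count, MAX_STORIES_PER_HOUR)}
    if start is not None:
        params["start"] = int(start)  # assumed: Unix timestamp in seconds
    if end is not None:
        params["end"] = int(end)
    if csdl_filter:
        params["filter"] = csdl_filter  # assumed: a CSDL query filter string
    return params


# Example: request more than the limit and let the helper clamp it.
params = build_sample_params(
    "hypothetical-recording-hash",
    start=1447804800,
    end=1447808400,
    csdl_filter='fb.content contains "coffee"',  # illustrative filter only
    count=500,
)
```

Because each story is delivered only once, an application would normally store the returned stories locally rather than expect to fetch them again.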
Category/Topic and Country/Region Pairs
Facebook Topics exist in a two-tier hierarchy. Every topic belongs to a single overarching category. In versions preceding PYLON 1.6, topics and their categories could only be analyzed separately. Because multiple topics are attached to each story based on the text, even when using analysis filters to restrict results to stories with topics in specific categories, topics on those stories may appear in analysis results which lie outside of the desired categories. To solve this, we are introducing new targets for interaction filtering and analysis, fb.topics.category_name and fb.parent.topics.category_name, which contain pipe-separated pairs of categories and their topics. This allows customers to sort and filter results by their categories for display. Similarly, we have introduced fb.author.country_region and fb.parent.author.country_region to allow filtering and analysis on country/region pairs. These targets are available now and populated on data indexed since November 11th.
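To show how the pipe-separated pairs can be used for display, here is a minimal sketch that groups topics under their categories. It assumes each value has the form "Category|Topic", with the category first; the sample values are made up for illustration.

```python
from collections import defaultdict


def group_topics_by_category(pairs):
    """Group topic names under their category.

    Assumes each value looks like "Category|Topic" (category first).
    Values without a pipe are skipped.
    """
    grouped = defaultdict(list)
    for pair in pairs:
        category, sep, topic = pair.partition("|")
        if not sep:
            continue  # not in pair form, ignore
        grouped[category].append(topic)
    return dict(grouped)


# Hypothetical values of fb.topics.category_name from analysis results.
pairs = ["Cars|BMW", "Cars|Audi", "Music|Jazz"]
grouped = group_topics_by_category(pairs)
```

Filtering the grouped result to the categories of interest keeps topics from other categories out of the display.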
Get Recording Metadata Across All Identities
In versions preceding PYLON 1.6, the /pylon/get endpoint returned information only for recordings belonging to the identity performing the request. In order to make it easier for applications to understand usage across identities representing all of their end customers, in 1.6 DataSift will allow customers to retrieve information about recordings from all identities in requests authenticated with their account-level, as opposed to identity-level, DataSift account API key. That key is listed on your account page (login required). Pagination of results will generally be needed due to the volume of the response, and this is supported with cursor specification via the “page” parameter.
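The pagination described above could be handled with a loop along the following lines. This is only a sketch: fetch_page is a hypothetical callable standing in for an authenticated /pylon/get request, and the per_page argument, the 1-based page numbering and the "empty page means done" stopping rule are assumptions rather than documented behaviour.

```python
def iter_recordings(fetch_page, per_page=20, max_pages=1000):
    """Yield recordings from every page until an empty page is returned.

    fetch_page(page=..., per_page=...) is a hypothetical function that
    returns a list of recording dicts for the requested page.
    """
    page = 1
    while page <= max_pages:  # guard against an endless loop
        batch = fetch_page(page=page, per_page=per_page)
        if not batch:
            break
        for recording in batch:
            yield recording
        page += 1


# Stand-in for the real API call, used only to demonstrate the loop.
_FAKE_RECORDINGS = [{"name": "recording-%d" % i} for i in range(45)]


def fake_fetch_page(page, per_page):
    first = (page - 1) * per_page
    return _FAKE_RECORDINGS[first:first + per_page]


all_recordings = list(iter_recordings(fake_fetch_page, per_page=20))
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client.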
In PYLON 1.6, you will be able to receive email notifications at the address with which your PYLON user account is registered when your account reaches 50%, 90% and 100% of daily data volume capacity. To enable these notifications, update the notification settings for your account.
Topic Data Availability for Additional Countries
On November 10th, 2015, 25 new countries and territories were made available to all PYLON customers: Argentina (AR), Bolivia (BO), Brazil (BR), Chile (CL), Colombia (CO), Costa Rica (CR), Cuba (CU), Dominican Republic (DO), Ecuador (EC), El Salvador (SV), French Guiana (GF), Guadeloupe (GP), Guatemala (GT), Haiti (HT), Honduras (HN), Martinique (MQ), Nicaragua (NI), Panama (PA), Paraguay (PY), Peru (PE), Puerto Rico (PR), Saint Barthelemy (BL), Saint Martin (French Part) (MF), Uruguay (UY) and Venezuela (VE).
If you do not wish to have data from these countries flowing into your recording, you should ensure that you are using CSDL to restrict data to specific countries of interest. For example, this rule restricts a recording to stories and engagements from people whose Current City listed on their profile is in Spain or Portugal:
fb.author.country_code in "ES, PT" or fb.parent.author.country_code in "ES, PT"
If you do not restrict data in your filters by country, you will receive data from all 82 countries currently available.
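For applications that manage the country restriction programmatically, a small helper can generate the rule shown above from a list of country codes. The helper name and its validation are illustrative assumptions; the output format simply mirrors the Spain and Portugal example.

```python
def country_filter(codes):
    """Build a CSDL rule restricting stories and engagements to the
    given two-letter country codes, in the same form as the example rule.
    """
    cleaned = [code.strip().upper() for code in codes if code.strip()]
    if not cleaned:
        raise ValueError("at least one country code is required")
    for code in cleaned:
        if len(code) != 2 or not code.isalpha():
            raise ValueError("not a two-letter country code: %r" % code)

    joined = ", ".join(cleaned)
    return (
        'fb.author.country_code in "%s" '
        'or fb.parent.author.country_code in "%s"' % (joined, joined)
    )


rule = country_filter(["es", "pt"])
```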