How much data is available from Japan?


We’re currently evaluating DataSift, and one of our requirements is to be able to get content in Japanese.
I tend to think this is possible, since most of the target sources are multilingual, but I would like to have confirmation.


With DataSift you can filter for, and receive, data in almost every modern language. For example, a stream of Japanese content can be created with the following CSDL:

  interaction.content contains_any "日本国, 東京, 日本東京"

We also offer access to the Japanese text-board 2Channel.


I am testing your service mainly for Japanese tweets.
But a Japanese tweet with no spaces, such as “現在の東京都知事”, presumably cannot be found with the CSDL interaction.content contains "東京", right?
If that is true, it is a huge problem for me…


If you are looking for 'phrases' within words or Tweets that contain no spaces, you could use the substr operator:

  interaction.content substr "東京"

Take a look at our documentation to see what kind of Targets and Operators are available to filter with.
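
To illustrate why the two operators behave differently on text without spaces, here is a rough Python sketch. It is not DataSift's implementation: it simply assumes that contains matches whole tokens delimited by whitespace or punctuation, while substr matches any run of characters. The function names contains and substr below are hypothetical stand-ins for the CSDL operators.

```python
import re


def contains(content: str, keyword: str) -> bool:
    """Assumed token-based match: the keyword must equal a whole token.

    Tokens are taken to be runs of word characters, so a Japanese
    sentence written without spaces becomes a single long token.
    """
    tokens = re.findall(r"\w+", content)
    return keyword in tokens


def substr(content: str, keyword: str) -> bool:
    """Plain substring match: the keyword may appear anywhere."""
    return keyword in content


tweet = "現在の東京都知事"

print(contains(tweet, "東京"))                # token match on unspaced text
print(substr(tweet, "東京"))                  # substring match on unspaced text
print(contains("I love 東京 today", "東京"))  # token match when spaces delimit the word
```

Under that assumption, the token match misses the unspaced tweet because the whole sentence is one token, whereas the substring match finds 東京 inside it. The trade-off is that a substring match can also hit longer words that merely include the keyword.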


Thank you for your reply.
I was able to get lots of the tweets I expected with the substr operator.