Encodings in Java library


#1

I have been looking at the HttpThread class in the Java client library and I see:

reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));

I cannot see any handling of encodings here. Will this really work if executed on a JVM
where the default encoding differs from the encoding that your servers use? Do you
set the encoding in your response headers? If not, what encoding do you use? UTF-8?
Even for tweets written in Chinese, Arabic, Farsi or Russian?

L


#2

All responses from our API are returned as UTF-8 encoded JSON. If you receive Tweets written in languages such as Chinese or Arabic, you simply need to run them through a JSON library to decode them.
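Since the stream is UTF-8, one way to make the reader independent of the JVM's default encoding is to pass the charset explicitly to InputStreamReader. The following is only a minimal sketch, not the library's actual code: the class and method names (Utf8StreamExample, utf8Reader, readFirstLine) are made up for illustration, and a ByteArrayInputStream stands in for response.getEntity().getContent().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8StreamExample {

    // Wrap the raw byte stream in a reader that always decodes UTF-8,
    // regardless of the JVM's default charset.
    static BufferedReader utf8Reader(InputStream in) {
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Hypothetical helper: read the first line of the stream, decoded as UTF-8.
    static String readFirstLine(InputStream in) {
        try (BufferedReader reader = utf8Reader(in)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the HTTP response body: one JSON line containing
        // non-ASCII text (written with Unicode escapes so the source file
        // itself stays plain ASCII).
        String json = "{\"content\":\"\u4f60\u597d\"}";
        InputStream body = new ByteArrayInputStream(
                (json + "\n").getBytes(StandardCharsets.UTF_8));
        System.out.println(readFirstLine(body).equals(json));
    }
}
```

With the explicit charset, the decoded string should be the same whatever file.encoding the JVM was started with. The resulting line can then be handed to a JSON library as usual.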

We do specify the encoding in our response headers. Here is an example:

 

* Connected to stream.datasift.com
> GET /<stream_hash> HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: stream.datasift.com
> Accept: */*
> Auth: <username>:<api_key>
< HTTP/1.1 200 OK
< Connection: keep-alive
< Content-Type: application/json; charset=utf-8
< Transfer-Encoding: chunked
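If you would rather honour the charset announced in the Content-Type header than hard-code it, a client could parse the header value along these lines. This is a rough sketch under stated assumptions: ContentTypeCharset and charsetOf are hypothetical names, not part of the client library, and falling back to UTF-8 when the parameter is missing or unknown is a choice made here for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class ContentTypeCharset {

    // Hypothetical helper: extract the charset parameter from a Content-Type
    // header value, falling back to UTF-8 when it is absent or unsupported.
    static Charset charsetOf(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Strip the parameter name and any surrounding quotes.
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return StandardCharsets.UTF_8;
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/json; charset=utf-8"));
    }
}
```

The returned Charset can be passed straight to the InputStreamReader constructor in place of a fixed value.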