Case-sensitivity-turkish-non-ascii-letters


#1

This is linked to:

http://dev.datasift.com/discussions/case-sensitivity-non-ascii-letters
and:
http://dev.datasift.com/issues/closed/datasift-does-not-currently-associate-upper-and-lower-case-versions-same-accented

It seems that this issue has been resolved for some characters, but not all:

I added this line to my filter and tested both the uppercase and lowercase scenarios:

interaction.content contains_any "kişi " AND interaction.content contains_any "ertuğ, ertug"

If we tweet:

xxxxxx kişi xxxxxx ertuğ

It is captured without a problem, but the match fails when we tweet this:

xxxxxx KİŞİ xxxxxx ERTUĞ

Likewise, it fails when we tweet:

xxxxxx kişi xxxxxx ERTUĞ

Conversely, it captures the same tweet if we write the capital English "I" instead of the Turkish "İ":

xxxxxx KIŞI xxxxxx ERTUĞ

We concluded that the filter can compare strings regardless of case in general, but it cannot map Turkish letters to their uppercase equivalents during the comparison.
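
To illustrate the kind of case mapping that could produce the "İ" result, here is a minimal Python sketch. It is only an assumption about the cause: it says nothing about how DataSift actually implements contains_any, and the helper names turkish_lower and contains_ci are hypothetical. It compares Python's default (locale-independent) lowercasing with a Turkish-aware one.

```python
def turkish_lower(text: str) -> str:
    """Hypothetical helper: lowercase using Turkish rules for I/İ.

    In Turkish, dotted capital İ (U+0130) pairs with i, and
    dotless capital I pairs with ı (U+0131).
    """
    return text.replace("İ", "i").replace("I", "ı").lower()


def contains_ci(haystack: str, needle: str, fold=str.lower) -> bool:
    """Case-insensitive substring test using the given folding function."""
    return fold(needle) in fold(haystack)


tweet = "xxxxxx KİŞİ xxxxxx ERTUĞ"

# Default mapping: İ lowercases to "i" plus a combining dot (U+0307),
# so the plain "i" in the keyword no longer lines up with it.
print(contains_ci(tweet, "kişi"))                      # False

# Turkish-aware mapping: İ lowercases to a plain "i".
print(contains_ci(tweet, "kişi", fold=turkish_lower))  # True

# Default mapping with the English capital I, as in the last test tweet.
print(contains_ci("xxxxxx KIŞI xxxxxx ERTUĞ", "kişi"))  # True
```

Under the default mapping this reproduces the results for "KİŞİ" and "KIŞI". It does not explain the failure on "ERTUĞ", because Python's default lowercasing maps "Ğ" to "ğ" correctly, so that case may have a different cause. Note also that with Turkish rules "KIŞI" lowercases to "kışı", so a Turkish-aware comparison would no longer match it against "kişi".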

It seems that the bug mentioned above still exists for some Turkish characters. Would you agree?