
I have a set of hyphenated strings that I want to sort according to the locale.

import java.text.Collator;
import java.util.*;

List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");

Collator collator = Collator.getInstance(new Locale("en"));

// Sort Method 1: plain case-insensitive comparison, no locale involved
Collections.sort(words, String.CASE_INSENSITIVE_ORDER);
System.out.println(words);

// Sort Method 2: locale-aware collator at PRIMARY strength
collator.setStrength(Collator.PRIMARY);
Collections.sort(words, collator);
System.out.println(words);

Result

String.CASE_INSENSITIVE_ORDER

[App - Big, App - Small, Apple]  

Collator.PRIMARY

[App - Big, Apple, App - Small]

Although Collator.PRIMARY is supposed to give a case-insensitive sort, the two methods above produce different orders. How can I achieve a locale-based, case-insensitive sort order that also handles hyphens?

[App - Big, App - Small, Apple] - Expected sort order

Daniel Widdis
aquitted-mind
  • I am using a TreeMap, so I wrote that initially. I edited the question to remove it from the subject. Regarding the sort order, that is the expected output from my application. – aquitted-mind Oct 28 '13 at 17:21

2 Answers


There is no case-sensitivity issue involved. The collator ignores spaces and hyphens, so, since all strings start with “App”, the significant letters in your example are “S”, “l”, and “B”, and the resulting order “B”, “l”, “S” is correct.
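
To make this visible, here is a minimal sketch. The class name IgnorableDemo and the helper significant are hypothetical names for illustration only; the helper just strips spaces and hyphens to approximate what the collator looks at on the primary level, it is not how the collator is implemented.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class IgnorableDemo {
    // Returns a sorted copy using an English collator at PRIMARY strength,
    // the same setup as "Sort Method 2" in the question.
    static List<String> sortPrimary(List<String> input) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        List<String> copy = new ArrayList<>(input);
        copy.sort(collator);
        return copy;
    }

    // Hypothetical helper (assumption): removes spaces and hyphens to show
    // roughly which characters still matter for the comparison.
    static String significant(String s) {
        return s.replaceAll("[ -]", "");
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");
        for (String w : words) {
            System.out.println(w + " -> " + significant(w));
        }
        // The question reports [App - Big, Apple, App - Small] for this sort.
        System.out.println(sortPrimary(words));
    }
}
```

With the spaces and hyphens stripped, the strings read AppSmall, Apple and AppBig, which makes it easier to see why “Apple” lands between the other two.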

Holger
  • Take a look at http://stackoverflow.com/questions/16567287/java-collation-ignores-space for more info. – Chill Oct 28 '13 at 17:25

The result depends not only on the strength but also on the collation rules. Enclose the hyphen ('-') in single quotes in the rules and you will get the desired output.

Below is a quote from the API documentation.

The definitions of the rule elements is as follows:

  • Text-Argument: A text-argument is any sequence of characters, excluding special characters (that is, common whitespace characters [0009-000D, 0020] and rule syntax characters [0021-002F, 003A-0040, 005B-0060, 007B-007E]). If those characters are desired, you can put them in single quotes (e.g. ampersand => '&'). Note that unquoted white space characters are ignored; e.g. b c is treated as bc.

http://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html#compare(java.lang.String, java.lang.String)
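
As a rough sketch of the quoting described above, the following builds a RuleBasedCollator from scratch with the space and the hyphen quoted, so that both get a primary weight and sort before the letters. The class and method names (QuotedRulesDemo, buildCollator, sortWithQuotedRules) are made up for this example, and the rule string is an assumption that only covers a-z and A-Z; it is not the English locale's rule set, so treat it as an illustration of the syntax rather than a drop-in replacement.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QuotedRulesDemo {
    // Builds rules of the form "< ' ' < '-' < a, A < b, B ...".
    // The quoted space and hyphen are treated as ordinary characters
    // instead of being skipped as rule syntax.
    static RuleBasedCollator buildCollator() {
        StringBuilder rules = new StringBuilder("< ' ' < '-'");
        for (char c = 'a'; c <= 'z'; c++) {
            // "," marks a tertiary (case) difference, ignored at PRIMARY strength.
            rules.append(" < ").append(c).append(", ").append(Character.toUpperCase(c));
        }
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules.toString());
            collator.setStrength(Collator.PRIMARY);
            return collator;
        } catch (ParseException e) {
            throw new IllegalStateException("Invalid collation rules", e);
        }
    }

    // Returns a sorted copy so the caller's list is left untouched.
    static List<String> sortWithQuotedRules(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(buildCollator());
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("App - Small", "Apple", "App - Big");
        System.out.println(sortWithQuotedRules(words));
    }
}
```

Because the space now sorts before every letter, “App - Small” comes before “Apple”, which matches the order the question asks for. Characters outside the rule string are not covered by this sketch.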

Abhijith Nagarajan