1

If I want to create a dictionary where the user can define a custom alphabet (still using Unicode), is there a way to change the lowercase and uppercase mappings of the characters?

Let's say I want the lowercase of 'I' to be 'ı' instead of 'i', or the uppercase of 'b' to be 'P' instead of 'B', so that System.out.println("PAI".toLowerCase()); would write baı to the console.

I suppose I can create a method toLowerCase(String s) that first replaces each "P" with "b" and then converts to lowercase, but wouldn't that be slower when searching through a dictionary of hundreds of thousands of words?

WVrock
  • My gut feeling is that it would be *faster* than searching through a dictionary of hundreds of thousands of words... – aioobe Jun 09 '15 at 14:39
  • The method I'm talking about will first replace and then call toLowerCase(), so how can it be faster? If you have figured out a faster way, please explain. – WVrock Jun 09 '15 at 14:45

4 Answers

1

String.toLowerCase() uses the default locale to decide how to convert the characters. You would have to define your own locale and then, for example, set it as the default via Locale.setDefault(Locale) before calling toLowerCase().
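To make the locale idea concrete, here is a minimal sketch using the built-in Turkish locale, which already lowercases 'I' to the dotless 'ı'. The class and method names are made up for illustration. As far as I know, the JDK's locale-sensitive case rules only exist for a handful of languages, so this probably cannot be stretched to an arbitrary mapping such as 'P' to 'b'.

```java
import java.util.Locale;

// Hypothetical demo class; only the String and Locale calls come from the JDK.
class LocaleCaseDemo {

    // Lowercases using Turkish rules, where 'I' maps to dotless 'ı' (U+0131).
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        // Turkish rules: the 'I' becomes dotless 'ı'
        System.out.println(lowerTurkish("PAI"));
        // Locale-neutral rules: the 'I' becomes a plain 'i'
        System.out.println("PAI".toLowerCase(Locale.ROOT));
    }
}
```

Passing the locale explicitly to toLowerCase(Locale) avoids changing the JVM-wide default, which would also affect unrelated code.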

Paizo
1

No, it would not be noticeably slower: you are simply traversing the character array once without moving any elements, which is O(n) in the length of the string. Performance shouldn't be affected in practice, and any system should easily handle a single replacement pass followed by a toLowerCase call.

You could also override the toLowerCase(String s) function to accommodate your needs. Even simpler!
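A minimal sketch of the replace-then-lowercase helper discussed here, assuming only the two custom mappings from the question ('P' to 'b' and 'I' to 'ı'). The class name is hypothetical. Each replace call is one pass over the string, so k custom mappings cost k passes, which is still linear in the word length.

```java
import java.util.Locale;

// Hypothetical helper class for illustration.
class ReplaceThenLower {

    // Applies the custom mappings first, then the standard lowercase conversion.
    static String toLowerCase(String s) {
        return s.replace('P', 'b')       // custom: 'P' lowercases to 'b'
                .replace('I', '\u0131')  // custom: 'I' lowercases to dotless 'ı'
                .toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toLowerCase("PAI"));
    }
}
```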

Toby Caulk
  • You can't override `toLowerCase` since the `String` class is `final`. – Hummeling Engineering BV Jun 09 '15 at 14:53
  • I need more elaboration on this. If I have 10 custom mappings, I would need to call `string.replaceAll(Capital, lower);` 10 times for each word. How would this not slow things down? – WVrock Jun 09 '15 at 14:53
  • @HummelingEngineering We cannot override `String.toLowerCase()`, but we can override a custom-made `toLowerCase(String s)` method. – WVrock Jun 09 '15 at 14:54
  • True, I forgot about that. The above answer mentions creating your own implementation of a String class; that is the better answer. – Toby Caulk Jun 09 '15 at 14:55
  • @WVrock Yes you would, but that is still only traversing an array and not modifying the positions of any data in that array. You're simply performing maybe two actions on a character and then moving on to the next, so you would not see a performance hit. Sure, the time to complete the conversions will scale with the size of the dictionary provided, but that's not a huge issue and there's no way around that. – Toby Caulk Jun 09 '15 at 14:57
  • @WVrock agreed, when you've defined a new class with that functionality. Iterating over the string and replacing characters using a predefined `HashMap` would be my solution. – Hummeling Engineering BV Jun 09 '15 at 15:00
  • @WVrock Having a map would make your life easier, but my point still stands; you will most likely be able to perform your needed actions on a set of thousands of words in a blink of an eye. CPUs are quite fast, this won't bog one down. – Toby Caulk Jun 09 '15 at 15:15
  • @TobyCaulk It seems that you are right: running "PAI".toLowerCase() and HummelingEngineering's MyString.toLowerCase() 200_000 times took less than 100 milliseconds. – WVrock Jun 09 '15 at 15:34
  • Ah, but the real test would be to throw a couple hundred letters at the program and see your performance then. If my math is correct, then 100 letters at 200,000 cycles would take 10 seconds (or 10,000 milliseconds). Still pretty damn fast, but if you use a profiler you can probably see what's bogging it down and possibly squeeze more performance out of it! – Toby Caulk Jun 09 '15 at 15:39
  • A string with 100 "P" characters took less than 400 ms using @HummelingEngineering's class. – WVrock Jun 09 '15 at 15:50
  • @WVrock I was doing my math with all three characters being mapped to new values. Oops! Anyway, glad you solved your problem! – Toby Caulk Jun 09 '15 at 15:52
0

Check this answer: you cannot inherit from the String class because it is final, but you could create your own class with its own toLowerCase method. I suggest giving that method a different name for maintainability.

As for the dictionary of hundreds of thousands of words, you could use a Map (for example a HashMap) where the key is the string entered by the user and the value is its lowercase form, stored automatically. It depends on what you need.

For better performance, though, I would recommend storing the values in a database.
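One way to read the Map suggestion is as a cache of lowercase forms, so each distinct word is converted only once. This is a rough sketch under that assumption. The class and method names are hypothetical, and Locale.ROOT lowercasing stands in for whatever custom conversion is actually used. Note that a HashMap only supports exact-key lookups, not prefix matching.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical cache: original word -> its lowercase form.
class LowercaseCache {

    private final Map<String, String> lowerByWord = new HashMap<>();

    // Returns the cached lowercase form, computing and storing it on first use.
    String lower(String word) {
        return lowerByWord.computeIfAbsent(word, w -> w.toLowerCase(Locale.ROOT));
    }

    // Number of distinct words converted so far.
    int size() {
        return lowerByWord.size();
    }

    public static void main(String[] args) {
        LowercaseCache cache = new LowercaseCache();
        System.out.println(cache.lower("School"));
        System.out.println(cache.lower("School")); // second call hits the cache
        System.out.println(cache.size());
    }
}
```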

Regards.

Community
  • Maps with keys wouldn't work. I want half-written words to find the closest match, like "scho" finding "school". That answer, I'm afraid, is beyond my Java knowledge, though I might try it some day. – WVrock Jun 09 '15 at 15:06
  • It seems I've read the wrong answer. Using a string wrapper would be no different (in performance) than creating a toLowerCase(String s) method. – WVrock Jun 09 '15 at 15:11
  • So, for that functionality, I suggest implementing the logic in the database layer: you could attach an event listener that fires while the user is typing, goes to the database, and runs the search with WHERE field LIKE 'User_Input_%' – jorge polanco Jun 09 '15 at 15:27
0

This should do the trick:

import java.util.HashMap;
import java.util.Map;

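class MyString {

    String string;
    // Custom overrides, consulted before the default Character case mappings.
    static final Map<Character, Character> toLowerCaseMap, toUpperCaseMap;

    static {
        toLowerCaseMap = new HashMap<>();
        toLowerCaseMap.put('I', '\u0131'); // dotless 'ı', as requested in the question
        toLowerCaseMap.put('P', 'b');      // inverse of the 'b' -> 'P' mapping below

        toUpperCaseMap = new HashMap<>();
        toUpperCaseMap.put('b', 'P');
    }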

    MyString(String string) {

        this.string = string;
    }

    String toLowerCase() {

        char[] chars = string.toCharArray();

        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            chars[i] = toLowerCaseMap.containsKey(c) ? toLowerCaseMap.get(c) : Character.toLowerCase(c);
        }

        return new String(chars);
    }

    String toUpperCase() {

        char[] chars = string.toCharArray();

        for (int i = 0; i < chars.length; i++) {
            char c = chars[i];
            chars[i] = toUpperCaseMap.containsKey(c) ? toUpperCaseMap.get(c) : Character.toUpperCase(c);
        }

        return new String(chars);
    }
}
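Since the question asks for an alphabet the user can define, the same idea can be sketched with mappings supplied at runtime instead of hard-coded in a static block. The names CaseMapper and pair are hypothetical and not part of the answer above. Like the original, this works on char values, so it assumes every character is in the Basic Multilingual Plane.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant: custom case pairs are registered at runtime.
class CaseMapper {

    private final Map<Character, Character> lower = new HashMap<>();
    private final Map<Character, Character> upper = new HashMap<>();

    // Registers one custom pair in both directions; returns this for chaining.
    CaseMapper pair(char upperChar, char lowerChar) {
        lower.put(upperChar, lowerChar);
        upper.put(lowerChar, upperChar);
        return this;
    }

    // Uses the custom mapping when present, else the default Character mapping.
    String toLowerCase(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            Character mapped = lower.get(chars[i]);
            chars[i] = (mapped != null) ? mapped : Character.toLowerCase(chars[i]);
        }
        return new String(chars);
    }

    String toUpperCase(String s) {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            Character mapped = upper.get(chars[i]);
            chars[i] = (mapped != null) ? mapped : Character.toUpperCase(chars[i]);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        CaseMapper mapper = new CaseMapper().pair('P', 'b').pair('I', '\u0131');
        System.out.println(mapper.toLowerCase("PAI"));
        System.out.println(mapper.toUpperCase("ba\u0131"));
    }
}
```

A single map lookup per character replaces the containsKey/get pair, and registering each pair in both directions keeps the two conversions consistent with each other.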