1

I want my code to accept the full 7-bit ASCII character set but reject 8-bit characters. I have tried a regular expression:

user.getFirstName().matches("[\\w\\s]+")
Duncan Jones
  • You won't succeed with a regexp because regexps have nothing to do with ASCII encoding. You should convert your characters to a byte array (specifying the encoding you want) and check this byte array. – Arnaud Denoyelle Jul 15 '13 at 11:39
  • You could go over the String and look at each character to see if it has a codepoint < 128. – Thilo Jul 15 '13 at 11:39
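
The two suggestions in the comments above can be sketched without any regular expression. This is only a minimal illustration: the class name AsciiCheck and both method names are made up here, and the second method uses a charset encoder check in place of an explicit byte-array conversion.

```java
import java.nio.charset.StandardCharsets;

class AsciiCheck {
    // True if every UTF-16 code unit in s is below 128, i.e. 7-bit ASCII.
    static boolean isAsciiByCodePoint(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    // True if the US-ASCII encoder can represent the whole string.
    static boolean isAsciiByEncoder(String s) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        System.out.println(isAsciiByCodePoint("ABC"));     // true
        System.out.println(isAsciiByEncoder("ABC\u017B")); // false
    }
}
```

Note that both methods return true for an empty string, whereas a regex ending in + rejects it.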

2 Answers

4

Java regular expressions have a predefined character class for this set: \p{ASCII}, which is equivalent to [\x00-\x7F]. See the Pattern class.

 "ABC".matches("\\p{ASCII}+");   // true
 "ABCŻ".matches("\\p{ASCII}+");  // false
Grzegorz Żur
  • +1 for pointing this out - although I have worked with Java regexes sometimes, I just realised that I missed a lot of their many options! Looks like Java is the better Perl :-) – Gyro Gearless Jul 15 '13 at 11:55
  • "Looks like Java is the better Perl" How can you say that in the face of the double-backslash? ;-) – Thilo Jul 15 '13 at 11:59
  • @PaulGorbas It is [matches](http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#matches(java.lang.String)). – Grzegorz Żur Apr 09 '15 at 19:12
  • Yep - MATCH is JavaScript and MATCHES is JAVA, sorry missed the language in a false search result. – Paul Gorbas Apr 09 '15 at 20:15
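
To see why the pattern from the question is narrower than \p{ASCII}, here is a small sketch comparing the two on a name that contains ASCII punctuation. The class name AsciiPatternDemo, the constant names and the sample name are illustrative assumptions, not taken from the question.

```java
import java.util.regex.Pattern;

class AsciiPatternDemo {
    // Compiled once so repeated checks do not recompile the expression.
    static final Pattern WORD_OR_SPACE = Pattern.compile("[\\w\\s]+");
    static final Pattern ASCII_ONLY = Pattern.compile("\\p{ASCII}+");

    static boolean isWordOrSpace(String s) {
        return WORD_OR_SPACE.matcher(s).matches();
    }

    static boolean isAsciiOnly(String s) {
        return ASCII_ONLY.matcher(s).matches();
    }

    public static void main(String[] args) {
        String name = "O'Brien"; // hypothetical first name with an apostrophe
        System.out.println(isWordOrSpace(name));      // false: ' is neither \w nor \s
        System.out.println(isAsciiOnly(name));        // true
        System.out.println(isAsciiOnly("\u017Bur"));  // false: U+017B is outside ASCII
    }
}
```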
3

You can specify characters by their hexadecimal code with the '\x' escape (source: http://www.regular-expressions.info/reference.html):

yourString.matches("[\\x00-\\x7F]+");

In Java, the same range can also be written with Unicode escapes:

yourString.matches("[\\u0000-\\u007F]+");
Simon Forsberg
  • One backslash, not two, suffices. One may use such a notation everywhere in Java source, for instance to have nice Javadoc with a math formula, copyright symbol or whatever. – Joop Eggen Jul 15 '13 at 11:50
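
As a quick sanity check, the three notations from the answers (\p{ASCII}, \x00-\x7F and \u0000-\u007F, all written with doubled backslashes) can be compared character by character. This is a sketch only; the class name AsciiRangeEquivalence and the method name agreeOn are hypothetical.

```java
import java.util.regex.Pattern;

class AsciiRangeEquivalence {
    static final Pattern[] PATTERNS = {
        Pattern.compile("\\p{ASCII}+"),
        Pattern.compile("[\\x00-\\x7F]+"),
        Pattern.compile("[\\u0000-\\u007F]+"),
    };

    // True if all three patterns accept c exactly when c is below 128.
    static boolean agreeOn(char c) {
        String s = String.valueOf(c);
        boolean expected = c < 128;
        for (Pattern p : PATTERNS) {
            if (p.matcher(s).matches() != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean allAgree = true;
        // Check every character in the 8-bit range 0x00..0xFF.
        for (char c = 0; c <= 0xFF; c++) {
            allAgree &= agreeOn(c);
        }
        System.out.println(allAgree); // true
    }
}
```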