11

Is there an open-source language recognition library for Java? I have only found ones for C/C++.

Update:

I'm talking about the human language a text is written in. Example:

Input: My name is John. Output: English.

Input: Ich heisse John. Output: German.

Input: Меня зовут Джон. Output: Russian.

Artem Oboturov
Yurish
  • Please tell us what sort of software you want. Should it be a formal automaton, recognizing whether a string is in a particular formal language? Should it tell what human language a text is in? Tell what language some source code is written in? Tell what language some executable might have been written in? Recognize whether sounds are words or just noises? Recognize what language people are talking in? – David Thornley Feb 22 '10 at 16:40
  • Bit picky, but had to -1 since no research effort is shown... it is a good question though, so I favourited it. – icedwater Jul 25 '13 at 05:02

4 Answers

13

See what you think of the version in Apache Tika. This assumes that you want to find out what language text is in, as opposed to wanting to build a parser for a programming language.

bmargulies
3

Textcat http://textcat.sourceforge.net/ doesn't have Russian but it does handle the following:

  • albanian
  • danish
  • dutch
  • english
  • finnish
  • french
  • german
  • hungarian
  • italian
  • norwegian
  • polish
  • slovakian
  • slovenian
  • spanish
  • swedish
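
For context, TextCat-style tools are generally built on character n-gram profiles (the Cavnar and Trenkle approach): each language gets a ranked list of its most frequent n-grams, and an input is assigned to the language whose ranking is closest. Below is a minimal sketch of that idea, not the library's actual API. The class name, the tiny training sentences, and the constants are all made up for illustration; a real profile would be trained on far more text.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of n-gram based language guessing (not the TextCat API).
class NGramLanguageGuesser {
    static final int PROFILE_SIZE = 300; // assumed cap on n-grams kept per profile
    static final int MAX_PENALTY = 300;  // assumed cost for an n-gram missing from a profile

    // Toy training data (assumption): one sentence per language.
    static final Map<String, String> TRAINING = new LinkedHashMap<>();
    static {
        TRAINING.put("english", "My name is John and I live in London");
        TRAINING.put("german", "Ich heisse John und ich wohne in Berlin");
        TRAINING.put("french", "Je m'appelle John et j'habite a Paris");
    }

    // Rank the character 1- to 3-grams of a text, most frequent first.
    static List<String> profile(String text) {
        // Lowercase and collapse every run of non-letters into one '_' marker.
        String s = ("_" + text.toLowerCase(Locale.ROOT) + "_").replaceAll("[^\\p{L}]+", "_");
        Map<String, Integer> counts = new HashMap<>();
        for (int n = 1; n <= 3; n++) {
            for (int i = 0; i + n <= s.length(); i++) {
                counts.merge(s.substring(i, i + n), 1, Integer::sum);
            }
        }
        List<String> grams = new ArrayList<>(counts.keySet());
        // Sort by descending count; break ties alphabetically so the order is deterministic.
        grams.sort(Comparator.comparingInt((String g) -> -counts.get(g))
                .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(grams.subList(0, Math.min(PROFILE_SIZE, grams.size())));
    }

    // "Out-of-place" distance: sum of rank differences between the two profiles.
    static int distance(List<String> doc, List<String> lang) {
        int total = 0;
        for (int i = 0; i < doc.size(); i++) {
            int j = lang.indexOf(doc.get(i));
            total += (j < 0) ? MAX_PENALTY : Math.abs(i - j);
        }
        return total;
    }

    // Return the training language whose profile is closest to the input text.
    static String detect(String text) {
        List<String> doc = profile(text);
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Map.Entry<String, String> e : TRAINING.entrySet()) {
            int d = distance(doc, profile(e.getValue()));
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(detect("My name is John."));
        System.out.println(detect("Ich heisse John."));
    }
}
```

With training data this small the guesser only works on inputs close to the training sentences. It is meant to show the mechanism, so for anything real use a library with properly trained profiles.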
Paul Gregoire
1

There is the Language Detection API, which accepts text via HTTP POST and returns JSON with the detected languages and their scores. It can be used from Java or any other programming language.
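
As a rough sketch of what calling such a service from Java could look like, the snippet below builds a form-encoded POST request with the JDK's built-in `java.net.http` classes (Java 11+). The endpoint URL and the `q` parameter name are placeholders I made up, not the real service's values. Check the service's documentation for the actual URL, parameters, and any required API key.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: building a POST request for a language detection web service.
class DetectRequestSketch {
    // Placeholder endpoint (assumption); replace with the URL from the service's docs.
    static final String ENDPOINT = "https://api.example.invalid/detect";

    // Form-encode the text under an assumed parameter name "q".
    static String formBody(String text) {
        return "q=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
    }

    // Build (but do not send) the POST request.
    static HttpRequest buildRequest(String text) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(text)))
                .build();
    }
}
```

To send it, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and parse the returned JSON string with whatever JSON library you already use. The response layout depends on the service, so I have left parsing out.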

Laurynas
0

I think ANTLR is pretty much the standard.

Bozho
  • 2
    One of us is confused. I thought he wanted a way to tell if text was in Chinese or Japanese, and you think he wants to make a parser! We'll see. – bmargulies Feb 22 '10 at 13:15
  • 1
    @bmargulies - it could not be inferred from the question, so both answers make sense. – Bozho Feb 22 '10 at 13:21