1

Elasticsearch 1.7

We would like to test Kuromoji with UniDic on Elasticsearch. Compiling Kuromoji gives us a few jars with different dictionaries.

Is there a simple way to replace the IPADIC-based Kuromoji with the UniDic-based Kuromoji?

Thanks.

tokosh
  • Is the situation any more insightful with Elasticsearch 2.2? E.g., https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-tokenizer.html ? – Ahmed Fasih Mar 29 '16 at 02:11

2 Answers

0

I ended up using the cmecab-java project as a guide to implement an Elasticsearch wrapper around unidic-kuromoji. Older commits of the cmecab-java project contain Lucene plugin wrappers, which need to be adjusted to work as an Elasticsearch plugin.
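To illustrate the general shape of such a wrapper, here is a minimal, dependency-free sketch of the adapter idea: a dictionary-backed segmenter is hidden behind a pull-style token stream that reports a term and its offsets. Every name in it (`UnidicWrapperSketch`, `Segmenter`, `Morpheme`, `TokenStreamAdapter`, `WhitespaceSegmenter`) is hypothetical and not taken from Lucene, Elasticsearch, Kuromoji, or cmecab-java. The whitespace splitter is only a placeholder for the real UniDic tokenizer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these types come from Lucene, Elasticsearch, or Kuromoji.
class UnidicWrapperSketch {

    /** A token as a morphological analyzer might report it: surface form plus start offset. */
    static final class Morpheme {
        final String surface;
        final int start;

        Morpheme(String surface, int start) {
            this.surface = surface;
            this.start = start;
        }
    }

    /** Stand-in for the dictionary-backed tokenizer (for example a UniDic build of Kuromoji). */
    interface Segmenter {
        List<Morpheme> segment(String text);
    }

    /** Placeholder segmenter that splits on spaces; a real one would consult the dictionary. */
    static final class WhitespaceSegmenter implements Segmenter {
        @Override
        public List<Morpheme> segment(String text) {
            List<Morpheme> out = new ArrayList<>();
            int i = 0;
            while (i < text.length()) {
                if (text.charAt(i) == ' ') {
                    i++;
                    continue;
                }
                int start = i;
                while (i < text.length() && text.charAt(i) != ' ') {
                    i++;
                }
                out.add(new Morpheme(text.substring(start, i), start));
            }
            return out;
        }
    }

    /** Adapter exposing a segmenter through a pull-style API, loosely modelled on a token stream. */
    static final class TokenStreamAdapter {
        private final Segmenter segmenter;
        private Iterator<Morpheme> pending;
        private String term;
        private int startOffset;
        private int endOffset;

        TokenStreamAdapter(Segmenter segmenter) {
            this.segmenter = segmenter;
        }

        /** Segments the whole input up front and rewinds the stream. */
        void reset(String text) {
            pending = segmenter.segment(text).iterator();
        }

        /** Advances to the next token; returns false once the input is exhausted. */
        boolean incrementToken() {
            if (pending == null || !pending.hasNext()) {
                return false;
            }
            Morpheme m = pending.next();
            term = m.surface;
            startOffset = m.start;
            endOffset = m.start + m.surface.length();
            return true;
        }

        String term() { return term; }
        int startOffset() { return startOffset; }
        int endOffset() { return endOffset; }
    }

    /** Runs the adapter over the text and renders each token as term[start,end]. */
    static List<String> analyze(String text) {
        TokenStreamAdapter stream = new TokenStreamAdapter(new WhitespaceSegmenter());
        stream.reset(text);
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term() + "[" + stream.startOffset() + "," + stream.endOffset() + "]");
        }
        return out;
    }
}
```

In a real plugin, the adapter would presumably extend Lucene's tokenizer base class and be registered with Elasticsearch through its analysis plugin mechanism. Those parts depend on the Elasticsearch version and are left out here.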

tokosh
-1

tokosh

I don't think there is a simple way to replace it.

Lucene's Kuromoji still has an open issue about this; see https://issues.apache.org/jira/browse/LUCENE-4056