
I am working on a project in which I need to rank text documents against a search query, like a search engine does, but the ranking should take the semantic similarity of words or sentences into account. I am not sure how to get started with computing semantic similarity in Java. Is there a link, a paper, or any idea that would help me start computing the semantic similarity of words in documents?

ashkan

2 Answers


The standard way to represent documents in term space is to treat the terms as mutually orthogonal, i.e. independent of each other. For example, the terms "atomic" and "nuclear" are treated as distinct dimensions even though they are near-synonyms and often interchangeable, whereas the semantic similarity between this pair of words should be fairly high.

Thus, to implement a score based on semantic similarity, you need to know how a pair of words is related, for which you can use either of the following:

  • An external resource such as WordNet, or a semantic similarity library such as DISCO.
  • A corpus analysis method such as Latent Semantic Analysis (LSA), which reduces the dimensionality of the term space by combining semantically similar terms such as "atomic" and "nuclear".
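
To make the orthogonality problem concrete, here is a minimal Java sketch. It is not how WordNet, DISCO, or LSA work internally; it only assumes a tiny hand-written synonym table (the hypothetical `CANONICAL` map) standing in for whichever of those resources you end up using. The class and method names (`SemanticCosine`, `termVector`, `cosine`) are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: cosine similarity over term-frequency vectors, with optional synonym merging. */
final class SemanticCosine {

    // Hypothetical hand-written synonym table (assumption for this sketch).
    // A real system would derive word relations from WordNet, DISCO, or an LSA model.
    static final Map<String, String> CANONICAL = new HashMap<>();
    static {
        CANONICAL.put("nuclear", "atomic");
    }

    /** Builds a term-frequency vector; if mergeSynonyms is true, synonyms share one dimension. */
    static Map<String, Integer> termVector(String text, boolean mergeSynonyms) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            String term = mergeSynonyms ? CANONICAL.getOrDefault(token, token) : token;
            tf.merge(term, 1, Integer::sum);
        }
        return tf;
    }

    /** Cosine similarity between two sparse term-frequency vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        String query = "atomic";
        String doc = "nuclear reactor";

        // Plain term space: "atomic" and "nuclear" are orthogonal, so nothing matches.
        System.out.println(cosine(termVector(query, false), termVector(doc, false)));

        // Synonyms merged into one dimension: the document now matches the query.
        System.out.println(cosine(termVector(query, true), termVector(doc, true)));
    }
}
```

With plain term vectors the query and the document share no dimension, so the score is zero; once the two synonyms are mapped to the same dimension, the score becomes positive. To rank documents, you would compute this score for each document against the query and sort in descending order.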
Debasis

Have a look at this demo for semantic similarity.

It demonstrates several different algorithms, so you can see which one works for you and go with it. I think the "semilar" module can also be used from Java. You can try using it; I haven't tried it myself yet, but the demo on that page is for the same module. Thanks :)

Gunjan