The Semantic Vectors package creates semantic vector indexes by applying a Random Projection algorithm to term-document matrices built with Apache Lucene. The result is a WordSpace model of the kind developed by Stanford University's Infomap Project and other researchers during the 1990s and early 2000s. Such models represent words and documents in terms of underlying concepts, so they can be used for many semantic (concept-aware) matching tasks, such as automatic thesaurus generation, knowledge representation, and concept matching. Random Projection is a form of automatic semantic analysis similar to Latent Semantic Analysis (LSA) and its variants, such as Probabilistic Latent Semantic Analysis (PLSA).
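To make the idea of random projection concrete, here is a minimal sketch in Java. It is not the Semantic Vectors API: the class `RandomProjectionSketch`, its methods, and the tiny hard-coded term-document matrix are all hypothetical, and the sparse ternary index vectors are one common form of random projection chosen for illustration. The real package reads its term-document counts from a Lucene index rather than from an array.

```java
import java.util.Random;

// Hypothetical illustration only; none of these names come from the Semantic Vectors package.
final class RandomProjectionSketch {

    /** Builds a sparse ternary vector: `seeds` entries of +1, `seeds` entries of -1, the rest 0. */
    static float[] randomIndexVector(int dim, int seeds, Random rng) {
        if (2 * seeds > dim) {
            throw new IllegalArgumentException("need 2 * seeds <= dim");
        }
        float[] v = new float[dim];
        int placed = 0;
        while (placed < 2 * seeds) {
            int pos = rng.nextInt(dim);
            if (v[pos] != 0f) {
                continue; // position already used, draw again
            }
            v[pos] = (placed < seeds) ? 1f : -1f;
            placed++;
        }
        return v;
    }

    /**
     * Projects a (terms x documents) count matrix down to (terms x dim).
     * Each document gets a random index vector; a term vector is the
     * count-weighted sum of the index vectors of the documents it occurs in.
     */
    static float[][] project(int[][] termDoc, int dim, int seeds, long seed) {
        int numTerms = termDoc.length;
        int numDocs = termDoc[0].length;
        Random rng = new Random(seed);

        float[][] docIndex = new float[numDocs][];
        for (int d = 0; d < numDocs; d++) {
            docIndex[d] = randomIndexVector(dim, seeds, rng);
        }

        float[][] termVectors = new float[numTerms][dim];
        for (int t = 0; t < numTerms; t++) {
            for (int d = 0; d < numDocs; d++) {
                int count = termDoc[t][d];
                if (count == 0) {
                    continue;
                }
                for (int i = 0; i < dim; i++) {
                    termVectors[t][i] += count * docIndex[d][i];
                }
            }
        }
        return termVectors;
    }

    /** Cosine similarity; returns 0 if either vector is all zeros. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0 || nb == 0) {
            return 0.0;
        }
        return dot / Math.sqrt(na * nb);
    }

    public static void main(String[] args) {
        // Rows are terms, columns are documents (made-up counts).
        int[][] termDoc = {
            {2, 1, 0, 0}, // "cat"
            {1, 2, 0, 0}, // "dog"
            {0, 0, 3, 1}, // "stock"
        };
        float[][] vectors = project(termDoc, 512, 8, 42L);
        System.out.printf("cat~dog   = %.3f%n", cosine(vectors[0], vectors[1]));
        System.out.printf("cat~stock = %.3f%n", cosine(vectors[0], vectors[2]));
    }
}
```

Terms that occur in the same documents ("cat" and "dog" here) end up with similar reduced vectors, while terms with no documents in common stay close to orthogonal. Unlike LSA, this approach needs no matrix factorization; the dimension and the number of nonzero entries per index vector are the main tuning parameters in this sketch.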
by otis, 2008-01-14 02:06
semantic · LSA · PLSA · NLP · information retrieval · java · api