YouSeer is an open source search engine framework built on top of other open source components. It uses Heritrix as its crawler and Solr as its indexing system, and it provides software to ingest the documents harvested by Heritrix into Solr. The ingestion software is flexible and supports user-specific data extraction implementations. YouSeer also provides a simple interface for querying the index and another for retrieving cached versions of the documents.
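
The ingestion step described above can be pictured with a minimal sketch. This is not YouSeer's actual code or API: the `Extractor` type and the functions `default_extractor`, `title_extractor`, and `to_solr_add_xml` are hypothetical names chosen for illustration. The only thing taken from Solr itself is the `<add><doc><field name="...">` XML update message format. The sketch assumes a crawled record is available as a URL plus its text body.

```python
from typing import Callable, Dict
from xml.sax.saxutils import escape, quoteattr

# Hypothetical extractor type: maps a crawled record (URL, body) to Solr field values.
Extractor = Callable[[str, str], Dict[str, str]]


def default_extractor(url: str, body: str) -> Dict[str, str]:
    """Illustrative default: use the URL as the document id and index the raw body."""
    return {"id": url, "text": body}


def title_extractor(url: str, body: str) -> Dict[str, str]:
    """Illustrative user-specific extractor: also store the first line as a title."""
    fields = default_extractor(url, body)
    lines = body.strip().splitlines()
    fields["title"] = lines[0].strip() if lines else ""
    return fields


def to_solr_add_xml(fields: Dict[str, str]) -> str:
    """Serialize one document as a Solr XML <add> update message."""
    parts = [
        f"<field name={quoteattr(name)}>{escape(value)}</field>"
        for name, value in fields.items()
    ]
    return "<add><doc>" + "".join(parts) + "</doc></add>"


def ingest(url: str, body: str, extractor: Extractor = default_extractor) -> str:
    """Run the chosen extractor on a crawled record and build the update message."""
    return to_solr_add_xml(extractor(url, body))
```

Swapping the `extractor` argument is how a user-specific extraction step would plug in under this sketch, for example `ingest(url, body, extractor=title_extractor)`. The resulting string would then be sent to the Solr update handler of whichever index is in use.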
by otis, 2009-11-19 13:18
crawl · index · search · Heritrix · nutch · information retrieval