Simpy
 
Search Everyone: "information retrieval"

Top "information retrieval" experts: otis, j_h_scheufen, cpaulse, glukac, mthomure, paulovn

Groups about "information retrieval": IR NLP ML and CL, Lucene & Solr

1 - 50 of 94
 
YouSeer is an open source search engine framework built on top of other open source components. YouSeer uses Heritrix as a crawler and Solr as an indexing system. The framework provides software to ingest the documents harvested by Heritrix into Solr. The ingestion software is flexible and allows for user-specific data extraction implementations. Further, YouSeer provides a simple interface to query the index and another interface to retrieve cached versions of the documents.
by otis 2009-11-19 13:18 crawl · index · search · Heritrix · nutch · information retrieval
http://youseer.sourceforge.net/
by otis 2009-11-18 17:36 NLP · information retrieval · semantic · gate · software
http://www.semanticsoftware.info/
MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). MuNPEx requires a part-of-speech (POS) tagger to work and can additionally use detected named entities (NEs) to improve chunking performance. Please read the documentation (or source code) for more details.
by otis 2009-11-18 17:30 NLP · information retrieval · key phrases · information extraction · computational linguistics · software · gate
http://www.semanticsoftware.info/munpex
by otis 2009-11-02 13:59 software · information retrieval · NLP · perl · corpus · text mining · dataset
http://www.drni.de/wac-tk/
C++ (but has Java API), GPL
by otis 2009-11-02 13:51 information retrieval · NLP · software · library · api
http://www.lsi.upc.edu/~nlp/freeling/
by otis 2009-11-02 13:33 NLP · information retrieval · computational linguistics · java · software · api · library
http://herd.ida.liu.se:8180/nlpfarm/
Platform where anyone can share and mash open data on any subject
by otis 2009-10-29 12:24 data · information retrieval · NLP · machine learning · mashup
http://www.factual.com/
Common English misspellings from Wikipedia: 4107 misspellings as of 2009-10-29.
by otis 2009-10-29 12:20 wikipedia · spell · english · language · search · information retrieval · NLP
http://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
Wunder's progressive reranking explanation
by otis 2009-10-22 12:41 search · information retrieval · rank · score
http://wunderwood.org/most_casual_observer/2007/04/progressive_reranking.html
Sen is the first open source morphological analyzer written in pure Java.
by otis 2009-10-16 23:39 japanese · morphology · analysis · lucene · search · index · information retrieval · NLP · library
https://sen.dev.java.net/
OPUS is an attempt to collect translated texts from the web, to convert and align the entire collection, to add linguistic annotation, and to provide the community with a publicly available parallel corpus. OPUS is based on open source products and is also delivered as an open source package.
by otis 2009-10-07 10:54 corpus · information retrieval · NLP
http://urd.let.rug.nl/tiedeman/OPUS/
Galago is a toolkit for experimenting with text search. It is based on small, pluggable components that are easy to replace and change, both during indexing and during retrieval. It includes TupleFlow, which is a distributed computation framework like MapReduce or Dryad. TupleFlow manages the difficult parts of processing text: serializing data, sorting it, and distributing processing. The IndexReader and IndexWriter classes manage storing key/value pairs like inverted lists. This makes it possible to make your own kinds of index structures without starting from scratch.
by otis 2009-08-12 16:01 java · software · search · library · information retrieval · distributed computing
http://www.galagosearch.org/
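The idea of storing inverted lists as key/value pairs can be sketched in a few lines. This is only an illustration of the data structure, not Galago's API: the function name `build_inverted_index` and the sample documents are hypothetical, and tokenization is a naive whitespace split.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term (key) to a sorted list of (doc_id, term_frequency) postings (value)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    # Sort postings by document ID so lists can later be merged efficiently.
    return {term: sorted(postings.items()) for term, postings in index.items()}


# Hypothetical sample documents keyed by document ID.
docs = {1: "text search toolkit", 2: "distributed text processing"}
index = build_inverted_index(docs)
```

Here `index["text"]` holds one posting per document that contains the term, which is the kind of key/value pair an index reader and writer would store.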
Ivory is a Hadoop toolkit for Web-scale information retrieval research that features a retrieval engine based on Markov Random Fields.
by otis 2009-08-12 15:56 hadoop · MapReduce · information retrieval · search
http://www.umiacs.umd.edu/~jimmylin/ivory/docs/index.html
by otis 2009-06-23 23:49 perl · wordnet · similar · software · information retrieval · NLP
http://wn-similarity.sourceforge.net/
Default dictionary break iterator for Chinese, Japanese, Korean
by otis 2009-06-03 00:15 CJK · japan · chinese · korean · computational linguistics · NLP · information retrieval · search · analysis · word segmentation
http://bugs.icu-project.org/trac/ticket/2229
by otis 2009-05-28 23:40 chinese · dictionary · information retrieval · search
http://www.mdbg.net/chindict/chindict.php?page=cc-cedict
by otis 2009-05-28 14:34 search · software · python · django · lucene · solr · information retrieval
http://haystacksearch.org/
by otis 2009-05-28 14:27 solr · ruby · ruby on rails · search · information retrieval
http://outoftime.github.com/sunspot/
by otis 2009-05-24 21:31 rdf · solr · software · java · search · information retrieval
http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/
by otis 2009-05-18 14:07 chinese · dictionary · english · word · word segmentation · NLP · information retrieval · computational linguistics
http://usa.mdbg.net/chindict/chindict.php?page=cc-cedict
by otis 2009-05-17 22:15 web · index · crawl · dataset · corpus · linguistics · computational linguistics · NLP · information retrieval
http://webascorpus.org/
WordnetAPI is a Java interface to the famous WordNet database of lexical relationships.
by otis 2009-05-15 10:07 wordnet · morphology · lexical · synonyms · NLP · information retrieval · library · api
http://code.google.com/p/wordnetapi/
WordNet visualization using Force-Directed Graphs
by otis 2009-05-15 10:01 wordnet · synonyms · visual · graph · NLP · information retrieval
http://code.google.com/p/synonym/
by otis 2009-05-14 17:13 django · solr · python · information retrieval · search
http://code.google.com/p/django-solr-search/
The Linguistic Data Consortium supports language-related education, research and technology development by creating and sharing linguistic resources: data, tools and standards.
by otis 2009-05-03 17:59 language · NLP · information retrieval · computational linguistics · model · data mining
http://www.ldc.upenn.edu/
A tool for the estimation, representation, and computation of statistical language models.
by otis 2009-05-03 17:54 NLP · information retrieval · language · computational linguistics · tool · software
http://sourceforge.net/projects/irstlm/
Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair. All you need is a collection of translated texts (parallel corpus).
by otis 2009-05-03 17:47 NLP · machine translation · information retrieval · language · software · tool
http://www.statmt.org/moses/
Packages to facilitate natural language processing under Ubuntu Linux and other Debian-based platforms. The goal of Ubuntu NLP is to provide up-to-date packages for commonly used tools that can be easily installed and smoothly integrated into existing systems.
by otis 2009-05-03 17:45 ubuntu · linux · NLP · tool · information retrieval
http://cl.naist.jp/~eric-n/ubuntu-nlp/
by otis 2009-03-08 00:36 lucene · search · query expansion · information retrieval
http://grasia.fdi.ucm.es/jose/query-expansion/
Word-aligned compression library for Java
by otis 2009-03-02 11:59 java · api · library · compress · information retrieval · encode
http://code.google.com/p/javaewah/
UIMA NLP Components
by otis 2009-02-27 12:59 java · UIMA · pipeline · NLP · information retrieval · software
http://www.julielab.de/Resources/Software/Tools.html
Lucas is a UIMA CAS consumer component which bridges the UIMA framework with the Lucene search engine library. Lucas maps CASes to Lucene index documents according to a mapping file.
by otis 2009-02-27 12:57 java · UIMA · lucene · index · search · pipeline · software · information retrieval
https://www.coling.uni-jena.de/sites/lucas/index.html
Train: crawl, parse, create clusters. Then: crawl, classify new pages into predefined classes/clusters.
by otis 2009-02-26 23:54 Heritrix · classification · cluster · crawl · vertical search · focused crawl · information retrieval · NLP
http://webteam.archive.org/confluence/display/SOC06/Crawl-by-example
by otis 2009-02-17 02:55 .net · solr · client · software · search · library · information retrieval
http://code.google.com/p/solrnet/
Set operation implementations for SortedIntegerSegments, for inverted list caching in search engines. The implementations also include a DocIdSet based on the P4Delta compression algorithm, for iterating over DocIdSets in compressed form.
by otis 2009-02-09 01:25 lucene · search · index · compress · information retrieval · set · java
http://code.google.com/p/lucene-ext/
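A minimal sketch of the two ideas mentioned in that description: a set operation over sorted document ID lists, and storing IDs as gaps. The function names are hypothetical and this is not the lucene-ext code. The gap encoding shown is plain delta encoding, which is only the first step of schemes such as P4Delta; the bit-packing stage is omitted.

```python
def intersect_sorted(a, b):
    """Intersect two ascending lists of doc IDs with a two-pointer merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def delta_encode(doc_ids):
    """Replace each ascending doc ID with its gap from the previous one."""
    gaps = []
    previous = 0
    for doc_id in doc_ids:
        gaps.append(doc_id - previous)
        previous = doc_id
    return gaps


def delta_decode(gaps):
    """Rebuild the original doc IDs by accumulating the gaps."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids
```

Gaps between sorted IDs are small numbers, which is why compressed posting lists usually store them instead of the raw IDs.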
by otis 2009-01-08 17:46 Daniel Tunkelang · information retrieval · facet · navigate · results · search · endeca · set · presentation
http://yahoo.hosted.panopto.com/CourseCast/Viewer/Default.aspx?id=6d0a6847-be51-4d29-8c1c-f961274b5343
by otis 2008-12-23 14:10 collocations · term · summary · NLP · information retrieval · search · keywords · key phrases
http://www.extractor.com/
WebLA is a Java package for handling Web Graphs, implementing popular algorithms such as PageRank, HITS, CoCitation Similarity and SimRank. It is of particular interest for research in Information Retrieval, since it provides a set of APIs (Application Programming Interfaces) that allow one to easily experiment with such algorithms.
by otis 2008-12-21 01:54 information retrieval · search · algorithm · pagerank · graph · api · library · java
http://webla.sourceforge.net/
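For readers unfamiliar with PageRank, here is a rough power-iteration sketch. It is written in Python for brevity and does not use or mirror WebLA's Java API. The `pagerank` function, its default damping factor of 0.85, and the toy graph are assumptions for illustration, and every link target is assumed to also appear as a key in the graph.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict of node -> list of out-links."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, out_links in graph.items():
            if out_links:
                # Split this node's rank evenly among the pages it links to.
                share = damping * rank[node] / len(out_links)
                for target in out_links:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In this toy graph, page "c" is linked from both "a" and "b", so it ends up ranked above "b", which is linked only from "a".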
by otis 2008-12-08 22:46 sentence detection · word segmentation · unicode · java · api · NLP · information retrieval · language
http://icu-project.org/userguide/boundaryAnalysis.html
by otis 2008-10-19 22:51 java · api · string · similar · metrics · computational linguistics · NLP · information retrieval · machine learning
http://www.dcs.shef.ac.uk/~sam/simmetrics.html
by otis 2008-09-29 11:03 information extraction · information retrieval · NLP · howto · term
http://chungwon.blogspot.com/2007/08/term-clustering-for-domain-ontology_02.html
by otis 2008-08-18 13:30 search · search engine · information retrieval · vector space · linear algebra
http://mathdl.maa.org/mathDL/4/?pa=content&sa=viewDocument&nodeId=636&pf=1
Interviews with "Search Wizards" - people from the world of IR, NLP...
by otis 2008-06-11 12:06 search · people · interview · information retrieval · NLP
http://www.arnoldit.com/search-wizards-speak/
by otis 2008-06-07 01:05 Doug Cutting · video · lecture · lucene · nutch · information retrieval
http://videolectures.net/iiia06_cutting_ense/
by otis 2008-06-05 13:14 search · information retrieval · facet · explore · discover · search results · Peter Morville
http://www.slideshare.net/morville/search-patterns/
by otis 2008-05-29 15:31 perl · module · library · api · NLP · information retrieval · ngram
http://ngram.sourceforge.net/
Search engine with a web crawler that can be trained to classify pages and crawl only "interesting" pages. Uses Lucene under the hood. Fully distributed and capable of large-scale crawling and searching.
by otis 2008-05-22 17:27 search · search engine · crawl · java · software · bayes · classification · index · plugin · lucene · information retrieval
http://hounder.org/
Dr. Porter's solution finds significant terms in a document with respect to the rest of the corpus. It can collect user profiles based on the documents they are viewing and can thus help with ad targeting, etc.
by otis 2008-04-25 17:36 Martin Porter · information retrieval · summary · document
http://www.grapeshot.co.uk/
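One common way to find terms that are significant in a document relative to the rest of a corpus is TF-IDF weighting. The sketch below uses that generic technique purely as an illustration; it is an assumption, not a description of how Grapeshot works. The function name, the smoothed IDF formula, and the sample corpus are all hypothetical.

```python
import math
from collections import Counter


def significant_terms(doc, corpus, top_n=3):
    """Rank a document's terms by TF-IDF against a list of corpus documents."""
    doc_terms = Counter(doc.lower().split())
    n_docs = len(corpus)
    scores = {}
    for term, tf in doc_terms.items():
        # Document frequency: how many corpus documents contain the term.
        df = sum(1 for other in corpus if term in other.lower().split())
        # Smoothed IDF: terms found in every document score zero.
        idf = math.log((1 + n_docs) / (1 + df))
        scores[term] = tf * idf
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


# Hypothetical three-document corpus.
corpus = ["the cat sat", "the dog barked", "the stemmer reduced the words"]
top = significant_terms(corpus[2], corpus, top_n=2)
```

A word such as "the" occurs in every document, so it receives a zero score and drops out even though it is the most frequent term in the third document.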