Search Everyone: "nlp"

Behemoth lets you deploy GATE or UIMA applications over a Hadoop cluster to do very large-scale document analysis. It uses a simple representation format that serves as common ground between UIMA- and GATE-generated annotations, making the two systems compatible. Because it is Hadoop-based, it benefits from Hadoop's features, namely scalability, fault tolerance and, most notably, the backing of a thriving open source community. Quite a few Apache projects will fit into it: Nutch, Tika, Mahout, HBase, etc.
by otis 2009-12-01 22:35 UIMA · gate · hadoop · text mining · text analysis · MapReduce · distributed computing · NLP
http://code.google.com/p/behemoth-pebble/
Speedi.ly takes a piece of content, or grabs the content from a URL, and analyzes it. It does this very fast and outputs some key data: the language of the content, its categories (topics, keywords), and additional metadata.
by otis 2009-11-23 12:06 classification · service · saas · NLP · named entity extraction
http://www.techcrunch.com/2009/11/20/getting-to-the-supertweet-speedi-ly-classifies-the-real-time-web/
by otis 2009-11-23 12:05 classification · service · saas · windows · NLP
http://uclassify.com/
by otis 2009-11-18 17:36 NLP · information retrieval · semantic · gate · software
http://www.semanticsoftware.info/
by otis 2009-11-18 17:33 morphix · linux · NLP · software
http://morphix-nlp.berlios.de/
MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). MuNPEx requires a part-of-speech (POS) tagger to work and can additionally use detected named entities (NEs) to improve chunking performance. Please read the documentation (or source code) for more details.
by otis 2009-11-18 17:30 NLP · information retrieval · key phrases · information extraction · computational linguistics · software · gate
http://www.semanticsoftware.info/munpex
by otis 2009-11-07 01:34 wikipedia · dump · extract · text · data mining · text mining · corporation · NLP
http://evanjones.ca/software/wikipedia2text.html
PROBABILITIES, STATISTICS AND DATA MODELING
by otis 2009-11-03 16:27 statistics · probability · math · matrix · tutorial · reference · ebook · pdf · NLP
http://www.aiaccess.net/English/Glossaries/Shop/bookstore.htm
by otis 2009-11-02 13:59 software · information retrieval · NLP · perl · corpus · text mining · dataset
http://www.drni.de/wac-tk/
C++ (but has Java API), GPL
by otis 2009-11-02 13:51 information retrieval · NLP · software · library · api
http://www.lsi.upc.edu/~nlp/freeling/
by otis 2009-11-02 13:33 NLP · information retrieval · computational linguistics · java · software · api · library
http://herd.ida.liu.se:8180/nlpfarm/
MaltParser is a system for data-driven dependency parsing, which can be used to induce a parsing model from treebank data and to parse new data using an induced model.
by otis 2009-10-29 16:20 machine learning · parse · computational linguistics · NLP · java · software · api · library
http://maltparser.org/
Platform where anyone can share and mash open data on any subject
by otis 2009-10-29 12:24 data · information retrieval · NLP · machine learning · mashup
http://www.factual.com/
Common English misspellings from Wikipedia: 4107 misspellings as of 2009-10-29
by otis 2009-10-29 12:20 wikipedia · spell · english · language · search · information retrieval · NLP
http://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
Sen is the first open source morphological analyzer written in pure Java.
by otis 2009-10-16 23:39 japanese · morphology · analysis · lucene · search · index · information retrieval · NLP · library
https://sen.dev.java.net/
by otis 2009-10-07 21:55 NLP · sentiment · reference
http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf
OPUS is an attempt to collect translated texts from the web, to convert and align the entire collection, to add linguistic annotation, and to provide the community with a publicly available parallel corpus. OPUS is based on open source products and is also delivered as an open source package.
by otis 2009-10-07 10:54 corpus · information retrieval · NLP
http://urd.let.rug.nl/tiedeman/OPUS/
Text2Onto is the official successor of TextToOnto, a framework for ontology learning from text.
by otis 2009-09-12 22:36 ontology · corpus · NLP · semantic
http://ontoware.org/projects/text2onto/
Zemberek is an open source, platform-independent, general-purpose Natural Language Processing library and toolset designed for Turkic languages, especially Turkish. Zemberek is officially used as the spell checker in the Turkish version of Open Office and in Pardus, the Turkish national Linux distribution. Google Code will host the Zemberek-2, Zemberek Corpus and Wordnet projects. These projects are released under the Mozilla Public License.
by otis 2009-07-24 09:41 turkish · language · analysis · search · tokenizer · stemming · NLP · library
http://code.google.com/p/zemberek/
Near duplicate detection algorithm for deduplication (deduping)
by otis 2009-07-14 10:30 duplicate detection · NLP
http://nlp.stanford.edu/IR-book/html/htmledition/near-duplicates-and-shingling-1.html
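For readers new to shingling, here is a minimal sketch of the idea behind the bookmark above. It assumes word-level k-shingles compared with a plain Jaccard coefficient; the function names, the default k=3 and the 0.9 threshold are illustrative choices, not taken from the linked chapter, and the sketch does not cover the hashing-based approximation used at web scale.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word k-grams) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Assumption: a text shorter than k words is treated as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(doc1, doc2, k=3, threshold=0.9):
    """Flag two documents as near duplicates when their shingle sets overlap enough."""
    return jaccard(shingles(doc1, k), shingles(doc2, k)) >= threshold
```

Comparing every pair of documents this way is quadratic in the collection size, so it is only practical for small collections.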
by otis 2009-06-23 23:49 perl · wordnet · similar · software · information retrieval · NLP
http://wn-similarity.sourceforge.net/
Default dictionary break iterator for Chinese, Japanese, Korean
by otis 2009-06-03 00:15 CJK · japan · chinese · korean · computational linguistics · NLP · information retrieval · search · analysis · word segmentation
http://bugs.icu-project.org/trac/ticket/2229
by otis 2009-05-18 14:07 chinese · dictionary · english · word · word segmentation · NLP · information retrieval · computational linguistics
http://usa.mdbg.net/chindict/chindict.php?page=cc-cedict
by otis 2009-05-17 22:15 web · index · crawl · dataset · corpus · linguistics · computational linguistics · NLP · information retrieval
http://webascorpus.org/
WordnetAPI is a Java interface to the famous WordNet database of lexical relationships.
by otis 2009-05-15 10:07 wordnet · morphology · lexical · synonyms · NLP · information retrieval · library · api
http://code.google.com/p/wordnetapi/
WordNet visualization using Force-Directed Graphs
by otis 2009-05-15 10:01 wordnet · synonyms · visual · graph · NLP · information retrieval
http://code.google.com/p/synonym/
by otis 2009-05-14 15:07 taxonomy · ontology · facet · NLP · search
http://www.ideaeng.com/tabId/98/itemId/199/Whats-the-difference-between-Taxonomies-and-Ontol.aspx
The Linguistic Data Consortium supports language-related education, research and technology development by creating and sharing linguistic resources: data, tools and standards.
by otis 2009-05-03 17:59 language · NLP · information retrieval · computational linguistics · model · data mining
http://www.ldc.upenn.edu/
A tool for the estimation, representation, and computation of statistical language models.
by otis 2009-05-03 17:54 NLP · information retrieval · language · computational linguistics · tool · software
http://sourceforge.net/projects/irstlm/
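To make "estimation" and "computation" of a statistical language model concrete, here is a toy sketch. It assumes a bigram model with add-one smoothing; the function names are hypothetical, and this is not the API of the tool bookmarked above, which targets far larger models.

```python
from collections import Counter


def train_bigram(sentences):
    """Estimate bigram and history counts from whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])             # counts of each history word
        bigrams.update(zip(tokens, tokens[1:]))  # counts of adjacent word pairs
    # Words that can be predicted: every observed word plus the end marker.
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return unigrams, bigrams, vocab


def bigram_prob(model, prev, word):
    """P(word | prev) with add-one (Laplace) smoothing."""
    unigrams, bigrams, vocab = model
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))
```

With the two training sentences "a b" and "a c", `bigram_prob(model, "a", "b")` is (1 + 1) / (2 + 4), that is 1/3.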
Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair. All you need is a collection of translated texts (parallel corpus).
by otis 2009-05-03 17:47 NLP · machine translation · information retrieval · language · software · tool
http://www.statmt.org/moses/
Packages to facilitate natural language processing under Ubuntu Linux and other Debian-based platforms. The goal of Ubuntu NLP is to provide up-to-date packages for commonly used tools that can be easily installed and smoothly integrated into existing systems.
by otis 2009-05-03 17:45 ubuntu · linux · NLP · tool · information retrieval
http://cl.naist.jp/~eric-n/ubuntu-nlp/
by otis 2009-03-20 22:06 wordnet · synonyms · NLP · database
http://www.globalwordnet.org/
WikiXMLDB provides a way of querying Wikipedia with XQuery.
by otis 2009-03-14 16:41 wikipedia · xml · xquery · search · knowledge · structure · NLP · data mining
http://wikixmldb.dyndns.org/
UIMA NLP Components
by otis 2009-02-27 12:59 java · UIMA · pipeline · NLP · information retrieval · software
http://www.julielab.de/Resources/Software/Tools.html
Train: crawl, parse, create clusters. Then: crawl, classify new pages into predefined classes/clusters.
by otis 2009-02-26 23:54 Heritrix · classification · cluster · crawl · vertical search · focused crawl · information retrieval · NLP
http://webteam.archive.org/confluence/display/SOC06/Crawl-by-example
ClearTK is a toolkit for developing statistical natural language processing components in Java and is based on the Apache UIMA framework for text analysis.
by otis 2009-01-28 16:28 java · api · UIMA · NLP · statistics
http://code.google.com/p/cleartk/
JLangDetect is a pure Java implementation of a language detector. It provides a toolkit for training language recognition, and a simple implementation of a detector.
by otis 2009-01-17 00:41 language · java · api · NLP · identification
http://www.jroller.com/melix/entry/nlp_in_java_a_language
by otis 2008-12-23 14:10 collocations · term · summary · NLP · information retrieval · search · keywords · key phrases
http://www.extractor.com/
by otis 2008-12-22 16:36 NLP · collocations · LingPipe
http://lingpipe-blog.com/2008/05/28/collocations-chi-squared-independence-and-n-gram-count-boundary-conditions/
"Practical Artificial Intelligence Programming in Java, third edition"
by otis 2008-12-21 10:37 book · download · ebook · pdf · AI · NLP · machine learning · java
http://markwatson.com/blog/2008/11/my-new-book-practical-artificial.html
by otis 2008-12-08 22:46 sentence detection · word segmentation · unicode · java · api · NLP · information retrieval · language
http://icu-project.org/userguide/boundaryAnalysis.html
by otis 2008-12-05 17:29 new york times · corpus · NLP
http://groups.google.com/group/nytnlp
by otis 2008-10-19 22:51 java · api · string · similar · metrics · computational linguistics · NLP · information retrieval · machine learning
http://www.dcs.shef.ac.uk/~sam/simmetrics.html
by otis 2008-09-29 11:03 information extraction · information retrieval · NLP · howto · term
http://chungwon.blogspot.com/2007/08/term-clustering-for-domain-ontology_02.html
by otis 2008-09-22 15:28 computational linguistics · NLP
http://www.d.umn.edu/~tpederse/
Interviews with "Search Wizards" - people from the world of IR, NLP...
by otis 2008-06-11 12:06 search · people · interview · information retrieval · NLP
http://www.arnoldit.com/search-wizards-speak/
by otis 2008-06-09 09:11 vietnam · word segmentation · language · NLP · java · api · library
http://jvnsegmenter.sourceforge.net/
Our goal is to mark the sets of arguments that cooccur with nouns in the PropBank Corpus (the Wall Street Journal Corpus of the Penn Treebank), just as PropBank records such information for verbs.
by otis 2008-06-07 15:42 NLP · corpus · annotate
http://nlp.cs.nyu.edu/meyers/NomBank.html
Subject-Verb-Object extraction, idea navigation
by otis 2008-06-06 12:53 search · explore · navigate · discover · collocations · NLP · POS · concept · video · lecture
http://videolectures.net/chi08_zelevinsky_ins/
by otis 2008-06-04 17:47 JaroWinkler · string · similar · distance · NLP
http://lingpipe-blog.com/2006/12/13/code-spelunking-jaro-winkler-string-comparison/
by otis 2008-05-31 00:46 data mining · data · text mining · text analysis · NLP
http://www.datasetgenerator.com/
by otis 2008-05-30 00:20 patricia tree · radix tree · prefix tree · trie · tree · data structure · string · search · NLP
http://code.google.com/p/radixtree/
by otis 2008-05-29 23:59 suffix tree · patricia tree · trie · tree · data structure · search · string · video · lecture · NLP
http://www.cs.umd.edu/class/fall2004/cmsc132/suffixTree.mov
by otis 2008-05-29 15:40 cluster · categorization · classification · tutorial · reference · algorithm · NLP
http://answers.google.com/answers/main?cmd=threadview&id=225316
by otis 2008-05-29 15:31 perl · module · library · api · NLP · information retrieval · ngram
http://ngram.sourceforge.net/
This is a collection of resources in a variety of fields related to text, speech and language processing. These include computational linguistics, information retrieval and machine learning. Here you can find pointers to useful Web sites, as well as lists of relevant books, newsgroups and mailing lists, and much more.
by otis 2008-02-17 16:05 NLP · information retrieval · data mining · information extraction · computational linguistics · resource
http://www.cs.technion.ac.il/~gabr/resources/resources.html
Tagged datasets for named entity recognition tasks
by otis 2008-02-17 16:04 NLP · machine learning · computational linguistics · information retrieval · information extraction · named entity extraction · resource
http://www.cs.technion.ac.il/~gabr/resources/data/ne_datasets.html
Enju is a syntactic parser for English. With a wide-coverage probabilistic HPSG grammar [1-6] and an efficient parsing algorithm [7-9], this parser can effectively analyze syntactic/semantic structures of English sentences and provide a user with phrase structures and predicate-argument structures. Those outputs would be especially useful for high-level NLP applications including information extraction, automatic summarization, and question answering, where the "meaning" of a sentence plays a central role.
by otis 2008-02-05 13:00 NLP · syntax · english · parse
http://www-tsujii.is.s.u-tokyo.ac.jp/enju/
Statistical NLP course lessons
by otis 2008-02-03 16:59 NLP · statistics · probability · information theory · hidden markov model · lecture · course
http://www.cs.rochester.edu/u/james/CSC248/
Aligned multilingual corpus JRC-ACQUIS. The dataset contains resources for the following languages: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish.
by otis 2008-01-27 00:45 corpus · language · NLP · multilingual
http://wt.jrc.it/lt/Acquis/
Semantic Vectors creates semantic vector indexes by applying a Random Projection algorithm to term-document matrices created using Apache Lucene. The package creates a WordSpace model, of the kind developed by Stanford University's Infomap Project and other researchers during the 1990s and early 2000s. Such models are designed to represent words and documents in terms of underlying concepts, and as such can be used for many semantic (concept-aware) matching tasks such as automatic thesaurus generation, knowledge representation, and concept matching. Random Projection is a form of automatic semantic analysis, similar to Latent Semantic Analysis (LSA) and its variants like Probabilistic Latent Semantic Analysis (PLSA).
by otis 2008-01-14 02:06 semantic · LSA · PLSA · NLP · information retrieval · java · api
http://code.google.com/p/semanticvectors/
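A rough illustration of random projection over a term-document matrix, for readers who want the idea in code. This sketch assumes dense random vectors of +1/-1 entries, one per document, and a term-document matrix held as a plain dict; all names are hypothetical and none of this is the Semantic Vectors API, which works from a Lucene index.

```python
import math
import random


def project_terms(term_doc, dim, seed=0):
    """Project each term's row of document counts into a dim-dimensional space.

    term_doc maps a term to a list of counts, one per document. Each document
    gets a fixed random +1/-1 vector; a term vector is the count-weighted sum
    of the vectors of the documents it occurs in.
    """
    n_docs = len(next(iter(term_doc.values())))
    rng = random.Random(seed)
    doc_vectors = [[rng.choice((-1.0, 1.0)) for _ in range(dim)]
                   for _ in range(n_docs)]
    return {
        term: [sum(count * doc_vectors[d][j] for d, count in enumerate(counts))
               for j in range(dim)]
        for term, counts in term_doc.items()
    }


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

Terms that occur in the same documents end up with similar projected vectors, so cosine similarity between term vectors can serve as a cheap relatedness score without the matrix factorization that LSA requires.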
by otis 2008-01-12 23:51 classification · algorithm · NLP · information retrieval
http://nlpers.blogspot.com/2007/09/bootstrapping.html
Java-based framework designed to support the development of applications for unsupervised machine learning tasks, with a particular focus on their application to text data
by otis 2008-01-12 11:35 java · api · cluster · library · NLP · information retrieval
http://mlg.ucd.ie/content/view/18/
Unsupervised learning of natural language structure
by otis 2007-12-21 15:19 linguistics · language · morphology · NLP
http://linguistica.uchicago.edu/
by otis 2007-12-04 16:07 java · language · toolkit · library · NLP · spell
http://www.languagetool.org/
by otis 2007-11-29 03:24 information retrieval · NLP · paper · howto · article
http://www.basistech.com/knowledge-center/
by otis 2007-11-12 20:49 Pavel Pecina · collocations · NLP · paper
http://ufal.mff.cuni.cz/~pecina/publications/acl-2005.pdf
by otis 2007-11-11 15:57 vietnam · language · diacritics · NLP · character set
http://vietunicode.sourceforge.net/charset/v3.htm
by otis 2007-11-01 23:30 video · presentation · search · similar · NLP
http://glinden.blogspot.com/2007/10/google-tech-talk-on-similarities.html
A search engine for entities: the important (and not so important) people, places, and things in the news. Our news analysis system automatically identifies and monitors these entities, and identifies meaningful relationships between them.
by otis 2007-10-26 11:22 search engine · news · monitor · NLP · named entity extraction
http://www.textmap.com/
The purpose of Senseval is to evaluate the strengths and weaknesses of word sense disambiguation programs with respect to different words, different varieties of language, and different languages.
by otis 2007-10-11 10:46 word sense disambiguation · NLP · evaluation
http://www.senseval.org/
The Clair library is a suite of open-source Perl modules intended to simplify a number of generic tasks in natural language processing (NLP), information retrieval (IR), and network analysis (NA). Its architecture also allows for external software to be plugged in with very little effort.
by otis 2007-09-24 15:47 perl · NLP · information retrieval · library
http://belobog.si.umich.edu/mediawiki/index.php/Main_Page
In this paper, we attempt to build query networks from web search engine query logs, with the nodes representing queries and the edges representing the semantic relatedness between queries. To build the network, users’ query histories are extracted from query logs and then segmented into query sessions. Semantic relatedness of queries is modeled using three different statistical measures: collocation, weighted dependence, and mutual information. We compare the constructed query networks with comparable random networks and conclude that query networks have small-world properties. In addition, we propose a method for identifying community structures, which are representative of semantic taxonomies, by applying Newman clustering to query networks. The experimental evaluation demonstrates the effectiveness of our proposed method against a baseline model.
by otis 2007-09-17 19:41 NLP · information retrieval · paper · search
http://www-personal.umich.edu/~ladamic/si708w07/projects/qla.pdf
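The abstract above describes building a query network from sessions; the following is only a sketch of one way that could look. It assumes sessions are already segmented, and it scores an edge by pointwise mutual information over session co-occurrence. The paper's exact definitions of its three measures are not reproduced here, and the function name and `min_pmi` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def query_network(sessions, min_pmi=0.0):
    """Build weighted edges between queries that co-occur in sessions.

    sessions is a list of query lists. Returns {(q1, q2): pmi} with q1 < q2,
    keeping only pairs whose pointwise mutual information exceeds min_pmi.
    """
    n = len(sessions)
    query_count, pair_count = Counter(), Counter()
    for session in sessions:
        queries = sorted(set(session))  # count each query once per session
        query_count.update(queries)
        pair_count.update(combinations(queries, 2))
    edges = {}
    for (q1, q2), together in pair_count.items():
        p_joint = together / n
        p_independent = (query_count[q1] / n) * (query_count[q2] / n)
        pmi = math.log2(p_joint / p_independent)
        if pmi > min_pmi:
            edges[(q1, q2)] = pmi
    return edges
```

The resulting edge dictionary can be handed to a graph library for the small-world and community analysis the paper describes.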
by otis 2007-09-10 19:27 NLP
http://nlp.uned.es/mavir/docs/seminarioMAVIR-LMarquez.pdf
by otis 2007-09-10 19:20 book · NLP
http://www.cs.colorado.edu/~martin/slp2.html
NLTK — the Natural Language Toolkit — is a suite of open source Python modules, data sets and tutorials supporting research and development in natural language processing.
by otis 2007-08-31 00:11 python · linguistics · NLP · library · toolkit
http://nltk.sourceforge.net/index.php/Main_Page
Rich NLP resource
by otis 2007-05-22 18:22 NLP · linguistics · resource · information retrieval
http://www-nlp.stanford.edu/links/statnlp.html
Grant Ingersoll's analysis of academic CS papers, focused on IR, NLP, ML, and such
by otis 2007-01-15 02:20 paper · information retrieval · NLP · machine learning · Grant Ingersoll
http://paperoftheweek.com/
by otis 2006-08-07 17:07 information retrieval · speech · language · NLP · book
http://www.cs.colorado.edu/~martin/SLP/Updates/newtoc.html
Java API with a collection of string matching, similarity, and distance measures
by otis 2005-08-02 16:40 string · matching · NLP · library · java · api · information retrieval · compare · distance · similarity
http://secondstring.sourceforge.net/
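As an example of the kind of distance measure such a library collects, here is a minimal sketch of Levenshtein edit distance and a similarity score derived from it. It is written in Python for brevity and is not the API of the Java library bookmarked above; the normalization by the longer string's length is one common convention, chosen here as an assumption.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Edit distance rescaled to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest
```

Keeping only two rows of the dynamic-programming table holds memory at O(len(b)) rather than O(len(a) * len(b)).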
NLP software from Stanford NLP group, written in Java, with GPL
by otis 2005-05-12 14:56 information retrieval · java · framework · gpl · NLP · api
http://www-nlp.stanford.edu/software/index.shtml
Java API for text categorization and other NLP stuff
by otis 2005-05-08 09:16 java · text · information · visual · NLP · api · pattern · category · information retrieval · machine learning · data mining
http://minorthird.sourceforge.net/
by otis 2004-03-29 23:11 stanford · NLP
http://nlp.stanford.edu/
by otis 2004-03-29 23:11 NLP · api · library · java
http://opennlp.sourceforge.net/