Simpy
 
Pedro Ângelo, member since Mar 11, 2006
Search Everyone: "text.processing"
1 - 10 of 43
 
The current UNIX® text processing tools are weakened by the built-in concept of a line. There is a simple notation that can describe the 'shape' of files when the typical array-of-lines picture is inadequate. That notation is regular expressions. Using regular expressions to describe the structure in addition to the contents of files has interesting applications, and yields elegant methods for dealing with some problems the current tools handle clumsily. When operations using these expressions are composed, the result is reminiscent of shell pipelines.
by jstone 2009-11-09 16:01 regex · text.processing · research · paper · reference
http://doc.cat-v.org/bell_labs/structural_regexps/
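To make the idea in the entry above concrete, here is a minimal sketch in Python of composing regular-expression stages over records that are not single lines. The stage names `x` (extract every match) and `g` (keep pieces containing a match) are loosely modeled on the commands described in the paper; the `pipeline` helper and the sample data are hypothetical, and none of this is the API of the sregex module or of any real tool.

```python
import re


def x(pattern):
    """Extract stage: yield every match of `pattern` inside each incoming piece."""
    rx = re.compile(pattern)

    def stage(pieces):
        for piece in pieces:
            for match in rx.finditer(piece):
                yield match.group(0)

    return stage


def g(pattern):
    """Guard stage: pass through only the pieces that contain a match."""
    rx = re.compile(pattern)

    def stage(pieces):
        return (piece for piece in pieces if rx.search(piece))

    return stage


def pipeline(text, *stages):
    """Hypothetical helper: feed the whole text through the stages in order."""
    pieces = [text]
    for stage in stages:
        pieces = stage(pieces)
    return list(pieces)


# Sample data (made up): records are blank-line-separated blocks, not lines.
SAMPLE = (
    "name: ann\nlang: python\n"
    "\n"
    "name: bob\nlang: ocaml\n"
    "\n"
    "name: cy\nlang: python\n"
)

result = pipeline(
    SAMPLE,
    x(r"(?:.+\n)+"),         # a record is a run of non-empty lines
    g(r"lang: python"),      # keep records that mention python
    x(r"(?<=name: )\w+"),    # inside each kept record, pull out the name
)
print(result)  # ['ann', 'cy']
```

Each stage narrows the text handed to the next one, which is the pipeline-like composition the abstract alludes to; the record shape is described by a regular expression instead of being fixed to one line.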
The sregex module implements Structural Regular Expressions in Python
by jstone 2009-11-09 15:59 python · programming · library · text.processing · regex · free.software · open.source
http://code.google.com/p/sregex/
Rudel is a collaborative editing environment for GNU Emacs. It supports multiple backends to enable communication with other collaborative editors using different protocols (most notably Gobby).
by jstone 2009-09-23 12:03 emacs · collaboration · text.processing · editing · extensions · tools · free.software · open.source
http://rudel.sourceforge.net/
Zebra is a high-performance, general-purpose structured text indexing and retrieval engine. It reads structured records in a variety of input formats (e.g. email, XML, MARC) and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.
by jstone 2009-07-09 22:37 text.processing · data.mining · search.engine · server · documentation · free.culture · open.source · bibliography
http://www.indexdata.com/zebra
Whoosh is a fast, featureful full-text indexing and searching library implemented in pure Python.
by jstone 2009-02-12 12:48 python · text.processing · library · programming · search · free · open.source
http://whoosh.ca/
Efuns is a small text editor, entirely written in Objective-Caml.
by jstone 2008-12-09 14:12 emacs · editing · text.processing · programming · tools · ocaml · free · open.source
http://pauillac.inria.fr/cdrom/prog/unix/efuns/eng.htm
Pyndexter (pronounced 'poindexter') is an abstraction layer for full-text indexing engines. It presents a uniform query syntax to the user, includes a basic but functional pure-Python indexer, and has adapters for Hype, Hyperestraier, Lucene, Lupy, Pyndex, Swish-e and Xapian.
by jstone 2008-09-14 19:59 search · programming · python · library · free · open.source · text.processing · metadata · abstraction
http://swapoff.org/wiki/pyndexter
Snowball is a small string processing language designed for creating stemming algorithms for use in Information Retrieval. This site describes Snowball, and presents several useful stemmers which have been implemented using it.
by jstone 2008-09-14 19:57 text.processing · parsing · algorithm · domain.specific.languages · repository · reference · programming · compiler · free · open.source
http://snowball.tartarus.org/index.php
GrassyKnoll is a document storage and search engine written in Python.
by jstone 2008-09-14 18:20 search · python · storage · server · library · programming · free · open.source · text.processing · REST
http://code.google.com/p/grassyknoll/
Character encoding auto-detection in Python. As smart as your browser. Open source.
by jstone 2008-09-11 19:32 python · encoding · text.processing · library · programming · internationalization · free · open.source
http://chardet.feedparser.org/