http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/hicss/1999/0001/02/00...
Cavnar and Trenkle (1994), "N-Gram-Based Text Categorization" - the popular paper behind TextCat and similar tools.
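The ranking algorithm (the "out-of-place" measure, which sums how far each n-gram's rank in the document profile sits from its rank in the category profile) is kind of screwy, until you think of it as edit distance in an alphabet where each n-gram is a distinct symbol. Maybe it's still screwy.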
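For reference, a rough Python sketch of that ranking step as I read the paper. The function names (`ngram_profile`, `out_of_place`, `classify`) are my own, not from the paper or from TextCat, and the token padding is simplified to one underscore on each side, so treat this as an illustration under those assumptions rather than a faithful reimplementation.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first.

    Simplification (assumption): each token is padded with a single "_" on
    both sides; the paper's padding scheme differs slightly.
    """
    counts = Counter()
    for token in text.lower().split():
        padded = f"_{token}_"
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank displacements between a document and a category profile.

    An n-gram missing from the category profile gets a fixed maximum
    penalty, taken here to be the length of the category profile.
    """
    cat_rank = {gram: rank for rank, gram in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(
        abs(rank - cat_rank[gram]) if gram in cat_rank else max_penalty
        for rank, gram in enumerate(doc_profile)
    )


def classify(text, category_profiles):
    """Pick the category whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(
        category_profiles,
        key=lambda name: out_of_place(doc, category_profiles[name]),
    )
```

Usage is just `classify(some_text, {"en": ngram_profile(english_sample), "de": ngram_profile(german_sample)})`, where the sample texts are whatever training material is on hand. Note that the distance only looks at rank positions, never at the raw counts, which is a large part of why it feels odd.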