Re: Use of "token" vs "lexeme" in text search documentation

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-docs(at)postgreSQL(dot)org, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Use of "token" vs "lexeme" in text search documentation
Date: 2007-10-16 16:04:12
Message-ID: Pine.LNX.4.64.0710162003020.25678@sn.sai.msu.ru
Lists: pgsql-docs

On Mon, 15 Oct 2007, Tom Lane wrote:

> The current documentation seems a bit inconsistent in its use of the
> terms "token" and "lexeme". The majority of the text seems to use
> "lexeme" exclusively, which is inconsistent with the fact that the
> term "token" is exposed by ts_token_type() and friends. But there
> are a few places that seem to use "lexeme" to mean something returned
> by a dictionary.
>
> I was considering trying to adopt these conventions:
>
> * What a parser returns is a "token".
>
> * When a dictionary recognizes a token, what it returns is a "lexeme".
>
> This would make the phrase "normalized lexeme" redundant, since we
> don't call it a lexeme at all unless it's been normalized.
>
> Comments?

Hmm, that is what I have always thought. I'd be happy if you stressed this
in the docs.
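
To illustrate the proposed convention, here is a minimal sketch. It assumes
the built-in 'default' parser and the 'english_stem' dictionary are
available; the sample strings are chosen only for illustration.

```sql
-- The parser returns tokens: one row per token, with its type id and text.
SELECT * FROM ts_parse('default', 'The stars');

-- A dictionary that recognizes a token returns a lexeme (already normalized).
SELECT ts_lexize('english_stem', 'stars');   -- {star}

-- A stop word is recognized too, but yields an empty array: no lexeme.
SELECT ts_lexize('english_stem', 'a');       -- {}
```

Under this reading, "token" applies to the ts_parse() output and "lexeme"
only to what ts_lexize() hands back.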

>
> regards, tom lane
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
