Use of "token" vs "lexeme" in text search documentation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-docs(at)postgreSQL(dot)org
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Use of "token" vs "lexeme" in text search documentation
Date: 2007-10-15 19:22:12
Message-ID: 20167.1192476132@sss.pgh.pa.us
Lists: pgsql-docs

The current documentation seems a bit inconsistent in its use of the
terms "token" and "lexeme". The majority of the text seems to use
"lexeme" exclusively, which is inconsistent with the fact that the
term "token" is exposed by ts_token_type() and friends. But there
are a few places that seem to use "lexeme" to mean something returned
by a dictionary.

I was considering trying to adopt these conventions:

* What a parser returns is a "token".

* When a dictionary recognizes a token, what it returns is a "lexeme".

This would make the phrase "normalized lexeme" redundant, since we
don't call it a lexeme at all unless it's been normalized.
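Under these conventions, the distinction is directly visible in ts_debug(): its "token" column holds what the parser emitted, and its "lexemes" column holds what the dictionary returned after normalization. A sketch against the 8.3-era integrated text search (column names as documented for ts_debug; requires a running server, so output here is illustrative only):

```sql
-- Parser output ("token") vs. dictionary output ("lexemes"):
-- the parser emits "Foxes"; the english stemmer dictionary
-- normalizes it to the lexeme "fox". Stop words such as "The"
-- produce a token but an empty lexeme list.
SELECT alias, token, lexemes
FROM ts_debug('english', 'The Quick Foxes');

-- Only lexemes (never raw tokens) end up in a tsvector:
SELECT to_tsvector('english', 'The Quick Foxes');
```

Phrased this way, ts_token_type() describes parser output and to_tsvector() stores dictionary output, which matches the proposed split.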

Comments?

regards, tom lane
