Re: snowball ASCII stemmer configuration

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snowball ASCII stemmer configuration
Date: 2020-06-19 13:44:20
Message-ID: 1646699.1592574260@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> Do we *have* to have an ASCII stemmer that corresponds to an actual
> language? Couldn't we use the simple stemmer or no stemmer at all?
> In my experience, ASCII text in, say, Russian or Greek will typically be
> acronyms or brand names or the like, and there doesn't seem to be a
> great need to stem that kind of thing. Just doing nothing seems at
> least as good.

Well, I have no horse in this race. But the reason it's like this for
Russian is that Oleg, Teodor, and crew set it up that way ages ago.
I'd tend to defer to their opinion about what's the most usable
configuration for Russian. You could certainly argue that the situation
is different for $other-language ... but without some hard evidence for
that position, making these cases all behave similarly seems like a
reasonable approach.

regards, tom lane
