Re: patch: tsearch - some memory diet

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: tsearch - some memory diet
Date: 2010-10-01 18:34:46
Message-ID: AANLkTiktajKxrE9s7_VZppECX47hby3Hx-cbTgdK3W9i@mail.gmail.com
Lists: pgsql-hackers

Hello

2010/10/1 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Sep 27, 2010 at 11:30 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Sep 7, 2010 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> In the particular case here, the dictionary structures could probably
>>> safely use such a context type, but I'm not sure it's worth bothering
>>> if the long-term plan is to implement a precompiler.  There would be
>>> no need for this after the precompiled representation is installed,
>>> because that'd just be one big hunk of memory anyway.
>>
>> Rather than inventing something more complex, I'm inclined to say we
>> should just go ahead and apply this more or less as Pavel wrote it.  I
>> haven't tried to reproduce Pavel's results, but I assume that they are
>> accurate and that's a pretty big savings for a pretty trivial amount
>> of code.  If it gets thrown away later when/if someone codes up a
>> precompiler, no harm done.
>
> I tried to reproduce Pavel's results this afternoon and failed.  I
> read the documentation:
>
> http://developer.postgresql.org/pgdocs/postgres/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY
>
> ...and I followed the link to ispell.  And I installed it from
> MacPorts.  And then I built it by hand, too.  And I'm still confused.
> Because I don't see anything in either set of results that looks like
> the right set of files to use with CREATE TEXT SEARCH DICTIONARY.
> What am I doing wrong?
>

Download http://www.pgsql.cz/data/czech.tar.gz and unpack it into
pgsql/share/tsearch_data.

Then, as a superuser, run:

CREATE TEXT SEARCH DICTIONARY cspell
  (template = ispell, dictfile = czech, afffile = czech, stopwords = czech);
CREATE TEXT SEARCH CONFIGURATION cs (copy = english);
ALTER TEXT SEARCH CONFIGURATION cs
  ALTER MAPPING FOR word, asciiword WITH cspell, simple;

Then test the configuration:

postgres=# select * from ts_debug('cs', 'Příliš žluťoučký kůň se napil žluté vody');
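
As an optional extra check (a sketch only, not one of the steps above), the
dictionary can also be queried directly with ts_lexize. This assumes the
dictionary was created under the name cspell as shown; the sample word is
taken from the test sentence.

```sql
-- Assumes the cspell dictionary from the CREATE TEXT SEARCH DICTIONARY
-- statement above already exists in the current database.
-- ts_lexize returns the lexemes the dictionary produces for a single token,
-- or NULL if the dictionary does not recognize the word.
SELECT ts_lexize('cspell', 'žluté');
```

If this raises an error about missing dictionary or affix files, the archive
was probably not unpacked into the tsearch_data directory that the server
actually reads.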

It may also help to read
http://www.april-child.com/blog/2007/06/25/tsearch2-utf8-czech-czech-utf-8-support-for-tsearch2-postgresql-82/

Regards

Pavel Stehule

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
