Re: patch: preload dictionary new version

From:	Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: patch: preload dictionary new version
Date:	2010-07-09 06:44:35
Message-ID:	AANLkTimgw4N_rNFpJANboidG9O5oCdkzGqdKHO0O2jCG@mail.gmail.com
Lists:	pgsql-hackers

2010/7/8 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2010/7/8 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> A precompiler can give you all the same memory management benefits.
>
>> I use mmap(), and with mmap a precompiler is not necessary.
>> The dictionary is loaded only once, in the original ispell format. I
>> think it is much simpler for administration - just copy the ispell
>> files. There are no possible problems with binary
>> incompatibility, you don't need to solve serialisation and
>> deserialisation, and you don't need to copy the TSearch ispell parser
>> code to a client application - we would probably still want to support
>> uncompiled ispell dictionaries. Using a precompiler raises new
>> questions for upgrades!
>
> You're inventing a bunch of straw men to attack. There's no reason that
> a precompiler approach would have to put any new requirements on the
> user. For example, the dictionary-load code could automatically execute
> the precompile step if it observed that the precompiled copy of the
> dictionary was missing or had an older file timestamp than the source.
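
The timestamp check described above could look roughly like the following. This is only a sketch under assumptions: it uses POSIX stat(), the names needs_precompile and check_precompiled are hypothetical, and nothing here is taken from the patch or from PostgreSQL's own file-handling code.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Pure decision rule: rebuild when the compiled copy is missing or older. */
static bool
needs_precompile(bool compiled_exists, time_t source_mtime, time_t compiled_mtime)
{
    return !compiled_exists || compiled_mtime < source_mtime;
}

/*
 * Hypothetical helper (not PostgreSQL code).
 * Returns 1 if the precompile step should run, 0 if the existing compiled
 * file can be used, and -1 if the source dictionary cannot be stat'ed.
 */
static int
check_precompiled(const char *source_path, const char *compiled_path)
{
    struct stat src;
    struct stat bin;
    bool        have_bin;

    assert(source_path != NULL && compiled_path != NULL);

    if (stat(source_path, &src) != 0)
        return -1;              /* nothing to compile from */

    have_bin = (stat(compiled_path, &bin) == 0);
    return needs_precompile(have_bin, src.st_mtime,
                            have_bin ? bin.st_mtime : 0) ? 1 : 0;
}
```

The sketch deliberately ignores what happens when several backends notice a stale file at the same time, and how a half-written compiled file is kept from being read. Handling those cases is extra work that this check alone does not cover.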

Uff - just activating the precompiler safely needs a lot of low-level
code. Maybe I see it wrong, since I don't work directly with files
inside pg, but I can't see it as a simple solution.

>
> I like the idea of a precompiler step mainly because it still gives you
> most of the benefits of the patch on platforms without mmap. (Instead
> of mmap'ing, just open and read() the precompiled file.) In particular,
> you would still have a creditable improvement for Windows users without
> writing any Windows-specific code.
>
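
For illustration, the "open and read()" fallback mentioned above might be sketched as follows. It assumes POSIX open(), fstat() and read(); load_whole_file is a hypothetical name, and the format of the precompiled file is left open because the thread does not define one.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical loader: reads an entire file into a malloc'ed buffer.
 * On success returns the buffer (caller frees it) and stores its size
 * in *len. Returns NULL on any error, including an empty file.
 */
static void *
load_whole_file(const char *path, size_t *len)
{
    struct stat st;
    char       *buf;
    size_t      done = 0;
    int         fd;

    assert(path != NULL && len != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return NULL;
    }

    buf = malloc((size_t) st.st_size);
    if (buf == NULL)
    {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop until done. */
    while (done < (size_t) st.st_size)
    {
        ssize_t n = read(fd, buf + done, (size_t) st.st_size - done);

        if (n <= 0)             /* error or unexpected end of file */
        {
            free(buf);
            close(fd);
            return NULL;
        }
        done += (size_t) n;
    }

    close(fd);
    *len = done;
    return buf;
}
```

Unlike mmap(), this copies the whole file into private memory in every process that loads it, so it addresses load time but not sharing.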

Loading about 10 MB takes about 30 ms on my computer - that is better
than 90 ms, but it isn't a win.

>> I think we can divide this problem into three parts
>
>> a) simple allocator - it can be used not only for TSearch dictionaries.
>
> I think that's a waste of time, frankly. There aren't enough potential
> use cases.
>
>> b) sharing data - it is important for large dictionaries
>
> Useful but not really essential.
>
>> c) preloading - it decreases the load time of the first TSearch query
>
> This is the part that is the make-or-break benefit of the patch.
> You need a solution that cuts load time even when mmap isn't
> available.
>

I am not sure whether such a solution exists, or whether it is
necessary. The main problem is probably with the Czech language - we
have a few specialities. In the Czech environment, UNIX and Windows are
the most important platforms. I have no information about Postgres and
fulltext being used on other platforms here. So the solution probably
doesn't need to be in core. I am now thinking about a pgfoundry
project - something like an ispell dictionary preload.

I can send just a simplified version without preloading and sharing,
solving only the memory issue - I think there are no differing opinions
on that.
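
The message does not describe the allocator itself. One common form of a "simple allocator" is a bump (arena) allocator, which hands out pieces of one large block and frees them all at once. The sketch below shows that general technique only. The Arena type, the function names and the 8-byte alignment are assumptions for illustration, not the patch's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define ARENA_ALIGN 8           /* assumed alignment, must be a power of two */

typedef struct Arena
{
    char   *base;               /* start of the single backing block */
    size_t  used;               /* bytes handed out so far */
    size_t  size;               /* total bytes in the block */
} Arena;

/* Allocates the backing block. Returns 0 on success, -1 on failure. */
static int
arena_init(Arena *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);
    a->used = 0;
    a->size = (a->base != NULL) ? size : 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Returns n bytes from the arena, or NULL when the arena is full. */
static void *
arena_alloc(Arena *a, size_t n)
{
    /* Round the current offset up to the next aligned position. */
    size_t start = (a->used + ARENA_ALIGN - 1) & ~(size_t) (ARENA_ALIGN - 1);

    if (start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Releases every allocation at once; there is no per-chunk free. */
static void
arena_free(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->used = 0;
    a->size = 0;
}
```

Because chunks carry no individual header and are never freed one by one, many small allocations cost little beyond their own size, which is the kind of saving a dictionary built from many small entries could benefit from.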

best regards

Pavel Stehule

> regards, tom lane
>

In response to

Re: patch: preload dictionary new version at 2010-07-08 14:18:40 from Tom Lane
