Re: patch: preload dictionary new version

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch: preload dictionary new version
Date: 2010-07-08 14:18:40
Message-ID: 24354.1278598720@sss.pgh.pa.us
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2010/7/8 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> A precompiler can give you all the same memory management benefits.

> I use mmap(), and with mmap a precompiler is not necessary. The
> dictionary is loaded only once, in the original ispell format. I
> think that is much simpler to administer: just copy the ispell
> files. There are no possible problems with binary
> incompatibility, you don't need to solve serialisation and
> deserialisation, and you don't need to copy the TSearch ispell parser
> code to a client application. We would probably still want to support
> uncompiled ispell dictionaries anyway. Using a precompiler raises new
> questions for upgrades!

You're inventing a bunch of straw men to attack. There's no reason that
a precompiler approach would have to put any new requirements on the
user. For example, the dictionary-load code could automatically execute
the precompile step if it observed that the precompiled copy of the
dictionary was missing or had an older file timestamp than the source.

I like the idea of a precompiler step mainly because it still gives you
most of the benefits of the patch on platforms without mmap. (Instead
of mmap'ing, just open and read() the precompiled file.) In particular,
you would still have a creditable improvement for Windows users without
writing any Windows-specific code.
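
As a rough sketch of the non-mmap path, the precompiled file can simply be
read into one allocated buffer. The function name load_compiled_dict is
hypothetical, the sketch assumes POSIX-style open()/read(), and it says
nothing about the on-disk format, which the thread does not define.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): load a whole precompiled
 * dictionary file into a malloc'd buffer using plain open() and read().
 *
 * On success returns the buffer and stores its length in *len_out; the
 * caller frees it. Returns NULL on any error or if the file is empty.
 */
static void *
load_compiled_dict(const char *path, size_t *len_out)
{
    struct stat st;
    char       *buf;
    size_t      len;
    size_t      done = 0;
    int         fd;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return NULL;
    }

    len = (size_t) st.st_size;
    buf = malloc(len);
    if (buf == NULL)
    {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop until done. */
    while (done < len)
    {
        ssize_t n = read(fd, buf + done, len - done);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0)
        {
            /* read error or unexpected end of file */
            free(buf);
            close(fd);
            return NULL;
        }
        done += (size_t) n;
    }

    close(fd);
    *len_out = len;
    return buf;
}
```

Where mmap is available, the same file could be mapped instead; either
way the load skips parsing the ispell source on every backend start.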

> I think we can divide this problem into three parts:

> a) simple allocator - it can be used not only for TSearch dictionaries.

I think that's a waste of time, frankly. There aren't enough potential
use cases.

> b) sharing data - it is important for large dictionaries

Useful but not really essential.

> c) preloading - it decreases the load time of the first TSearch query

This is the part that is the make-or-break benefit of the patch.
You need a solution that cuts load time even when mmap isn't
available.

regards, tom lane
