Re: [PROPOSAL] Shared Ispell dictionaries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2018-03-24 20:56:36
Message-ID: 28250.1521924996@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> writes:
> [ v10 patch versions ]

I took a quick look through this. I agree with the comments about
mmap-ability not being something we should insist on now, and maybe
not ever. However, in order to keep our options open, it seems like
we should minimize the amount of API we expose that's based on the
current implementation. That leads me to the following thoughts:

* I cannot imagine a use-case for setting max_shared_dictionaries_size
to anything except "unlimited". If it's not that, and you exceed it,
then subsequent backends load private copies of the dictionary, making
your memory situation rapidly worse not better. I think we should lose
that GUC altogether and just load dictionaries automatically.

* Similarly, I see no point in a "sharable" option on individual
dictionaries, especially when there's only one allowed setting for
most dictionary types. Let's lose that too.

* And that leads us to not particularly need a view telling which
dictionaries are loaded, either. It's just an implementation detail
that users don't need to worry about.

This does beg the question of whether we need a way to flush dictionary
contents that's short of restarting the server (or short of dropping and
recreating the dictionary). I'm not sure, but even if we do, none of
the above is necessary for that.

I do think it's required that changing the dictionary's options with
ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
happening with this patch, I don't see where. (It might work to use
the combination of dictionary OID and TID of the dictionary's pg_ts_dict
tuple as the lookup key for shared dictionaries. Oh, and have you
thought about the possibility of conflicting OIDs in different DBs?
Probably the database OID has to be part of the key, as well.)

Also, the scheme for releasing the dictionary DSM during
RemoveTSDictionaryById is uncertain and full of race conditions:
the DROP might roll back later, or someone might come along and
start using the dictionary (causing a fresh DSM load) before the
DROP commits and makes the dictionary invisible to other sessions.
I don't think that either of those are necessarily fatal objections,
but there needs to be some commentary there explaining what happens.

BTW, I was going to complain that this patch alters the API for
dictionary template init functions without any documentation updates;
but then I realized that there isn't any documentation to update.
That pretty well sucks, but I suppose it's not the job of this patch
to improve that situation. Still, you could spend a bit more effort on
the commentary in ts_public.h in 0002, because that commentary is as
close to an API spec as we've got.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-03-24 21:25:02 Re: pgsql: Avoid premature free of pass-by-reference CALL arguments.
Previous Message Andres Freund 2018-03-24 20:36:55 Re: Undesirable entries in typedefs list