Re: [PROPOSAL] Shared Ispell dictionaries

From: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2018-03-25 09:53:41
Message-ID: 20180325095340.GA1370@arthur.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
> Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> writes:
> > [ v10 patch versions ]

Thank you for the review, Tom!

Tomas Vondra wrote:
> Tom Lane wrote:
>> * I cannot imagine a use-case for setting max_shared_dictionaries_size
>> to anything except "unlimited". If it's not that, and you exceed it,
>> then subsequent backends load private copies of the dictionary, making
>> your memory situation rapidly worse not better. I think we should lose
>> that GUC altogether and just load dictionaries automatically.
>
> Introduction of that limit is likely my fault. It came from from an
> extension I wrote a long time ago, but back then it was a necessity
> because we did not have DSM. So in retrospect I agree with you - it's
> not particularly useful and we should ditch it.
>
> Arthur, let this be a lesson for you! You have to start fight against
> bogus feature requests from other people ;-)

Yeah, in this sense max_shared_dictionaries_size is pointless. I'll
remove it then :).

> * Similarly, I see no point in a "sharable" option on individual
> dictionaries, especially when there's only one allowed setting for
> most dictionary types. Let's lose that too.

I think "Shareable" option could be useful if a shared dictionary
building time was much longer than a non-shared dictionary building
time. It is slightly longer because of additional memcpy(), but isn't
noticable I think. So it is worth to remove it.

> * And that leads us to not particularly need a view telling which
> dictionaries are loaded, either. It's just an implementation detail
> that users don't need to worry about.

If all dictionaries will be shareable then this view could be removed.
Unfortunately I think it can't help with leaked segments, I didn't find
a way to iterate dshash entries. That's why pg_ts_shared_dictionaries()
scans pg_ts_dict table instead of scanning dshash table.

> I do think it's required that changing the dictionary's options with
> ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
> happening with this patch, I don't see where. (It might work to use
> the combination of dictionary OID and TID of the dictionary's pg_ts_dict
> tuple as the lookup key for shared dictionaries. Oh, and have you
> thought about the possibility of conflicting OIDs in different DBs?
> Probably the database OID has to be part of the key, as well.)

Yes unfortunately ALTER TEXT SEARCH DICTIONARY doesn't reload a
dictionary. TID can help here. I thought about using XID too when I
started to work on RELOAD command. But I'm not sure that it is a good
idea, anyway XID isn't needed in current version.

> Also, the scheme for releasing the dictionary DSM during
> RemoveTSDictionaryById is uncertain and full of race conditions:
> the DROP might roll back later, or someone might come along and
> start using the dictionary (causing a fresh DSM load) before the
> DROP commits and makes the dictionary invisible to other sessions.
> I don't think that either of those are necessarily fatal objections,
> but there needs to be some commentary there explaining what happens.

I missed this case. As you wrote below ROLLBACK case is ok. But I
haven't a soluton for the second case for now. If I won't solve it I'll
add additional comments in RemoveTSConfigurationById() and maybe in the
documentation if it's appropriate.

> BTW, I was going to complain that this patch alters the API for
> dictionary template init functions without any documentation updates;
> but then I realized that there isn't any documentation to update.
> That pretty well sucks, but I suppose it's not the job of this patch
> to improve that situation. Still, you could spend a bit more effort on
> the commentary in ts_public.h in 0002, because that commentary is as
> close to an API spec as we've got.

I'll fix the comments.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Zakirov 2018-03-25 10:02:08 Re: [PROPOSAL] Shared Ispell dictionaries
Previous Message Michael Paquier 2018-03-25 08:06:13 Re: Using base backup exclusion filters to reduce data transferred with pg_rewind