Re: [PROPOSAL] Shared Ispell dictionaries

From: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2018-03-28 13:51:45
Message-ID: 20180328135143.GA13322@zakirov.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Please find the attached new version of the patch.

I got rid of 0005 and 0006 parts. There are no
max_shared_dictionaries_size variable, Shareable option,
pg_ts_shared_dictionaries view anymore.

On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
> I do think it's required that changing the dictionary's options with
> ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
> happening with this patch, I don't see where. (It might work to use
> the combination of dictionary OID and TID of the dictionary's pg_ts_dict
> tuple as the lookup key for shared dictionaries. Oh, and have you
> thought about the possibility of conflicting OIDs in different DBs?
> Probably the database OID has to be part of the key, as well.)

The database OID, the dictionary OID, TID and XMIN are used now as
lookup key.

> Also, the scheme for releasing the dictionary DSM during
> RemoveTSDictionaryById is uncertain and full of race conditions:
> the DROP might roll back later, or someone might come along and
> start using the dictionary (causing a fresh DSM load) before the
> DROP commits and makes the dictionary invisible to other sessions.
> I don't think that either of those are necessarily fatal objections,
> but there needs to be some commentary there explaining what happens.

The dictionary's DSM segment is alive till postmaster terminates now.
But when the dictionary is dropped or altered then the previous
(invalid) segment is unpinned. The segment itself is released when all
backends unpins mapping in lookup_ts_parser_cache() or by disconnecting.

The problem here comes when the dictionary was used before dropping or
altering by some process, isn't used after and the process lives a very
long time. In this situation the mapping isn't unpinned and the segment
isn't released. The other problem is that TsearchDictEntry isn't removed
if ts_dict_shmem_release() wasn't called. It may happen after dropping
the dictionary.

> BTW, I was going to complain that this patch alters the API for
> dictionary template init functions without any documentation updates;
> but then I realized that there isn't any documentation to update.
> That pretty well sucks, but I suppose it's not the job of this patch
> to improve that situation. Still, you could spend a bit more effort on
> the commentary in ts_public.h in 0002, because that commentary is as
> close to an API spec as we've got.

I improved a little bit the commentary in ts_public.h.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
0001-Fix-ispell-memory-handling-v11.patch text/plain 1.2 KB
0002-Change-tmplinit-argument-v11.patch text/plain 11.6 KB
0003-Retreive-shared-location-for-dict-v11.patch text/plain 14.4 KB
0004-Store-ispell-in-shared-location-v11.patch text/plain 89.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nicolas Thauvin 2018-03-28 13:54:11 segfault due to invalid cached plan
Previous Message Ashutosh Bapat 2018-03-28 13:51:06 Re: [HACKERS] Partition-wise aggregation/grouping