Re: [PROPOSAL] Shared Ispell dictionaries

From: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2018-03-31 21:31:16
Message-ID: CAKNkYnzNJ1NoWaEUcCPDFjfKar1VPeNwxaZ0WyCvxoY+dhWEkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra wrote:
>
> On 03/31/2018 12:42 PM, Arthur Zakirov wrote:
> > Hello all,
> >
> > I'd like to add new optional function to text search template named fini
> > in addition to init() and lexize(). It will be called by
> > RemoveTSDictionaryById() and AlterTSDictionary(). dispell_fini() will
> > call ts_dict_shmem_release().
> >
> > It doesn't change segments leaking situation. I think it makes text
> > search API more transparent.
> >
>
> If it doesn't actually solve the problem, why add it? I don't see a
> point in adding functions for the sake of transparency, when it does not
> in fact serve any use cases.

It doesn't solve the problem. But it brings more clearness, if a dictionary
requested shared location then it should release/unpin it. There are no
such scenario yet, but someone might want to release not only shared
segment but also other private data.

Can't we handle the segment-leaking by adding some sort of tombstone?

It is interesting that there are such tombstones already, without the
patch. TSDictionaryCacheEntry entries aren't deleted after DROP, they are
just marked isvalid = false.

> For example, imagine that instead of removing the hash table entry we
> mark it as 'dropped'. And after that, after the lookup we would know the
> dictionary was removed, and the backends would load the dictionary into
> their private memory.
>
> Of course, this could mean we end up having many tombstones in the hash
> table. But those tombstones would be tiny, making it less painful than
> potentially leaking much more memory for the dictionaries.

Now actually Isn't guaranteed that the hash table entry will be removed.
Even if refcnt is 0. So I think I should remove refcnt and entries won't be
removed.

There are no big problems with leaking now. Memory may leak only if a
dictionary was dropped or altered and there is no text search workload
anymore and the backend still alive. Because next using of text search
functions will unpin segments used before for invalid dictionaries (isvalid
== false). Also the segment is unpinned if the backend terminates. The
segment is destroyed when all interested processes unpin the segment (as
Tom noticed), the hash table entry becomes tombstone.

I hope I described clear.

> Also, I wonder if we might actually remove the dictionaries after a
> while, e.g. based on XID. Imagine that we note the XID of the
> transaction removing the dictionary, or perhaps XID of the most recent
> running transaction. Then we could use this to decide if all running
> transactions actually see the DROP, and we could remove the tombstone.

Maybe autovacuum should work here too :) It is joke of course. I'm not very
aware of removing dead tuples, but I think here is similar case.

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-03-31 21:56:06 Re: [HACKERS] A design for amcheck heapam verification
Previous Message Tomas Vondra 2018-03-31 21:06:54 Re: [HACKERS] [PATCH] Incremental sort