Re: [PROPOSAL] Shared Ispell dictionaries

From: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2019-02-20 14:33:44
Message-ID: 26e59c3b-3598-cc2c-6b8d-81d24d6d0930@postgrespro.ru
Lists: pgsql-hackers

Hello,

I've created a new commitfest entry, since the previous one was
closed with the status "Returned with feedback":

https://commitfest.postgresql.org/22/2007/

I've attached a new version of the patch. The only changes are in
0003-Retrieve-shared-location-for-dict-v18.patch.

I added a reference counter to the dictionary entries in the shared hash
table. It is needed to avoid memory bloat: when there are a lot of ALTER
and DROP TEXT SEARCH DICTIONARY commands, the stale shared hash table
entries have to be deleted.

The previous version of the patch released unused DSM segments but left
the shared hash table entries untouched.
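
To illustrate the idea, here is a minimal single-process sketch in plain
C. It is not code from the patch: the names (DictEntry, dict_attach,
dict_release, dict_count) are hypothetical, a fixed-size array stands in
for the shared hash table, malloc() stands in for a DSM segment, and
there is no locking. It only shows that dropping the last reference
removes the table entry together with the memory it points to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_DICTS 8

/* Hypothetical stand-in for one entry of the shared hash table. */
typedef struct DictEntry
{
    unsigned int dict_id;   /* key: dictionary identifier */
    int          refcnt;    /* number of holders currently attached */
    void        *segment;   /* stand-in for the DSM segment with the dict */
    bool         in_use;    /* is this slot occupied? */
} DictEntry;

static DictEntry dict_table[MAX_DICTS];

/* Find the entry for dict_id, or return NULL if there is none. */
DictEntry *
dict_lookup(unsigned int dict_id)
{
    for (size_t i = 0; i < MAX_DICTS; i++)
        if (dict_table[i].in_use && dict_table[i].dict_id == dict_id)
            return &dict_table[i];
    return NULL;
}

/*
 * Attach to a dictionary, creating its entry on first use.
 * Returns NULL if the table is full or the allocation fails.
 */
DictEntry *
dict_attach(unsigned int dict_id, size_t size)
{
    DictEntry *e = dict_lookup(dict_id);

    if (e != NULL)
    {
        e->refcnt++;
        return e;
    }
    for (size_t i = 0; i < MAX_DICTS; i++)
    {
        if (!dict_table[i].in_use)
        {
            void *seg = malloc(size);

            if (seg == NULL)
                return NULL;
            dict_table[i].dict_id = dict_id;
            dict_table[i].refcnt = 1;
            dict_table[i].segment = seg;
            dict_table[i].in_use = true;
            return &dict_table[i];
        }
    }
    return NULL;
}

/* Drop one reference; the last release frees the segment AND the entry. */
void
dict_release(unsigned int dict_id)
{
    DictEntry *e = dict_lookup(dict_id);

    if (e == NULL)
        return;
    assert(e->refcnt > 0);
    if (--e->refcnt == 0)
    {
        free(e->segment);
        e->segment = NULL;
        e->in_use = false;      /* the entry itself goes away too */
    }
}

/* Number of entries currently present in the table. */
size_t
dict_count(void)
{
    size_t n = 0;

    for (size_t i = 0; i < MAX_DICTS; i++)
        if (dict_table[i].in_use)
            n++;
    return n;
}
```

In this sketch, freeing only e->segment without clearing in_use would
correspond to the old behaviour: the memory is returned, but the table
keeps growing with dead entries.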

An earlier version of the patch already had a refcnt:

https://www.postgresql.org/message-id/20180403115720.GA7450%40zakirov.localdomain

But I didn't fully understand how on_dsm_detach() works.

On 22.01.2019 22:17, Tomas Vondra wrote:
> I think there are essentially two ways:
>
> (a) Define max amount of memory available for shared dictionaries, and
> come up with an eviction algorithm. This will be tricky, because when
> the frequently-used dictionaries need a bit more memory than the limit,
> this will result in thrashing (evict+load over and over).
>
> (b) Define what "unused" means for dictionaries, and unload dictionaries
> that become unused. For example, we could track timestamp of the last
> time each dict was used, and decide that dictionaries unused for 5 or
> more minutes are unused. And evict those.
>
> The advantage of (b) is that it adapts automatically, more or less. When
> you have a bunch of frequently used dictionaries, the amount of shared
> memory increases. If you stop using them, it decreases after a while.
> And rarely used dicts won't force eviction of the frequently used ones.

I'm working on approach (b). I thought about using a priority queue
structure. There is no such ready-made structure in the PostgreSQL
sources except binaryheap.c, but it isn't designed for concurrent use.
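
As a rough illustration of approach (b), the sketch below tracks a
last-used timestamp per dictionary and unloads those idle for 5 minutes
or more. It is an assumption-laden toy, not the patch: the names
(TrackedDict, dict_touch, dict_evict_idle, dict_is_loaded) and the
IDLE_TIMEOUT_SECS constant are hypothetical, it uses a linear sweep over
a small array instead of a priority queue, and it has no locking, so it
says nothing about the concurrency problem mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define MAX_TRACKED       8
#define IDLE_TIMEOUT_SECS (5 * 60)  /* "unused for 5 or more minutes" */

/* Hypothetical bookkeeping for one loaded dictionary. */
typedef struct TrackedDict
{
    unsigned int dict_id;    /* dictionary identifier */
    time_t       last_used;  /* time of the most recent use */
    bool         loaded;     /* is the dictionary currently in memory? */
} TrackedDict;

/* Is the dictionary currently loaded? */
bool
dict_is_loaded(const TrackedDict *tab, size_t n, unsigned int dict_id)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].loaded && tab[i].dict_id == dict_id)
            return true;
    return false;
}

/*
 * Record a use of dict_id at time "now", loading it into a free slot
 * if it is not loaded yet. Returns false if there is no free slot.
 */
bool
dict_touch(TrackedDict *tab, size_t n, unsigned int dict_id, time_t now)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (tab[i].loaded && tab[i].dict_id == dict_id)
        {
            tab[i].last_used = now;
            return true;
        }
    }
    for (size_t i = 0; i < n; i++)
    {
        if (!tab[i].loaded)
        {
            tab[i].dict_id = dict_id;
            tab[i].last_used = now;
            tab[i].loaded = true;
            return true;
        }
    }
    return false;
}

/*
 * Unload every dictionary that has been idle for IDLE_TIMEOUT_SECS or
 * more as of "now". Returns the number of dictionaries evicted.
 */
size_t
dict_evict_idle(TrackedDict *tab, size_t n, time_t now)
{
    size_t evicted = 0;

    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (tab[i].loaded &&
            difftime(now, tab[i].last_used) >= IDLE_TIMEOUT_SECS)
        {
            tab[i].loaded = false;
            evicted++;
        }
    }
    return evicted;
}
```

Passing "now" explicitly keeps the sweep deterministic and easy to test.
A priority queue ordered by last_used would avoid scanning every entry
on each sweep, which is where a concurrent-safe structure would be
needed.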

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment                                         Content-Type  Size
0001-Fix-ispell-memory-handling-v18.patch          text/x-patch  1.5 KB
0002-Change-tmplinit-argument-v18.patch            text/x-patch  12.4 KB
0003-Retrieve-shared-location-for-dict-v18.patch   text/x-patch  21.4 KB
0004-Store-ispell-in-shared-location-v18.patch     text/x-patch  90.5 KB
