Re: [PROPOSAL] Shared Ispell dictionaries

From: Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Ildus Kurbangaliev <i(dot)kurbangaliev(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Date: 2018-03-02 11:19:25
Message-ID: 20180302111924.GB18933@zakirov.localdomain
Lists: pgsql-hackers

Hello,

Thank you for your comments.

On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> Hi,
>
> On 2018-02-07 19:28:29 +0300, Arthur Zakirov wrote:
> > + {
> > + {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
> > + gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
> > + gettext_noop("Currently controls only loading of Ispell dictionaries. "
> > + "If total size of simultaneously loaded dictionaries "
> > + "reaches the maximum allowed size then a new dictionary "
> > + "will be loaded into local memory of a backend."),
> > + GUC_UNIT_KB,
> > + },
> > + &max_shared_dictionaries_size,
> > + 100 * 1024, 0, MAX_KILOBYTES,
> > + NULL, NULL, NULL
> > + },
>
> So this uses shared memory, allocated at server start? That doesn't
> seem right. Wouldn't it make more sense to have a
> 'num_shared_dictionaries' GUC, and then allocate them with dsm? Or even
> better not have any such limit and use a dshash table to point to
> individual loaded tables?

The patch already uses dsm and a dshash table.
The 'max_shared_dictionaries_size' GUC was introduced after discussion with
Tomas [1], to limit the amount of memory consumed by loaded dictionaries and to
prevent possible memory bloat. Its default value is 100MB.
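
A minimal sketch of the limit described above, for illustration only. It
assumes a simple running total in kB; the DictBudget struct and the
dict_budget_reserve() function are hypothetical names and are not taken
from the patch. When the new dictionary does not fit, the caller would
fall back to loading it into backend-local memory, as the GUC
description says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical accounting state (not from the patch). In a real
 * implementation this would live in shared memory and be protected
 * by a lock.
 */
typedef struct DictBudget
{
	size_t		max_kb;		/* max_shared_dictionaries_size, in kB */
	size_t		used_kb;	/* total size of dictionaries loaded so far */
} DictBudget;

/*
 * Try to reserve room for a dictionary of dict_kb kilobytes.
 *
 * Returns true and updates the running total if the dictionary fits
 * under the limit.  Returns false otherwise, in which case the caller
 * is expected to load the dictionary into local memory instead.
 */
static bool
dict_budget_reserve(DictBudget *budget, size_t dict_kb)
{
	assert(budget->used_kb <= budget->max_kb);

	/* Compare against the remaining room to avoid overflow in the sum. */
	if (dict_kb > budget->max_kb - budget->used_kb)
		return false;

	budget->used_kb += dict_kb;
	return true;
}
```

With the default of 100MB, a 60MB dictionary would be accepted, a
following 50MB one would be rejected and loaded locally, and a 40MB one
would still fit.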

There was a 'shared_dictionaries' GUC before; it was introduced because
ordinary hash tables were used at that time, not dshash. I replaced the
ordinary hash tables with dshash, removed 'shared_dictionaries' and added
'max_shared_dictionaries_size'.

> Is there any chance we can instead convert dictionaries into a form
> we can just mmap() into memory? That'd scale a lot higher and more
> dynamically?

I think the new IspellDictData structure (in 0003-Store-ispell-structures-in-shmem-v5.patch)
could already be stored in a binary file and mapped into memory. But
mmap() is not used in this patch yet.
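
To illustrate the mmap() idea, here is a rough sketch under stated
assumptions. FlatDictHeader is a made-up layout, NOT the real
IspellDictData, and flat_dict_write(), flat_dict_map() and
FLAT_DICT_MAGIC are hypothetical names. The only point it shows is that
a structure which stores offsets instead of pointers can be written to a
file once and then mapped read-only at any address.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define FLAT_DICT_MAGIC 0x49535044u		/* arbitrary value for this sketch */

/* Hypothetical flat header; offsets are relative to the file start. */
typedef struct FlatDictHeader
{
	uint32_t	magic;			/* identifies the file format */
	uint32_t	nwords;			/* number of entries that follow */
	uint32_t	words_offset;	/* byte offset of the first entry */
} FlatDictHeader;

/* Write a header-only file; returns 0 on success, -1 on failure. */
static int
flat_dict_write(const char *path, uint32_t nwords)
{
	FlatDictHeader hdr = {FLAT_DICT_MAGIC, nwords, (uint32_t) sizeof(FlatDictHeader)};
	FILE	   *fp = fopen(path, "wb");

	if (fp == NULL)
		return -1;
	if (fwrite(&hdr, sizeof(hdr), 1, fp) != 1)
	{
		fclose(fp);
		return -1;
	}
	return fclose(fp) == 0 ? 0 : -1;
}

/* Map the file read-only; returns NULL on any failure. */
static const FlatDictHeader *
flat_dict_map(const char *path, size_t *mapped_len)
{
	struct stat st;
	void	   *addr;
	int			fd;

	assert(path != NULL && mapped_len != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || (size_t) st.st_size < sizeof(FlatDictHeader))
	{
		close(fd);
		return NULL;
	}

	addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);					/* the mapping stays valid after close() */
	if (addr == MAP_FAILED)
		return NULL;

	if (((const FlatDictHeader *) addr)->magic != FLAT_DICT_MAGIC)
	{
		munmap(addr, (size_t) st.st_size);
		return NULL;
	}

	*mapped_len = (size_t) st.st_size;
	return (const FlatDictHeader *) addr;
}
```

The caller is expected to munmap() the returned address with the
reported length when the dictionary is no longer needed.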

I can do some experiments and make a prototype.

1 - https://www.postgresql.org/message-id/d12d9395-922c-64c9-c87d-dd0e1d31440e%402ndquadrant.com

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
