From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Arthur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PROPOSAL] Shared Ispell dictionaries |
Date: | 2018-01-13 21:33:14 |
Message-ID: | 404f06c5-c7a3-3313-1a64-25f08ba089ad@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 01/13/2018 04:22 PM, Arthur Zakirov wrote:
> Hello,
>
> Thank you Tomas for your review.
>
> On Sat, Jan 13, 2018 at 03:25:55AM +0100, Tomas Vondra wrote:
>> allocate memory in the buildCxt. What about adding tmpstrdup to copy a
>> string into the context? I admit this is mostly nitpicking though.
>
> ... snip ...>
>> 8) One more thing - I've noticed that the hash table uses this key:
>> ...
>> That is, full paths to the two files, and I'm not sure that's a very
>> good idea. Firstly, it's a bit wasteful (1kB per path). But more
>> importantly it means all dictionaries referencing the same files will
>> share the same chunk of shared memory - not only within a single
>> database, but across the whole cluster. That may lead to surprising
>> behavior, because e.g. unloading a dictionary in one database will
>> affect dictionaries in all other databases referencing the same files.
>
> Hm, indeed. It's worth to use only file names instead full paths. And
> it is good idea to use more information besides file names. It can be
> Oid of a database and Oid of a namespace maybe, because a
> dictionary can be created in different schemas.
>
I doubt using filenames (without the directory paths) solves anything,
really. The keys still have to be MAXPGPATH because someone could create
very long filename. But I don't think memory consumption is such a big
deal, really. With 1000 dictionaries it's still just ~2MB of data, which
is negligible compared to the amount of memory saved by sharing the
dictionaries.
Not sure if we really need to add the database/schema OIDs. I mentioned
the unexpected consequences (cross-db sharing) but maybe that's a
feature we should keep (it reduces memory usage). So perhaps this should
be another CREATE TEXT SEARCH DICTIONARY parameter, allowing sharing the
dictionary with other databases?
Aren't we overengineering this?
> I think your proposals may be implemented in several patches, so they can
> be applyed independently but consistently. I suppose I will prepare new
> version of the patch with fixes and with initial design of new functions
> and commands soon.
>
Yes, splitting patches into smaller, more focused bits is a good idea.
BTW the current patch fails to document the dictionary sharing. It only
mentions it when describing the shared_dictionaries GUC. IMHO the right
place for additional details is
https://www.postgresql.org/docs/10/static/textsearch-dictionaries.html
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2018-01-13 21:52:26 | Re: proposal: alternative psql commands quit and exit |
Previous Message | Michael Meskes | 2018-01-13 20:17:45 | Re: [PATCH] ECPG bug fix in preproc when indicator struct is shorter than record struct |