Re: Hash tables in dynamic shared memory

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Hash tables in dynamic shared memory
Date: 2016-10-05 06:02:42
Message-ID: CABUevEz=fR9__vE0GMVsrEftmMCqHrne62KQh23vLx-k2_5hAQ@mail.gmail.com
Lists: pgsql-hackers

On Oct 5, 2016 1:23 AM, "Thomas Munro" <thomas(dot)munro(at)enterprisedb(dot)com>
wrote:
>
> On Wed, Oct 5, 2016 at 12:11 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > On Wed, Oct 5, 2016 at 11:22 AM, Andres Freund <andres(at)anarazel(dot)de>
wrote:
> >>> Potential use cases for DHT include caches, in-memory database objects
> >>> and working state for parallel execution.
> >>
> >> Is there a more concrete example, i.e. a user we'd convert to this at
> >> the same time as introducing this hashtable?
> >
> > A colleague of mine will shortly post a concrete patch to teach an
> > existing executor node how to be parallel aware, using DHT. I'll let
> > him explain.
> >
> > I haven't looked into whether it would make sense to convert any
> > existing shmem dynahash hash table to use DHT. The reason for doing
> > so would be to move it out to DSM segments and enable dynamically
> > growing. I suspect that the bounded size of things like the hash
> > tables involved in (for example) predicate locking is considered a
> > feature, not a bug, so any such cluster-lifetime core-infrastructure
> > hash table would not be a candidate. More likely candidates would be
> > ephemeral data used by the executor, as in the above-mentioned patch,
> > and long lived caches of dynamic size owned by core code or
> > extensions. Like a shared query plan cache, if anyone can figure out
> > the invalidation magic required.
>
> Another thought: it could be used to make things like
> pg_stat_statements not have to be in shared_preload_libraries.
>

That would indeed be a great improvement. And possibly also allow changing the max number of statements it can track without a restart?

I was also wondering if it might be useful as a replacement for some of the pgstats stuff, to get rid of the cost of spooling to file and then rebuilding the hash tables on the receiving end. I've been waiting for this patch to figure out if that's useful. I mean, keep the stats collector doing what it does now over UDP, but present the results in shared hash tables instead of files.

/Magnus
