Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From: knizhnik <knizhnik(at)garret(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, James Mansion <james(at)mansionfamily(dot)plus(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
Date: 2014-01-09 19:18:39
Message-ID: 52CEF60F.9070206@garret.ru
Lists: pgsql-announce pgsql-hackers

On 01/09/2014 09:22 PM, Robert Haas wrote:
> On Wed, Jan 8, 2014 at 2:39 PM, knizhnik <knizhnik(at)garret(dot)ru> wrote:
>> I wonder what is the intended use case of dynamic shared memory?
>> Is it primarily oriented toward PostgreSQL extensions, or will it also be used in
>> PostgreSQL core?
> My main motivation is that I want to use it to support parallel query.
> There is unfortunately quite a bit of work left to be done before we
> can make that a reality, but that's the goal.

I do not want to waste your time, but this topic is very interesting to
me, and I would be grateful if you could say a few words about how DSM
helps to implement parallel query processing.
It seems to me that the main complexity is in the optimizer: it needs to
split the query plan into several subplans that can be executed
concurrently and then merge their partial results.
As far as I understand, it is not possible to use multithreading for
parallel query execution because most of the PostgreSQL code is
non-reentrant. So these subplans have to be executed by several processes.
And unlike threads, the only efficient way for processes to exchange data
is shared memory. So it is clear why we need shared memory
for parallel query execution. But why does it have to be dynamic? Why can
it not be preallocated at start time, like most of the other resources used
by PostgreSQL?
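
To make the setup I have in mind concrete, here is a minimal sketch of
worker processes handing partial results to a parent through shared
memory. It is plain POSIX (fork plus an anonymous shared mmap), not
PostgreSQL code; parallel_sum and MAX_WORKERS are hypothetical names
invented for this illustration. The region is created at call time only
to keep the example self-contained.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

#define MAX_WORKERS 16          /* hypothetical limit for this sketch */

/*
 * Sum data[0..n) with nworkers child processes (hypothetical helper).
 * Each child writes one partial sum into a shared slot; the parent
 * merges the slots. Returns 0 and stores the total in *result on
 * success, -1 on failure.
 */
int
parallel_sum(const long *data, size_t n, int nworkers, long *result)
{
    pid_t   pids[MAX_WORKERS];
    int     started = 0;
    int     ok = 1;
    long    total = 0;
    size_t  size;
    long   *partial;

    assert(nworkers > 0 && nworkers <= MAX_WORKERS);

    /* One slot per worker; anonymous shared mappings start zero-filled. */
    size = (size_t) nworkers * sizeof(long);
    partial = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (partial == MAP_FAILED)
        return -1;

    for (int w = 0; w < nworkers; w++)
    {
        /* Split the input into nworkers contiguous ranges. */
        size_t  lo = n * (size_t) w / (size_t) nworkers;
        size_t  hi = n * (size_t) (w + 1) / (size_t) nworkers;
        pid_t   pid = fork();

        if (pid < 0)
        {
            ok = 0;
            break;
        }
        if (pid == 0)
        {
            /* Child: the input is inherited by fork; only the result is shared. */
            long    sum = 0;

            for (size_t i = lo; i < hi; i++)
                sum += data[i];
            partial[w] = sum;
            _exit(0);
        }
        pids[started++] = pid;
    }

    /* Wait for every child that was started before reading its slot. */
    for (int w = 0; w < started; w++)
    {
        int     status;

        if (waitpid(pids[w], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            ok = 0;
    }

    /* Merge step: combine the partial results. */
    for (int w = 0; w < nworkers; w++)
        total += partial[w];
    munmap(partial, size);

    if (!ok)
        return -1;
    *result = total;
    return 0;
}
```

A caller would fill an array, call parallel_sum(data, n, 4, &total) and
check the return value before using total.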

>
>> Maybe I am wrong, but I do not see any reason for the same extension to create
>> multiple DSM segments.
> Right.
>
>> And the total number of DSM segments is expected to be not very large (<10). The
>> same is true for the synchronization primitives (LWLocks, for example) needed to
>> synchronize access to these DSM segments. So I am not sure whether the ability to
>> place locks in DSM is really so critical...
>> We can just reserve some space for LWLocks which can be used by extensions,
>> so that LWLockAssign() can be used without RequestAddinLWLocks, or
>> RequestAddinLWLocks can be used not only from preloaded extensions.
> If you're doing all of this at postmaster startup time, that all works
> fine. If you want to be able to load up an extension on the fly, then
> it doesn't. You can only RequestAddinLWLocks() at postmaster start
> time, not afterwards, so currently any extension that wants to use
> lwlocks has to be loaded at postmaster startup time, or you're out of
> luck.
>
> Well. Technically we reserve something like 3 extra lwlocks that
> could be assigned later. But relying on those to be available is not
> very reliable, and also, 3 is not very many, considering that we have
> something north of 32k core lwlocks in the default configuration.

3 is definitely too few.
But you agreed with me that the number of DSM segments will not be very large.
And if we do not need fine-grained locking (and IMHO most extensions do not),
then we need just a few locks (most likely one) per DSM segment.
It means that if instead of 3 we reserve, let's say, 30 LWLocks, then it
will be enough for most extensions. And there will be almost no extra
resource overhead because, as you wrote, PostgreSQL has 32k locks in
the default configuration.
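
A rough sketch of what I mean by a reserved pool, written as standalone
C rather than against the real LWLock code. SketchLock, LockPool,
pool_init and pool_assign are hypothetical names for illustration only;
in a real server the counter would live in shared memory and be
protected by a spinlock, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RESERVED_LOCKS 30   /* the figure suggested above, not a PostgreSQL setting */

/* Stand-in for a real lock; only its identity matters in this sketch. */
typedef struct SketchLock
{
    int     held;
} SketchLock;

/* A fixed pool of locks set aside at startup for extensions loaded later. */
typedef struct LockPool
{
    SketchLock  locks[MAX_RESERVED_LOCKS];
    int         capacity;       /* number of slots this pool may hand out */
    int         assigned;       /* number of slots already handed out */
} LockPool;

/* Prepare a pool that can hand out up to capacity locks. */
void
pool_init(LockPool *pool, int capacity)
{
    assert(capacity >= 0 && capacity <= MAX_RESERVED_LOCKS);
    pool->capacity = capacity;
    pool->assigned = 0;
    for (int i = 0; i < MAX_RESERVED_LOCKS; i++)
        pool->locks[i].held = 0;
}

/*
 * Hand out the next reserved lock, or NULL when the pool is exhausted.
 * No registration at startup is needed by the caller.
 */
SketchLock *
pool_assign(LockPool *pool)
{
    if (pool->assigned >= pool->capacity)
        return NULL;
    return &pool->locks[pool->assigned++];
}
```

With a capacity of 3 the fourth request returns NULL; with 30, ten
extensions taking one lock each still leave twenty slots free.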

Certainly, if we need an independent lock for each page of DSM memory, then
there will be no other choice except placing the locks in the DSM segment
itself. But once again, I do not think that most extensions that need
shared memory will use such fine-grained locking.
