Re: Postgres with pthread

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres with pthread
Date: 2017-12-06 21:58:58
Message-ID: CAEepm=1T2CTdJu1jNnYH2mCFs-0Rqya0bd-B1FL2ET3p1J3BLA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 7, 2017 at 6:08 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-12-06 19:40:00 +0300, Konstantin Knizhnik wrote:
>> As far as I remember, several years ago when implementation of intra-query
>> parallelism was just started there was discussion whether to use threads or
>> leave traditional Postgres process architecture. The decision was made to
>> leave processes. So now we have bgworkers, shared message queue, DSM, ...
>> The main argument for such decision was that switching to threads will
>> require rewriting of most of Postgres code.
>
>> It seems to be quite a reasonable argument, and until now I agreed with it.
>>
>> But recently I wanted to check it myself.
>
> I think that's something pretty important to play with. There've been
> several discussions lately, both on and off list / in person, that we're
> taking on more-and-more technical debt just because we're using
> processes. Besides the above, we've grown:
> - a shared memory allocator
> - a shared memory hashtable
> - weird looking thread aware pointers
> - significant added complexity in various projects due to addresses not
> being mapped to the same address etc.

Yes, those are all workarounds for an ancient temporary design choice.
To quote from a 1989 paper[1] "Currently, POSTGRES runs as one process
for each active user. This was done as an expedient to get a system
operational as quickly as possible. We plan on converting POSTGRES to
use lightweight processes [...]". +1 for sticking to the plan.

While personally contributing to the technical debt items listed
above, I always imagined that all that machinery could become
compile-time options controlled with --with-threads, and
dsa_get_address() would melt away leaving only raw pointers, and
dsa_area would forward to the MemoryContext + ResourceOwner APIs, or
something like that. It's unfortunate that we lose type safety along
the way though. (If only there were some way we could write
dsa_pointer<my_type>. In fact it was also a goal of the original
project to adopt C++, based on a comment in 4.2's nodes.h: "Eventually
this code should be transmogrified into C++ classes, and this is more
or less compatible with those things.")
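
A minimal sketch of how the address decoding could melt away behind a build switch. Everything here is hypothetical and for illustration only: DEMO_WITH_THREADS stands in for a --with-threads option, and demo_area, demo_dsa_pointer, demo_make_pointer() and demo_get_address() are made-up names loosely modeled on the DSA idea of a segment number plus offset, not the real dsa.c code. The 40-bit offset split is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_SEGMENTS 16

/* Hypothetical area: per-process base address of each mapped segment. */
typedef struct demo_area
{
    char *segment_base[DEMO_MAX_SEGMENTS];
} demo_area;

#ifdef DEMO_WITH_THREADS        /* e.g. compile with -DDEMO_WITH_THREADS */

/* One address space: the "DSA pointer" is just a raw pointer. */
typedef void *demo_dsa_pointer;

static inline demo_dsa_pointer
demo_make_pointer(demo_area *area, size_t seg, size_t off)
{
    return area->segment_base[seg] + off;
}

static inline void *
demo_get_address(demo_area *area, demo_dsa_pointer dp)
{
    (void) area;                /* nothing to decode */
    return dp;
}

#else

/* Multi-process: encode (segment, offset) because bases differ per process. */
typedef uint64_t demo_dsa_pointer;
#define DEMO_OFFSET_BITS 40     /* assumed split, for illustration */

static inline demo_dsa_pointer
demo_make_pointer(demo_area *area, size_t seg, size_t off)
{
    (void) area;
    return ((demo_dsa_pointer) seg << DEMO_OFFSET_BITS) | (demo_dsa_pointer) off;
}

static inline void *
demo_get_address(demo_area *area, demo_dsa_pointer dp)
{
    size_t seg = (size_t) (dp >> DEMO_OFFSET_BITS);
    size_t off = (size_t) (dp & ((UINT64_C(1) << DEMO_OFFSET_BITS) - 1));

    assert(seg < DEMO_MAX_SEGMENTS);
    return area->segment_base[seg] + off;
}

#endif
```

Callers use demo_get_address() the same way in both builds. Either way the result is a void *, which is the type-safety loss mentioned above.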

If there were a good way to reserve (but not map) a large address
range before forking, there could also be an intermediate build mode
that keeps the multi-process model but where DSA behaves as above,
which might be an interesting way to decouple the
DSA-go-faster-and-reduce-tech-debt project from the threading project.
We could manage the reserved address space ourselves and map DSM
segments with MAP_FIXED, so dsa_get_address() address decoding could
be compiled away. One way would be to mmap a huge range backed with
/dev/zero, map segments over the top of it with MAP_FIXED, and then
remap /dev/zero back into place when finished. But that sucks because
it puts the whole mapping in your core files and relies on overcommit,
which we don't like, hence my interest in a way to reserve but not map.
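
A rough sketch of the mechanics described above, assuming a Linux-style mmap(). It is not a solution to the "reserve but not map" wish: the placeholder is an anonymous PROT_NONE mapping used here as a stand-in for the /dev/zero mapping, and the "segment" is an anonymous shared mapping where a real DSM segment would come from a file descriptor. All demo_* names and the sizes are hypothetical.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Claim a range of addresses with an inaccessible placeholder mapping. */
static void *
demo_reserve(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Map a segment at a chosen (page-aligned) offset inside the reservation. */
static void *
demo_attach(void *base, size_t off, size_t len)
{
    void *p = mmap((char *) base + off, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Put the placeholder back over a segment we are done with. */
static int
demo_detach(void *addr, size_t len)
{
    void *p = mmap(addr, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return p == MAP_FAILED ? -1 : 0;
}

/* Reserve, attach at a fixed offset, touch the memory, detach. 0 on success. */
static int
demo_roundtrip(void)
{
    const size_t reserve_len = (size_t) 256 << 20;  /* 256 MB of addresses */
    const size_t seg_off = (size_t) 16 << 20;
    const size_t seg_len = (size_t) 1 << 20;
    char *base = demo_reserve(reserve_len);
    char *seg;
    int ok;

    if (base == NULL)
        return -1;
    seg = demo_attach(base, seg_off, seg_len);
    if (seg == NULL)
    {
        munmap(base, reserve_len);
        return -1;
    }
    seg[0] = 42;
    /* MAP_FIXED places the segment exactly where we asked. */
    ok = (seg == base + seg_off) && (seg[0] == 42);
    if (demo_detach(seg, seg_len) != 0)
        ok = 0;
    munmap(base, reserve_len);
    return ok ? 0 : -1;
}
```

If every process reserved the same range before forking and attached segments at agreed offsets, a segment would sit at the same address everywhere, which is what would let the address decoding be compiled away.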

>> The first problem with porting Postgres to pthreads is static variables
>> widely used in Postgres code.
>> Most modern compilers support thread local variables, for example GCC
>> provides the __thread keyword.
>> Such variables are placed in a separate segment which is addressed through
>> a segment register (on Intel).
>> So access time to such variables is the same as to normal static variables.
>
> I experimented similarly. Although I'm not 100% sure that if we were to go
> for it, we wouldn't instead want to abstract our session concept
> further, or well, at all.

Using a ton of thread local variables may be a useful stepping stone,
but if we want to be able to separate threads/processes from sessions
eventually then I guess we'll want to model sessions as first class
objects and pass them around explicitly or using a single TLS variable
current_session.
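
A small sketch of the two options, assuming GCC's __thread and pthreads. DemoSession and the demo_* functions are hypothetical; only the name current_session comes from the text above. One function takes the session explicitly, the other finds it through the single thread-local pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical session object holding what would otherwise be globals. */
typedef struct DemoSession
{
    int id;
    int observed_id;            /* filled in by the worker, for checking */
} DemoSession;

/* The single thread-local variable: which session this thread is serving. */
static __thread DemoSession *current_session = NULL;

/* Style 1: the session is passed around explicitly. */
static int
demo_session_id(const DemoSession *session)
{
    return session->id;
}

/* Style 2: the session is found through the TLS pointer. */
static int
demo_current_id(void)
{
    return current_session ? current_session->id : -1;
}

static void *
demo_worker(void *arg)
{
    DemoSession *session = arg;

    current_session = session;  /* bind this thread to its session */
    session->observed_id = demo_current_id();
    assert(session->observed_id == demo_session_id(session));
    return NULL;
}

/* Run two threads with different sessions. Returns 0 if each saw its own. */
static int
demo_run(void)
{
    DemoSession a = {1, -1};
    DemoSession b = {2, -1};
    pthread_t ta;
    pthread_t tb;

    if (pthread_create(&ta, NULL, demo_worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, demo_worker, &b) != 0)
    {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    /* The calling thread never bound a session, so its pointer is still NULL. */
    return (a.observed_id == 1 && b.observed_id == 2 &&
            current_session == NULL) ? 0 : -1;
}
```

Because the binding is one pointer, a thread could switch to serving another session by reassigning current_session, which a pile of separate thread-local variables would not allow so easily.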

> I think the biggest problem with doing this for real is that it's a huge
> project, and that it'll take a long time.
>
> Thanks for working on this!

+1

[1] http://db.cs.berkeley.edu/papers/ERL-M90-34.pdf

--
Thomas Munro
http://www.enterprisedb.com
