Postgres with pthread

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Postgres with pthread
Date: 2017-12-06 16:40:00
Message-ID: 9defcb14-a918-13fe-4b80-a0b02ff85527@postgrespro.ru
Lists: pgsql-hackers

Hi hackers,

As far as I remember, several years ago, when the implementation of
intra-query parallelism had just started, there was a discussion about whether
to use threads or keep the traditional Postgres process architecture. The
decision was made to keep processes. So now we have bgworkers, shared
message queues, DSM, ...
The main argument for that decision was that switching to threads would
require rewriting most of the Postgres code.
It seems to be quite a reasonable argument, and until now I agreed with it.

But recently I wanted to check it myself.
The first problem with porting Postgres to pthreads is the static variables
that are widely used in the Postgres code.
Most modern compilers support thread-local variables; for example, GCC
provides the __thread keyword.
Such variables are placed in a separate segment which is addressed through
a segment register (on Intel).
So the access time for such variables is the same as for normal static
variables.
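
To illustrate the __thread behaviour described above, here is a minimal sketch. It is not taken from the prototype: session_counter, worker and run_tls_demo are hypothetical names, and it assumes GCC or Clang with pthreads. Each thread increments its own copy of the variable, so the two threads end up with different values.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread counter: __thread gives every thread its own copy. */
static __thread int session_counter = 0;

static void *worker(void *arg)
{
    int increments = *(int *) arg;

    for (int i = 0; i < increments; i++)
        session_counter++;      /* touches only this thread's copy */

    /* Hand the thread's private final value back to the joiner. */
    return (void *) (intptr_t) session_counter;
}

/*
 * Runs two threads with different increment counts and stores each
 * thread's private final value. Returns 0 on success, -1 on failure.
 */
int run_tls_demo(int n1, int n2, long *out1, long *out2)
{
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;

    if (pthread_create(&t1, NULL, worker, &n1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &n2) != 0)
    {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);

    *out1 = (long) (intptr_t) r1;
    *out2 = (long) (intptr_t) r2;
    return 0;
}
```

With a plain static int and no locking, the two threads would race on one shared counter; with __thread, no synchronization is needed because nothing is shared.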

Certainly, maybe not all compilers have built-in support for TLS, and maybe
it is not implemented as efficiently on all hardware platforms as
on Intel.
So this approach certainly decreases the portability of Postgres. But IMHO
it is not so critical.

What I have done:
1. Added session_local (defined as __thread) to the definitions of most
static and global variables.
I left some variables that point to shared memory as static. I also had
to change the initialization of some static variables,
because the address of a TLS variable cannot be used in static initializers.
2. Changed the implementation of GUCs to make them thread-specific.
3. Replaced fork() with pthread_create().
4. Rewrote the file descriptor cache to be global (shared by all threads).
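
A rough sketch of what items 1 and 3 can look like in isolation. This is not code from the prototype: only the session_local name comes from the description above, while work_mem_kb, current_setting, init_session_state, backend_main, start_backend and main_thread_work_mem are hypothetical names chosen for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define session_local __thread  /* as described in item 1 */

/* Hypothetical per-session state, not actual Postgres variables. */
static session_local int  work_mem_kb = 4096;

/*
 * With a plain static this could be written "= &work_mem_kb". The address
 * of a TLS variable is not a constant expression, so the pointer has to be
 * assigned at run time instead.
 */
static session_local int *current_setting;

static void init_session_state(void)
{
    current_setting = &work_mem_kb;
}

/* Thread entry point standing in for the backend's main function. */
static void *backend_main(void *arg)
{
    init_session_state();
    *current_setting = *(int *) arg;    /* changes only this thread's copy */
    return (void *) (intptr_t) work_mem_kb;
}

/* Where a process-based postmaster would call fork(), start a thread. */
int start_backend(pthread_t *tid, int *setting)
{
    return pthread_create(tid, NULL, backend_main, setting);
}

/* The calling thread still sees its own, untouched copy. */
int main_thread_work_mem(void)
{
    init_session_state();
    return *current_setting;
}
```

One consequence of this pattern is that every thread must run its own initialization step before touching such pointers, since a thread-local pointer starts out as NULL in each new thread.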

I have not changed any of the Postgres synchronization primitives or the
shared memory.
It took me about one week of work.

What is not done yet:
1. Handling of signals (I expect that Win32 code can be somehow reused
here).
2. Deallocation of memory and closing files on backend (thread) termination.
3. Interaction of postmaster and backends with PostgreSQL auxiliary
processes (threads), such as autovacuum, bgwriter, checkpointer, stat
collector,...

What are the advantages of using threads instead of processes?

1. No need to use shared memory. So there is no static limit on the amount
of memory which can be used by Postgres. No need for dynamic shared
memory (DSM) and the other machinery designed to share memory between
backends and bgworkers.
2. Threads significantly simplify the implementation of parallel algorithms:
interaction and data transfer between threads can be done more easily and
more efficiently.
3. It is possible to use more efficient/lightweight synchronization
primitives. Postgres now mostly relies on its own low-level
sync primitives, whose user-level implementation
uses spinlocks and atomics and then falls back to OS semaphores/poll.
I am not sure how much we can gain by replacing these primitives with
ones optimized for threads.
My colleague from the Firebird community told me that just replacing
processes with threads can give a 20% increase in performance, but that it is
just the first step, and replacing the sync primitives
can give a much greater advantage. But maybe for Postgres, with its low-level
primitives, this is not true.
4. Threads are more lightweight entities than processes. A context switch
between threads takes less time than one between processes. And they consume
less memory. It is usually possible to spawn more threads than processes.
5. More efficient access to virtual memory. Since all threads
share the same address space, the TLB is used much more efficiently.
6. Faster backend startup. Certainly, starting a backend for each user
request is a bad thing in any case, and some kind of connection pooling
should be used to provide acceptable performance. Still,
starting a new backend process in Postgres causes a lot of page faults,
which have a dramatic impact on performance. And there is no such
problem with threads.
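
As a small illustration of points 1 and 2, the sketch below sums an array with several worker threads. It is not Postgres code: SliceTask, sum_slice and parallel_sum are hypothetical names. The point is only that workers receive an ordinary pointer into the caller's memory and write their partial results into a plain struct, with no shared memory segment or message queue involved.

```c
#include <pthread.h>
#include <stdlib.h>

/* Work description for one thread (hypothetical, for illustration). */
typedef struct
{
    const int *data;    /* ordinary pointer, valid in every thread */
    size_t     begin;   /* first index of the slice */
    size_t     end;     /* one past the last index of the slice */
    long       sum;     /* partial result written by the worker */
} SliceTask;

static void *sum_slice(void *arg)
{
    SliceTask *task = arg;

    task->sum = 0;
    for (size_t i = task->begin; i < task->end; i++)
        task->sum += task->data[i];
    return NULL;
}

/*
 * Sums data[0..n) using nworkers threads. Returns -1 on failure, so it is
 * only suitable for inputs whose real sum cannot be -1.
 */
long parallel_sum(const int *data, size_t n, int nworkers)
{
    pthread_t *tids;
    SliceTask *tasks;
    long       total = 0;
    int        started = 0;
    int        failed = 0;

    if (nworkers <= 0)
        return -1;

    tids = malloc((size_t) nworkers * sizeof(*tids));
    tasks = malloc((size_t) nworkers * sizeof(*tasks));
    if (tids == NULL || tasks == NULL)
    {
        free(tids);
        free(tasks);
        return -1;
    }

    for (int w = 0; w < nworkers; w++)
    {
        tasks[w].data = data;
        tasks[w].begin = n * (size_t) w / (size_t) nworkers;
        tasks[w].end = n * (size_t) (w + 1) / (size_t) nworkers;
        if (pthread_create(&tids[w], NULL, sum_slice, &tasks[w]) != 0)
        {
            failed = 1;
            break;
        }
        started++;
    }

    /* Joining makes each worker's partial sum visible to this thread. */
    for (int w = 0; w < started; w++)
    {
        pthread_join(tids[w], NULL);
        total += tasks[w].sum;
    }

    free(tids);
    free(tasks);
    return failed ? -1 : total;
}
```

With processes, the input array and the result slots would both have to live in a shared segment that is set up before the workers start.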

Certainly, processes also have some advantages compared with threads:
1. Better isolation and error protection
2. Easier error handling
3. Easier control of used resources

But this is theory. The main idea of this prototype was to prove or
disprove these expectations in practice.
I didn't expect large differences in performance, because the synchronization
primitives are not changed and I performed my experiments on Linux, where
threads and processes are implemented in a similar way.

Below are some results (in thousands of TPS) of select-only (-S) pgbench with
scale 100 on my desktop with a quad-core i7-4770 3.40GHz and 16GB of RAM:

Connections   Vanilla/default   Vanilla/prepared   pthreads/default   pthreads/prepared
10            100               191                106                207
100           67                131                105                168
1000          41                65                 55                 102

As you can see, for a small number of connections the results are almost
the same. But for a large number of connections pthreads show less
degradation.
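
To put a number on the degradation, one can compute how much of the 10-connection throughput remains at 1000 connections. The sketch below uses only the figures from the table; BenchRow, rows, retained_pct and print_retention are hypothetical names. It gives roughly 41% and 34% for vanilla (default/prepared) against roughly 52% and 49% for pthreads.

```c
#include <stdio.h>

/* One configuration from the table; TPS values are in thousands. */
typedef struct
{
    const char *config;
    double      tps_10;     /* throughput at 10 connections */
    double      tps_1000;   /* throughput at 1000 connections */
} BenchRow;

static const BenchRow rows[] = {
    {"Vanilla/default",   100.0,  41.0},
    {"Vanilla/prepared",  191.0,  65.0},
    {"pthreads/default",  106.0,  55.0},
    {"pthreads/prepared", 207.0, 102.0},
};

/* Percentage of the 10-connection throughput left at 1000 connections. */
double retained_pct(double tps_10, double tps_1000)
{
    return 100.0 * tps_1000 / tps_10;
}

/* Prints one line per configuration. */
void print_retention(void)
{
    for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%-18s %5.1f%%\n", rows[i].config,
               retained_pct(rows[i].tps_10, rows[i].tps_1000));
}
```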

You can look at my prototype here:
https://github.com/postgrespro/postgresql.pthreads.git

But please note that it is a very raw prototype. A lot of stuff is not
working yet. Supporting all of the existing Postgres functionality requires
much more effort (and even more effort is needed to optimize
Postgres for this architecture).

I just want to receive some feedback and know whether the community is
interested in any further work in this direction.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
