Re: Postgres with pthread

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Postgres with pthread
Date: 2017-12-21 13:46:12
Message-ID: CAFj8pRB39Hyeu-1wohOzx6icH8+dS0dNdqc5Pi-JRYS=aj50Pw@mail.gmail.com
Lists: pgsql-hackers

2017-12-21 14:25 GMT+01:00 Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>:

> I am continuing experiments with my pthread prototype.
> The latest results are the following:
>
> 1. I have eliminated all (I hope) calls of non-reentrant functions
> (getopt, setlocale, setitimer, localtime, ...), so the parallel tests
> now pass.
>
> 2. I have implemented deallocation of the top memory context (at thread
> exit) and cleanup of all open file descriptors.
> I had to replace several places where malloc is used with top_malloc:
> allocation in the top context.
>
> 3. My prototype now passes all regression tests, but error handling is
> still far from complete.
>
> 4. I have performed experiments with replacing the synchronization
> primitives used in Postgres with their pthread analogues.
> Unfortunately this has almost no influence on performance.
>
> 5. Handling a large number of connections.
> The maximal number of postgres connections is almost the same: 100k.
> But the memory footprint in the pthread case was significantly smaller:
> 18GB vs 38GB.
> And the difference in performance was much larger: 60k TPS vs 600k TPS.
> Compare this with the performance for 10k clients: 1300k TPS.
> This is a read-only pgbench -S test with 1000 connections per pgbench
> instance: since pgbench doesn't allow more than 1000 clients to be
> specified, I spawned several instances of pgbench.
>
> Why is handling a large number of connections important?
> It allows applications to access postgres directly, without pgbouncer or
> any other external connection pooling tool.
> In this case an application can use prepared statements, which can speed
> up simple queries almost twice.
>

As far as I know, MySQL does not have good experience with a high number
of threads - that is why there is a thread pool in the enterprise version
(and now in MariaDB too).

Regards

Pavel

> Unfortunately Postgres sessions are not lightweight. Each backend
> maintains its private catalog and relation caches, prepared statement
> cache, ...
> For a real database the size of these caches in memory will be several
> megabytes, and warming these caches can take a significant amount of
> time.
> So if we really want to support a large number of connections, we should
> rewrite the caches to be global (shared).
> This would save a lot of memory but add synchronization overhead.
> Also, on NUMA machines private caches may be more efficient than one
> global cache.
>
> My prototype can be found at:
> git://github.com/postgrespro/postgresql.pthreads.git
>
>
> --
>
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
