Re: Let's make PostgreSQL multi-threaded

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Thomas Kellerer <shammat(at)gmx(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-08 10:37:37
Message-ID: 3d8ffaa5-b9a1-9538-9ac3-ffa751449f4b@enterprisedb.com
Lists: pgsql-hackers

On 6/8/23 01:37, Thomas Munro wrote:
> On Thu, Jun 8, 2023 at 10:37 AM Jeremy Schneider
> <schneider(at)ardentperf(dot)com> wrote:
>> On 6/7/23 2:39 PM, Thomas Kellerer wrote:
>>> Tomas Vondra schrieb am 07.06.2023 um 21:20:
>>>> Also, which other projects did this transition? Is there something we
>>>> could learn from them? Were they restricted to much smaller list of
>>>> platforms?
>>>
>>> Not open source, but Oracle was historically multi-threaded on Windows
>>> and multi-process on all other platforms.
>>> I _think_ starting with 19c you can optionally run it multi-threaded on
>>> Linux as well.
>> Looks like it actually became publicly available in 12c. AFAICT Oracle
>> supports both modes today, with a config parameter to switch between them.
>
> It's old, but this describes the 4 main models and which well known
> RDBMSes use them in section 2.3:
>
> https://dsf.berkeley.edu/papers/fntdb07-architecture.pdf
>
> TL;DR DB2 is the winner, it can do process-per-connection,
> thread-per-connection, process-pool or thread-pool.
>

I think the basic architectures are known, especially from the user
perspective. I'm more interested in challenges the projects faced while
moving from one architecture to the other, or how / why they support
more than just one, etc.

In [1] Heikki argued that:

    I don't think this is worth it, unless we plan to eventually remove
    the multi-process mode. ... As long as you need to also support
    processes, you need to code to the lowest common denominator and
    don't really get the benefits.

But these projects clearly support multiple architectures, and have no
intention to ditch some of them. So how did they do that? Surely they
think there are benefits.

One option would be to just have separate code paths for processes and
threads, but the effort required to maintain and improve both would be
deadly. So the only feasible option seems to be that they managed to
abstract the subsystems enough for the "regular" code to not care about
the model.
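
To make that idea concrete, here is a minimal sketch of what such an
abstraction might look like. It is not taken from PostgreSQL or from any
of the products mentioned; the names (ExecModel, Worker, worker_start,
worker_wait, demo_worker) are all hypothetical, and it assumes a POSIX
system with fork() and pthreads. The caller starts and waits for a
worker through one interface, and only the two small functions know
whether a process or a thread is behind it.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; illustration only. */
typedef enum { MODEL_PROCESS, MODEL_THREAD } ExecModel;

typedef int (*WorkerMain)(int arg);

typedef struct Worker {
    ExecModel  model;   /* how this worker was started */
    pid_t      pid;     /* valid when model == MODEL_PROCESS */
    pthread_t  tid;     /* valid when model == MODEL_THREAD */
    WorkerMain fn;      /* entry point, same signature in both models */
    int        arg;
    int        result;  /* filled in by the thread trampoline */
} Worker;

/* Adapts the pthread entry-point signature to WorkerMain. */
static void *thread_trampoline(void *p)
{
    Worker *w = p;
    w->result = w->fn(w->arg);
    return NULL;
}

/* Start a worker in the requested model; 0 on success, -1 on failure. */
int worker_start(Worker *w, ExecModel model, WorkerMain fn, int arg)
{
    w->model = model;
    w->fn = fn;
    w->arg = arg;
    w->result = -1;

    if (model == MODEL_THREAD)
        return pthread_create(&w->tid, NULL, thread_trampoline, w) == 0 ? 0 : -1;

    w->pid = fork();
    if (w->pid == 0)
        _exit(fn(arg) & 0xff);  /* child: result travels via exit status (0..255) */
    return w->pid > 0 ? 0 : -1;
}

/* Wait for the worker and return its result, or -1 on failure. */
int worker_wait(Worker *w)
{
    int status;

    if (w->model == MODEL_THREAD)
        return pthread_join(w->tid, NULL) == 0 ? w->result : -1;

    if (waitpid(w->pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Example "regular" code: it has no idea which model runs it. */
static int demo_worker(int arg)
{
    return arg + 1;
}
```

A caller would write worker_start(&w, model, demo_worker, 41) followed
by worker_wait(&w), with model chosen from configuration. Even this toy
shows where the lowest-common-denominator problem comes from: the
process variant can only pass back a small exit status, while the thread
variant could share arbitrary memory, so the interface ends up limited
to what both models can do.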

[1]
https://www.postgresql.org/message-id/6e3082dc-ff29-9cbf-847e-5f570828b46b@iki.fi

> I understand this thread to be about thread-per-connection (= backend,
> session, socket) for now.

Maybe, although people also proposed switching parallel query to
threads (so that'd be multiple threads per session). But I don't think
it really matters; the concerns are mostly about moving from one
architecture to another and/or supporting both.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
