Re: Let's make PostgreSQL multi-threaded

From: James Addison <jay(at)jp-hosting(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-14 19:15:37
Message-ID: CALDQ5Nxxj_9Yddo-0XrmxHJdHqaJf4jj=4Y4DQiXwN-Ci5HqDA@mail.gmail.com
Lists: pgsql-hackers

On Mon, 12 Jun 2023 at 20:24, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2023-06-12 16:23:14 +0400, Pavel Borisov wrote:
> > Is the following true or not?
> >
> > 1. If we switch processes to threads but leave the amount of session
> > local variables unchanged, there would be hardly any performance gain.
>
> False.
>
>
> > 2. If we move some backend's local variables into shared memory then
> > the performance gain would be very near to what we get with threads
> > having equal amount of session-local variables.
>
> False.
>
>
> > In other words, the overall goal in principle is to gain from less
> > memory copying wherever it doesn't add the burden of locks for
> > concurrent variables access?
>
> False.
>
> Those points seems pretty much unrelated to the potential gains from switching
> to a threading model. The main advantages are:

I think that they're practical performance-related questions about the
benefits of performing a technical migration that could involve
significant development time, take years to complete, and uncover
problems that cause reliability issues for a stable, proven database
management system.

> 1) We'd gain from being able to share state more efficiently (using normal
> pointers) and more dynamically (not needing to pre-allocate). That'd remove
> a good amount of complexity. As an example, consider the work we need to do
> to ferry tuples from one process to another. Even if we just continue to
> use shm_mq, in a threading world we could just put a pointer in the queue,
> but have the tuple data be shared between the processes etc.
>
> Eventually this could include removing the 1:1 connection<->process/thread
> model. That's possible to do with processes as well, but considerably
> harder.

This reads like a code quality argument: that's worthwhile, but I
don't see how it supports your 'False' assertions. Do two queries
running in separate processes spend much time allocating and waiting
on resources that could be shared between threads in a single
process?

> 2) Making context switches cheaper / sharing more resources at the OS and
> hardware level.

That seems valid. Even so, I would expect that for many queries, I/O
access and row processing time make up the bulk of the work, and that
context switches to/from other query processes are relatively
negligible.
