Re: Let's make PostgreSQL multi-threaded

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-08-25 12:01:23
Message-ID: c8300886-9353-69de-6b62-861cbc484e1b@gmail.com
Lists: pgsql-hackers

Hi,

On 8/11/23 14:05, Merlin Moncure wrote:
> On Thu, Jul 27, 2023 at 8:28 AM David Geier <geidav(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On 6/7/23 23:37, Andres Freund wrote:
> > I think we're starting to hit quite a few limits related to the
> > process model, particularly on bigger machines. The overhead of
> > cross-process context switches is inherently higher than switching
> > between threads in the same process - and my suspicion is that that
> > overhead will continue to increase. Once you have a significant
> > number of connections we end up spending a *lot* of time in TLB
> > misses, and that's inherent to the process model, because you can't
> > share the TLB across processes.
>
> Another problem I haven't seen mentioned yet is the excessive kernel
> memory usage because every process has its own set of page table
> entries (PTEs). Without huge pages the amount of wasted memory can be
> huge if shared buffers are big.
>
>
> Hm, I noted this upthread, but asking again: does this improve
> interactions with the operating system and make OOM-kill situations
> less likely? These things are the bane of my existence, and I'm
> having a hard time finding a solution that prevents them other than
> running pgbouncer and lowering max_connections, which adds
> complexity. I suspect I'm not the only one dealing with this.
> What's really scary about these situations is that they come without
> warning. Here's a pretty typical example per sar -r.
>
> The conjecture here is that lots of idle connections leave the server
> with less memory actually available than it appears to have, and
> sudden transient demands can cause it to destabilize.

It does, in the sense that your server will have more memory available
when many long-lived connections are around: with threads, each
connection carries less kernel memory overhead, because the page tables
are shared instead of duplicated per process. Of course, even then a
runaway query can still invoke the OOM killer. The unfortunate thing
about the OOM killer is that, in my experience, it often kills the
checkpointer. That's because the checkpointer touches all of shared
buffers over time, which inflates its resident set size and makes it
likely to get selected by the OOM killer. Have you tried disabling
memory overcommit?
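
To put a rough number on the per-process page table overhead mentioned
above, here is a back-of-the-envelope sketch in Python. It is only an
estimate under stated assumptions: x86-64 with 8-byte page table
entries, 4 KiB base pages versus 2 MiB huge pages, only the lowest
level of the page table counted, and the worst case where every backend
has touched every page of shared buffers. The 32 GiB of shared buffers
and the 1000 backends are made-up illustration values, not
measurements, and pte_overhead_bytes is a hypothetical helper name.

```python
# Rough estimate of kernel memory spent on page table entries (PTEs)
# for shared buffers in a process-per-connection model.
# Assumptions (not taken from any measurement): x86-64, 8-byte entries,
# only the lowest page table level is counted, and every backend has
# touched all of shared buffers (worst case).

PTE_SIZE = 8                      # bytes per page table entry on x86-64
BASE_PAGE = 4 * 1024              # 4 KiB base page
HUGE_PAGE = 2 * 1024 ** 2         # 2 MiB huge page


def pte_overhead_bytes(shared_buffers_bytes, n_backends, page_size=BASE_PAGE):
    """Return the total bytes of page table entries across all backends."""
    entries_per_backend = -(-shared_buffers_bytes // page_size)  # ceiling division
    return entries_per_backend * PTE_SIZE * n_backends


shared_buffers = 32 * 1024 ** 3   # hypothetical: 32 GiB of shared buffers
backends = 1000                   # hypothetical: 1000 long-lived connections

small = pte_overhead_bytes(shared_buffers, backends)
huge = pte_overhead_bytes(shared_buffers, backends, page_size=HUGE_PAGE)

print(f"4 KiB pages: {small / 1024 ** 3:.1f} GiB of page tables")
print(f"2 MiB pages: {huge / 1024 ** 2:.1f} MiB of page tables")
```

In this worst case the page tables alone come to 62.5 GiB with 4 KiB
pages, versus 125 MiB with 2 MiB huge pages. In a threaded model the
page tables for shared buffers would exist once per process rather than
once per connection, so the n_backends factor would drop out.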

--
David Geier
(ServiceNow)
