Re: Let's make PostgreSQL multi-threaded

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: knizhnik(at)garret(dot)ru
Cc: pashkin(dot)elfe(at)gmail(dot)com, dilipbalaut(at)gmail(dot)com, hannuk(at)google(dot)com, hlinnaka(at)iki(dot)fi, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-13 08:46:58
Message-ID: 20230613.174658.548424684295647548.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 13 Jun 2023 11:20:56 +0300, Konstantin Knizhnik <knizhnik(at)garret(dot)ru> wrote in
>
>
> On 13.06.2023 10:55 AM, Kyotaro Horiguchi wrote:
> > At Tue, 13 Jun 2023 09:55:36 +0300, Konstantin Knizhnik
> > <knizhnik(at)garret(dot)ru> wrote in
> >> Postgres backend is "thick" not because of large number of local
> >> variables.
> >> It is because of local caches: catalog cache, relation cache, prepared
> >> statements cache,...
> >> If they are not rewritten, then a backend may still consume a lot of
> >> memory even if it is a thread rather than a process.
> >> But threads simplify development of global caches, although it can be
> >> done with DSM.
> > With the process model, that local stuff is flushed out upon
> > reconnection. If we switch to the thread model, we will need an
> > expiration mechanism for that stuff.
>
> We already have an invalidation mechanism. It will also be used in the
> case of a shared cache, but we will not need to send invalidations to
> all backends.

Invalidation is not expiration.

> I do not completely understand your point.
> Right now caches (for example the catalog cache) are not limited at all.
> So if you have a very large database schema, then this cache will
> consume a lot of memory (multiplied by the number of
> backends). The fact that it is flushed out upon reconnection cannot
> help much: what if backends are not going to disconnect?

Right now, if one out of many backends builds a huge system catalog
cache, it is cleared upon disconnection. The same client can
repeat this process, but users can ensure such situations don't
persist. With the thread model, however, we will no longer be able to
clear the parts of the cache that the active backends don't need.
(Of course, with threads we can avoid duplication, though.)

> In case of shared cache we will have to address the same problem:
> whether this cache should be limited (with some replacement discipline
> as LRU).
> Or it is unlimited. In case of shared cache, size of the cache is less
> critical because it is not multiplied by number of backends.

Yes.
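
To make "limited, with some replacement discipline such as LRU"
concrete, here is a minimal Python sketch. It is not PostgreSQL code:
the class name, the max_entries limit and the load callback are all
hypothetical, and a real shared cache would also need locking and
invalidation handling, which are left out here.

```python
from collections import OrderedDict


class BoundedCatalogCache:
    """Hypothetical size-limited cache with LRU replacement (illustration only)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # least recently used first, most recent last

    def lookup(self, key, load):
        """Return the cached value, calling load(key) on a miss."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = load(key)
        self.entries[key] = value
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value
```

With a single shared instance the limit applies once, rather than once
per backend.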

> So we can assume that the catalog and relation caches should always fit
> in memory (otherwise significant rewriting of all Postgres code working
> with relations will be needed).

I'm not sure that is true, but it is likely to be.

> But Postgres also has temporary tables. For them we may need a local
> backend cache in any case.
> The global temp table patch was not approved, so we still have to deal
> with these awful temp tables.
>
> In any case I do not understand why we need some expiration
> mechanism for these caches.

I don't think it is efficient for PostgreSQL to consume a large
amount of memory for seldom-used content. While we may not need an
expiration mechanism for moderate use cases, I have observed instances
where a single process hogs a significant amount of memory,
particularly for intermittent tasks.

> If there is some relation, then information about this relation should
> be kept in the cache as long as this relation is alive.
> If there is not enough memory to cache information about all
> relations, then we may need some replacement algorithm.
> But I do not think that there is any sense in removing an item from the
> cache just because it is too old.

Ah, I see. I am fine with a replacement mechanism. But the eviction
algorithm seems almost identical to the expiration algorithm. The
algorithm will not be driven simply by object age, but I'm not sure we
need more than access frequency.
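
As a rough illustration of why the two look almost identical to me,
the Python sketch below uses one access-frequency score for both.
Eviction drops the coldest entry when the cache is over its size limit;
expiration drops every entry whose score falls below a threshold.
Everything here is hypothetical (the names, the hits-per-elapsed-time
score, and passing "now" as a plain number), and it assumes
max_entries >= 1.

```python
class FrequencyCache:
    """Hypothetical cache where eviction and expiration share one score."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # assumed to be >= 1
        self.entries = {}  # key -> [value, hits, created_at]

    def _rate(self, key, now):
        # Access frequency: hits per unit of time since the entry was created.
        _, hits, created_at = self.entries[key]
        return hits / max(now - created_at, 1)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[1] += 1  # count the access
        return entry[0]

    def put(self, key, value, now):
        self.entries[key] = [value, 1, now]
        # Eviction: triggered by size, drops the coldest other entry.
        while len(self.entries) > self.max_entries:
            candidates = [k for k in self.entries if k != key]
            victim = min(candidates, key=lambda k: self._rate(k, now))
            del self.entries[victim]

    def expire(self, now, min_rate):
        # Expiration: same score, but triggered by a threshold, not by size.
        cold = [k for k in self.entries if self._rate(k, now) < min_rate]
        for k in cold:
            del self.entries[k]
        return cold
```

The only difference between the two paths is the trigger and the
stopping condition; the victim ranking is shared.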

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
