Re: Global temporary tables

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-08-08 02:40:29
Message-ID: CAMsr+YFahmjG0PaFPL-J7Zxa4cgt_PFtgSNV7nRNgCE=29cvcw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 6 Aug 2019 at 16:32, Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
wrote:

> New version of the patch with several fixes is attached.
> Many thanks to Roman Zharkov for testing.
>

FWIW I still don't understand your argument with regards to using
shared_buffers for temp tables having connection pooling benefits. Are you
assuming the presence of some other extension in your extended version of
PostgreSQL ? In community PostgreSQL a temp table's contents in one backend
will not be visible in another backend. So if your connection pooler in
transaction pooling mode runs txn 1 on backend 42 and it populates temp
table X, then the pooler runs the same app session's txn 2 on backend 45,
the contents of temp table X are not visible anymore.

Can you explain? Because AFAICS so long as temp table contents are
backend-private there's absolutely no point ever using shared buffers for
their contents.

Perhaps you mean that in a connection pooling case, each backend may land
up filling up temp buffers with contents from *multiple different temp
tables*? If so, sure, I get that, but the answer there seems to be to
improve eviction and memory accounting, not make backends waste precious
shared_buffers space on non-shareable data.

Anyhow, I strongly suggest you simplify the feature to add the basic global
temp table feature so the need to change pg_class, pg_attribute etc to use
temp tables is removed, but separate changes to temp table memory handling
etc into a follow-up patch. That'll make it smaller and easier to review
and merge too. The two changes are IMO logically quite separate anyway.

Come to think of it, I think connection poolers might benefit from an
extension to the DISCARD command, say "DISCARD TEMP_BUFFERS", which evicts
temp table buffers from memory *without* dropping the temp tables. If
they're currently in-memory tuplestores they'd be written out and evicted.
That way a connection pooler could "clean" the backend, at the cost of some
probably pretty cheap buffered writes to the system buffer cache. The
kernel might not even bother to write out the buffercache and it won't be
forced to do so by fsync, checkpoints, etc, nor will the writes go via WAL
so such evictions could be pretty cheap - and if not under lots of memory
pressure the backend could often read the temp table back in from system
buffer cache without disk I/O.

That's my suggestion for how to solve your pooler problem, assuming I've
understood it correctly.

Along these lines I suggest adding the following to DISCARD at some point,
obviously not as part of your patch:

* DISCARD TEMP_BUFFERS
* DISCARD SHARED_BUFFERS
* DISCARD TEMP_FILES
* DISCARD CATALOG_CACHE
* DISCARD HOLD_CURSORS
* DISCARD ADVISORY_LOCKS

where obviously DISCARD SHARED_BUFFERS would be superuser-only and evict
only clean buffers.

(Also, if we extend DISCARD lets also it to be written as DISCARD (LIST,
OF, THINGS, TO, DISCARD) so that we can make the syntax extensible for
plugins in future).

Thoughts?

Would DISCARD TEMP_BUFFERS meet your needs?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ryan Lambert 2019-08-08 02:48:59 Re: Built-in connection pooler
Previous Message Justin Pryzby 2019-08-08 01:24:36 Re: crash 11.5~ (and 11.4)