Re: Global temporary tables

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-08-13 05:34:37
Message-ID: CAMsr+YFh5N0TCtaMmUXZ85wJx1EkMwV+cjWWbQvMbtyJsc2ONg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 13 Aug 2019 at 00:47, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

> But Postgres is not storing this information now anywhere else except
>> statistic, isn't it?
>>
>
> not only - critical numbers are reltuples, relpages from pg_class
>

That's a very good point. relallvisible too. How's the global temp table
impl handling that right now, since you won't be changing the pg_class row?
AFAICS relpages doesn't need to be up to date (and reltuples certainly
doesn't) so presumably you're just leaving them as zero?

What happens right now if you ANALYZE or VACUUM ANALYZE a global temp
table? Is it just disallowed?

I'll need to check, but I wonder if periodically updating those fields in
pg_class impacts logical decoding too. Logical decoding must treat
transactions with catalog changes as special cases where it creates custom
snapshots and does other expensive additional work.
(See ReorderBufferXidSetCatalogChanges in reorderbuffer.c and its
callsites). We don't actually need to know relpages and reltuples during
logical decoding. It can probably ignore relfrozenxid and relminmxid
changes too, maybe even pg_statistic changes though I'd be less confident
about that one.

At some point I need to patch in a bunch of extra tracepoints and do some
perf tracing to see how often we do potentially unnecessary snapshot
related work in logical decoding.

> There was proposal to cache relation size, but it is not implemented yet.
>> If such cache exists, then we can use it to store local information about
>> global temporary tables.
>> So if 99% of users do not perform analyze for temporary tables, then them
>> will not be faced with this problem, right?
>>
>
> they use default statistics based on relpages. But for 1% of applications
> statistics are critical - almost always for OLAP applications.
>

Agreed. It's actually quite a common solution to user problem reports /
support queries about temp table performance: "Run ANALYZE. Consider
creating indexes too."

Which reminds me - if global temp tables do get added, it'll further
increase the desirability of exposing a feature to let users
disable+invalidate and then later reindex+enable indexes without icky
catalog hacking. So they can disable indexes for their local copy, load
data, re-enable indexes. That'd be "interesting" to implement for global
temp tables given that index state is part of the pg_index row associated
with an index rel though.

1. hold these data only in memory in special buffers
>>
>>
I don't see that working well for pg_statistic or anything else that holds
nontrivial user data though.

> 2. hold these data in global temporary tables - it is similar to normal
>> tables - we can use global temp tables for metadata like classic persistent
>> tables are used for metadata of classic persistent tables. Next syscache
>> can be enhanced to work with union of two system tables.
>>
>>
Very meta. Syscache and relcache are extremely performance critical but
could probably skip scans entirely on backends that haven't used any global
temp tables.

I don't know the relevant caches well enough to have a useful opinion here.

> I think that it not possible to assume that temporary data will aways fir
>> in memory.
>> So 1) seems to be not acceptable solution.
>>
>
It'd only be the metadata, but if it includes things like column histograms
and most frequent value data that'd still be undesirable to have pinned in
backend memory.

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuro Yamada 2019-08-13 05:40:58 Re: [HACKERS] CLUSTER command progress monitor
Previous Message Tatsuro Yamada 2019-08-13 05:33:34 Re: progress report for ANALYZE