Re: Global temporary tables

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Global temporary tables
Date: 2019-08-12 16:47:09
Message-ID: CAFj8pRAJ_fAZrwcp9FgKsS+mGk3g5t2iZnqpcDiPXp=O6DmLFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

po 12. 8. 2019 v 18:19 odesílatel Konstantin Knizhnik <
k(dot)knizhnik(at)postgrespro(dot)ru> napsal:

> Hi,
>
> On 11.08.2019 10:14, Pavel Stehule wrote:
>
>
> Hi
>
>
>> There is one more problem with global temporary tables for which I do not
>> know good solution now: collecting statistic.
>> As far as each backend has its own data, generally them may need
>> different query plans.
>> Right now if you perform "analyze table" in one backend, then it will
>> affect plans in all backends.
>> It can be considered not as bug, but as feature if we assume that
>> distribution if data in all backens is similar.
>> But if this assumption is not true, then it can be a problem.
>>
>
> Last point is probably the most difficult issue and I think about it
> years.
>
> I have a experience with my customers so 99% of usage temp tables is
> without statistics - just with good information only about rows. Only few
> customers know so manual ANALYZE is necessary for temp tables (when it is
> really necessary).
>
> Sharing meta data about global temporary tables can real problem -
> probably not about statistics, but surely about number of pages and number
> of rows.
>
>
> But Postgres is not storing this information now anywhere else except
> statistic, isn't it?
>

not only - critical numbers are reltuples, relpages from pg_class

There was proposal to cache relation size, but it is not implemented yet.
> If such cache exists, then we can use it to store local information about
> global temporary tables.
> So if 99% of users do not perform analyze for temporary tables, then them
> will not be faced with this problem, right?
>

they use default statistics based on relpages. But for 1% of applications
statistics are critical - almost always for OLAP applications.

>
>
> There are two requirements:
>
> a) we need some special meta data for any instance (per session) of global
> temporary table (row, pages, statistics, maybe multicolumn statistics, ...)
>
> b) we would not to use persistent global catalogue (against catalogue
> bloating)
>
> I see two possible solution:
>
> 1. hold these data only in memory in special buffers
>
> 2. hold these data in global temporary tables - it is similar to normal
> tables - we can use global temp tables for metadata like classic persistent
> tables are used for metadata of classic persistent tables. Next syscache
> can be enhanced to work with union of two system tables.
>
> I prefer @2 because changes can be implemented on deeper level.
>
> Sharing metadata for global temp tables (current state if I understand
> well) is good enough for develop stage, but It is hard to expect so it can
> work generally in production environment.
>
>
> I think that it not possible to assume that temporary data will aways fir
> in memory.
> So 1) seems to be not acceptable solution.
>

I spoke only about metadata. Data should be stored in temp buffers (and
possibly in temp files)

Pavel

>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message legrand legrand 2019-08-12 17:01:30 Re: [survey] New "Stable" QueryId based on normalized query text
Previous Message Anastasia Lubennikova 2019-08-12 16:42:55 Re: Removing unneeded downlink field from nbtree stack struct