Re: [Proposal] Global temporary tables

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 张广舟(明虚) <guangzhou(dot)zgz(at)alibaba-inc(dot)com>, 赵殿奎 <diankui(dot)zdk(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2019-11-02 08:56:09
Message-ID: CAOBaU_YG=54byWLUJopeokDJR_m+8GkMtN1WC=+iC2jeOLKmRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Nov 2, 2019 at 8:23 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
> so 2. 11. 2019 v 8:18 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com> napsal:
>>
>> On Sat, Nov 2, 2019 at 6:31 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> >
>> > pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> napsal:
>> >>
>> >> On 01.11.2019 18:26, Robert Haas wrote:
>> >> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
>> >> > <k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
>> >> >> It seems to me that I have found quite elegant solution for per-backend statistic for GTT: I just inserting it in backend's catalog cache, but not in pg_statistic table itself.
>> >> >> To do it I have to add InsertSysCache/InsertCatCache functions which insert pinned entry in the correspondent cache.
>> >> >> I wonder if there are some pitfalls of such approach?
>> >> > That sounds pretty hackish. You'd have to be very careful, for
>> >> > example, that if the tables were dropped or re-analyzed, all of the
>> >> > old entries got removed --
>> >>
>> >> I have checked it:
>> >> - when table is reanalyzed, then cache entries are replaced.
>> >> - when table is dropped, then cache entries are removed.
>> >>
>> >> > and then it would still fail if any code
>> >> > tried to access the statistics directly from the table, rather than
>> >> > via the caches. My assumption is that the statistics ought to be
>> >> > stored in some backend-private data structure designed for that
>> >> > purpose, and that the code that needs the data should be taught to
>> >> > look for it there when the table is a GTT.
>> >>
>> >> Yes, if you do "select * from pg_statistic" then you will not see
>> >> statistic for GTT in this case.
>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.
>> >
>> >
>> > Years ago, when I though about it, I wrote patch with similar design. It's working, but surely it's ugly.
>> >
>> > I have another idea. Can be pg_statistics view instead a table?
>> >
>> > Some like
>> >
>> > SELECT * FROM pg_catalog.pg_statistics_rel
>> > UNION ALL
>> > SELECT * FROM pg_catalog.pg_statistics_gtt();
>> >
>> > Internally - when stat cache is filled, then there can be used pg_statistics_rel and pg_statistics_gtt() directly. What I remember, there was not possibility to work with queries, only with just relations.
>>
>> It'd be a loss if you lose the ability to see the statistics, as there
>> are valid use cases where you need to see the stats, eg. understanding
>> why you don't get the plan you wanted. There's also at least one
>> extension [1] that allows you to backup and use restored statistics,
>> so there are definitely people interested in it.
>>
>> [1]: https://github.com/ossc-db/pg_dbms_stats
>
>
> I don't think - the extensions can use UNION and the content will be same as caches used by planner.

Yes, I agree that changing pg_statistics to be a view as you showed
would fix the problem. I was answering Konstantin's point:

>> >> But I do not think that it is so critical. I do not believe that anybody
>> >> is trying to manually interpret values in this table.
>> >> And optimizer is retrieving statistic through sys-cache mechanism and so
>> >> is able to build correct plan in this case.

which is IMHO a wrong assumption.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Adrien Nayrat 2019-11-02 09:23:49 Re: Adding percentile metrics to pg_stat_statements module
Previous Message Michael Paquier 2019-11-02 08:19:11 Re: update ALTER TABLE with ATTACH PARTITION lock mode (docs)