Re: [Proposal] Global temporary tables

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, 蔡松露(子嘉) <zijia(at)taobao(dot)com>, "Cai, Le" <le(dot)cai(at)alibaba-inc(dot)com>, 张广舟(明虚) <guangzhou(dot)zgz(at)alibaba-inc(dot)com>, 赵殿奎 <diankui(dot)zdk(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>
Subject: Re: [Proposal] Global temporary tables
Date: 2019-11-02 15:15:50
Message-ID: 2f124628-a15d-d423-e0c1-a84da16ae30b@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02.11.2019 8:30, Pavel Stehule wrote:
>
>
> pá 1. 11. 2019 v 17:09 odesílatel Konstantin Knizhnik
> <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>> napsal:
>
>
>
> On 01.11.2019 18:26, Robert Haas wrote:
> > On Fri, Nov 1, 2019 at 11:15 AM Konstantin Knizhnik
> > <k(dot)knizhnik(at)postgrespro(dot)ru <mailto:k(dot)knizhnik(at)postgrespro(dot)ru>>
> wrote:
> >> It seems to me that I have found quite elegant solution for
> per-backend statistic for GTT: I just inserting it in backend's
> catalog cache, but not in pg_statistic table itself.
> >> To do it I have to add InsertSysCache/InsertCatCache functions
> which insert pinned entry in the correspondent cache.
> >> I wonder if there are some pitfalls of such approach?
> > That sounds pretty hackish. You'd have to be very careful, for
> > example, that if the tables were dropped or re-analyzed, all of the
> > old entries got removed --
>
> I have checked it:
> - when table is reanalyzed, then cache entries are replaced.
> - when table is dropped, then cache entries are removed.
>
> > and then it would still fail if any code
> > tried to access the statistics directly from the table, rather than
> > via the caches. My assumption is that the statistics ought to be
> > stored in some backend-private data structure designed for that
> > purpose, and that the code that needs the data should be taught to
> > look for it there when the table is a GTT.
>
> Yes, if you do "select * from pg_statistic" then you will not see
> statistic for GTT in this case.
> But I do not think that it is so critical. I do not believe that
> anybody
> is trying to manually interpret values in this table.
> And optimizer is retrieving statistic through sys-cache mechanism
> and so
> is able to build correct plan in this case.
>
>
> Years ago, when I though about it, I wrote patch with similar design.
> It's working, but surely it's ugly.
>
> I have another idea. Can be pg_statistics view instead a table?
>
> Some like
>
> SELECT * FROM pg_catalog.pg_statistics_rel
> UNION ALL
> SELECT * FROM pg_catalog.pg_statistics_gtt();

And pg_catalog.pg_statistics_gtt() is set returning functions?
I afraid that it is not acceptable solution from performance point of
view: pg_statictic table is accessed by keys (<relid>,<attpos>,<inh>)
If it can not be done using index scan, then it can cause significant
performance slow down.

>
> Internally - when stat cache is filled, then there can be used
> pg_statistics_rel and pg_statistics_gtt() directly. What I remember,
> there was not possibility to work with queries, only with just relations.
>
> Or crazy idea - today we can implement own types of heaps. Is possible
> to create engine where result can be combination of some shared data
> and local data. So union will be implemented on heap level.
> This implementation can be simple, just scanning pages from shared
> buffers and from local buffers. For these tables we don't need complex
> metadata. It's crazy idea, and I think so union with table function
> should be best.

Frankly speaking, implementing special heap access method for
pg_statistic just to handle case of global temp tables seems to be overkill
from my point of view. It requires a lot coding (or at least copying a
lot of code from heapam). Also, as I wrote above, we need also index for
efficient lookup of statistic.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-11-02 15:26:59 Re: Remove configure --disable-float4-byval and --disable-float8-byval
Previous Message Konstantin Knizhnik 2019-11-02 15:09:42 Re: [Proposal] Global temporary tables