Re: Implementation of global temporary tables?

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Zhaomo Yang <zhy001(at)cs(dot)ucsd(dot)edu>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Implementation of global temporary tables?
Date: 2015-07-08 07:53:53
Message-ID: CAFj8pRDA6A3r5HbtMe_08rq+ypAVq=t0ZSwBZaF3MUHu=oAzMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2015-07-08 9:08 GMT+02:00 Zhaomo Yang <zhy001(at)cs(dot)ucsd(dot)edu>:

> > more global temp tables are little bit comfortable for developers,
> I'd like to emphasize this point. This feature does much more than saving
> a developer from issuing a CREATE TEMP TABLE statement in every session.
> Here are two common use cases and I'm sure there are more.
>
> (1)
> Imagine in a web application scenario, a developer wants to cache some
> session information in a temp table. What's more, he also wants to specify
> some rules which reference the session information. Without this feature,
> the rules will be removed at the end of every session since they depend on
> a temporary object. Global temp tables will allow the developer to define
> the temp table and the rules once.
>
> (2)
> The second case is mentioned by Tom Lane back in 2010 in a thread about
> global temp tables.
> (http://www.postgresql.org/message-id/9319.1272130283@sss.pgh.pa.us)
> "The context that I've seen it come up in is that people don't want to
> clutter their functions with
> create-it-if-it-doesn't-exist logic, which you have to have given the
> current behavior of temp tables."
>
> > 2.a - using on demand created temp tables - most simple solution, but
> > doesn't help with catalogue bloating
>
> I've read the thread and people disapprove this approach because of the
> potential catalog bloat. However, I'd like to champion it. Indeed, this
> approach may have a bloat issue. But for users who needs global temp
> tables, they now have to create a new temp table in every session, which
> means they already have the bloat problem and presumably they have some
> policy to deal with it. In other words, implementing global temp tables by
> this approach gives users the same performance, plus the convenience the
> feature brings.
>
> The root problem here is that whether "whether having the unoptimized
> feature is better than
> having no feature at all". Actually, there was a very similar discussion
> back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
> Lane's arguments here.
>
> Kevin Grittner's argument:
>
> http://www.postgresql.org/message-id/49F82AEA.EE98.0025.0@wicourts.gov
> "... If you're saying we can implement the standard's global temporary
> tables in a way that performs better than current temporary tables, that's
> cool. That would be a nice "bonus" in addition to the application
> programmer convenience and having another tick-mark on the standards
> compliance charts. Do you think that's feasible? If not, the feature
> would be useful to some with the same performance that temporary tables
> currently provide."
>
> Tom Lane's arguments:
>
> http://www.postgresql.org/message-id/24110.1241035178@sss.pgh.pa.us
> "I'm all for eliminating catalog overheads, if we can find a way to do
> that. I don't think that you get to veto implementation of the feature
> until we can find a way to optimize it better. The question is not about
> whether having the optimization would be better than not having it --- it's
> about whether having the unoptimized feature is better than having no
> feature at all (which means people have to implement the same behavior by
> hand, and they'll *still* not get the optimization)."
>
> There have been several threads here discussing global temp table since
> 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
> metadata of the session copy in the catalog. However, it seems that none of
> them has been implemented, or even has a feasible design. So why don't we
> implement it in a unoptimized way first?
>

I am not sure, if it is not useless work.

Now, I am thinking so best implementation of global temp tables is
enhancing unlogged tables to have local content. All local data can be
saved in session memory. Usually it is less than 2KB with statistic, and
you don't need to store it in catalogue. When anybody is working with any
table, related data are copied to system cache - and there can be injected
a implementation of global temp tables.

regards

Pavel Stehule

>
> > Is there still interest about this feature?
> I'm very interested in this feature. I'm thinking about one implementation
> which is similar to Pavel's 2009 proposal (
> http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com).
> Here are the major ideas of my design:
>
> (1)
> Creating the cross-session persistent schema as a regular table and
> creating session-private temp tables when a session first accesses it.
>
> (2)
> For DML queries, The global temp table is overloaded by its session copy
> after the relation is opened by an oid or a rangevar. For DDL queries,
> which copy is used depends on whether the query needs to access the data or
> metadata of the global temp table.
>
> There are more differences between this design and Pavel's 2009 proposal
> and I'd like to send a detailed proposal to the mailing list but first I
> want to know if our community would accept a global temp table
> implementation which provides the same performance as currently temp tables
> do.
>
> Thanks,
> Zhaomo
>
>
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-07-08 08:26:22 Re: Improving log capture of TAP tests with IPC::Run
Previous Message Amit Kapila 2015-07-08 07:52:46 Re: RFC: replace pg_stat_activity.waiting with something more descriptive