Re: Temporary tables under hot standby

From: Noah Misch <noah(at)leadboat(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Temporary tables under hot standby
Date: 2012-04-26 02:16:11
Message-ID: 20120426021611.GB8116@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 25, 2012 at 11:49:23AM -0400, Robert Haas wrote:
> On Tue, Apr 24, 2012 at 11:55 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> > Our temporary tables are cataloged and filled like permanent tables. ?This has
> > the major advantage of making most code operate on tables with minimal regard
> > for their relpersistence. ?It also brings disadvantages:
> >
> > 1. High catalog turnover in rapid create/drop workloads. ?Heavy temporary
> > ? table users often need to REINDEX relation-oriented catalogs. ?Hot standby
> > ? cannot assign OIDs or modify system catalogs at all.
> > 2. Consumption of permanent XIDs for DML on the table. ?This increases COMMIT
> > ? cost on the master and is a non-starter under hot standby.
> > 3. autovacuum cannot VACUUM temporary tables, but their relfrozenxid values
> > ? delay pg_clog truncation and can trigger a wraparound-prevention shutdown.
> > 4. sinval traffic from every CREATE TEMP TABLE et al.
> > 5. CREATE TEMP TABLE records AccessExclusiveLock in the global lock table.
> > 6. We don't automatically drop temporary tables that existed at the point of a
> > ? crash, because they look much like permanent tables.
>
> #6 is already fixed in 9.1. Temporary tables now have names like
> tBACKENDID_RELFILENODE, and are cleaned up during start-up in exactly
> the same way you're propose to do it further down.

Ah, so it is. That simplifies things a bit.

> Maybe this is a silly idea, but if you're thinking about creating a
> local XID space and a global XID space, it might be a good idea to
> also make allowance for an "unlogged" XID space - that is, an XID
> space that is global to all backends but need not survive crashes.
> This would potentially allow unlogged tables to be used in HS mode. I
> would expect that you probably don't want to go as far as actually
> trying to make this work as part of your current project, but maybe as
> you're structuring the code it would be worth considering the
> possibility that we'll eventually want >2 XID spaces, either for this
> or other reasons.

Agreed. Plenty of the details would change (located in shared memory,
locking, persisted on clean shutdown, etc.), so I'm not sure how much actual
code could remain in common. If I encounter design decisions where one choice
seems to help cover this other use in the future, I'll keep it in mind.

> With respect to problem #5, I've been wondering if we couldn't just
> forget about taking AccessExclusiveLock when first creating a table
> (temporary or permanent). Unless and until the transaction commits,
> nobody can see the catalog entry anyway, so nobody else is going to
> attempt to take a conflicting lock. So why bother? Maybe there's
> some reason here that's not obvious to me. If not, it might be worth
> doing on general principle independent of this project.

Sounds safe, offhand. I do suspect the cost of the lock is peanuts compared
to the cost of inserting catalog entries, though, so I wouldn't anticipate a
measurable improvement from that change in isolation.

> On that note, I had a thought in the pat that it might be possible to
> do solve problem #1 by using global temp tables as system catalogs -
> that is, for each type of system catalog that relates to table
> creation, you'd have a permanent catalog and a global temp catalog.
> So if someone wants to create a temporary table of the existing
> variety on the standby, you can make all the entries in the
> global-temp version of pg_class, pg_attribute, etc. However, this
> seems extremely difficult to manage in general - there's a lot of code
> churn involved, and also possible temporary -> permanent dependencies;
> for example, the temporary table might have a pg_attrdef entry that
> needs to depend on a non-temporary pg_proc entry. That's tricky to
> solve on the master and even trickier to solve in HS operation. So
> I'm inclined to agree with you that it makes more sense to just aim to
> support global temp tables in HS mode, and if we want to beat our head
> against the brick wall of making regular temp tables work there
> eventually, that can be a later project.

Agreed. I hadn't thought of that dependencies problem. Interesting.

> > Past discussions have raised the issue of interaction between commands like
> > ALTER TABLE and sessions using the new-variety temporary table. ?As a first
> > cut, let's keep this simple and have ongoing use of the table block operations
> > requiring AccessExclusiveLock. ?Note that you can always just make a new
> > temporary table with a different name to deploy a change quickly. ?Implement
> > this with a heavyweight lock having a novel life cycle. ?When a session first
> > takes an ordinary relation lock on the table, acquire the longer-term lock and
> > schedule it for release on transaction abort. ?On TRUNCATE, schedule a release
> > on transaction commit. ?Of course, also release the lock at session end.
>
> I'm not sure I believe this will work, but maybe I'm just not understanding it.

Did you have a specific doubt? I did gloss over all the details, having not
worked them out yet.

> > For new-variety temporary tables, change file names from "relfilenode[_fork]"
> > to "refilenode[_fork].pid.localnode". ?During crash recovery, delete all files
> > conforming to that pattern for refilenodes of known temporary tables. ?This
> > also lets backends share pg_class.relfilenode. ?The "localnode" field would
> > distinguish multiple generations of a table across VACUUM FULL, CLUSTER, and
> > TRUNCATE. ?We could build on this strategy to safely resolve point 6 for the
> > existing kind of temporary table, but that entails enough other details to
> > probably not mix it into the same patch.
>
> In lieu of including localnode in the filename, I would suggest that
> for a global temp table, we set relfilenode = 0, and let each backend
> assign one (and reassign new ones) whenever it feels like it.

Good call.

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2012-04-26 02:17:33 Re: psql omits row count under "\x auto"
Previous Message Noah Misch 2012-04-26 02:04:11 Re: Temporary tables under hot standby