Re: [PATCHES] Non-transactional pg_class, try 2

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PATCHES] Non-transactional pg_class, try 2
Date: 2006-06-26 20:54:07
Message-ID: 20060626205407.GA11926@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Simon Riggs wrote:
> On Mon, 2006-06-26 at 13:58 -0400, Alvaro Herrera wrote:

> > The relminxid Patch
> > ===================
> >
> > What I'm after is not freezing for read-only media, nor archive, nor
> > read-only tables.
>
> OK, but I am... but I'm happy to not to confuse the discussion.

Ok :-) I think I put a note about this but removed it while
restructuring the text so it would be clearer. The note is that while I
don't care about read-only stuff in this proposal, it may be that
read-only tables may come as a "side effect of implementing this. But I
agree we should not make the discussion more complex than it already is.

> > 2. Mark frozen databases specially somehow.
> > To mark databases frozen, we need a way to mark tables as frozen.
> > How do we do that? As I explain below, this allows some nice
> > optimizations, but it's a very tiny can full of a huge amount of
> > worms.
>
> At this stage you talk about databases, yet below we switch to
> discussing tables. Not sure why we switched from one to the other.

Sorry, I forgot one step. To mark a database frozen, we must make sure
that all tables within that database are frozen as well. So the first
step to freezing a database is freezing all its tables.

> > Marking a Table Frozen
> > ======================
> >
> > Marking a table frozen is simple as setting relminxid = FrozenXid for a
> > table. As explained above, this cannot be done in a regular postmaster
> > environment, because a concurrent transaction could be doing nasty stuff
> > to a table. So we can do it only in a standalone backend.
>
> Surely we just lock the table? No concurrent transactions?

No, because a transaction can have been started previously and yet not
hold any lock on the table, and write on the table after the vacuum
finishes. Or write on an earlier page of the table, after the vacuuming
already processed it. But here it comes one of the "nice points" below,
which was that if we acquire a suitable exclusive lock on the table, we
_can_ mark it frozen. Of course, this cannot be done by plain vacuum,
because we want the table to be still accesible by other transactions.
This is where VACUUM FREEZE comes in -- it does the same processing as
lazy vacuum, except that it locks the table exclusively and marks it
with FrozenXid.

> > Nice optimization: if we detect that a table is fully frozen, then
> > VACUUM is a no-op (not VACUUM FULL), because by definition there are no
> > tuples to remove.
>
> Yes please, but we don't need it anymore do we? Guess we need it for
> backwards compatibility? VACUUM still needs to vacuum every table.

Sorry, I don't understand what you mean here. We don't need what
anymore?

> > Another optimization: if we are sure that unfreezing works, we can even
> > mark a table as frozen in a postmaster environment, as long as we take
> > an ExclusiveLock on the table. Thus we know that the vacuum is the sole
> > transaction concurrently accessing the table; and if another transaction
> > comes about and writes something after we're finished, it'll correctly
> > unfreeze the table and all is well.
>
> Why not just have a command to FREEZE and UNFREEZE an object? It can
> hold an ExclusiveLock, avoiding all issues. Presumably FREEZE and
> UNFREEZE are rare commands?

Ok, if I'm following you here, your point is that FREEZE'ing a table
sets the relminxid to FrozenXid, and UNFREEZE removes that; and also, in
between, no one can write to the table?

This seems to make sense. However, I'm not very sure about the
FREEZE'ing operation, because we need to make sure the table is really
frozen. So we either scan it, or we make sure something else already
scanned it; to me what makes the most sense is having a VACUUM option
that would do the freezing (and a separate command to do the
unfreezing).

> > Where are the problems in this approach?
> >
> > 2. The current implementation puts the unfreezing in LockRelation. This
> > is a problem, because any user can cause a LockRelation on any table,
> > even if the user does not have access to that table.
>
> That last bit just sounds horrible to me. But thinking about it: how
> come any user can lock a relation they shouldn't even be allowed to know
> exists? Possibly OT.

Hmm, I guess there must be several commands that open the relation and
lock it, and then check permissions. I haven't checked the code but you
shouldn't check permissions before acquiring some kind of lock, and we
shouldn't be upgrading locks either.

> I can see other reasons for having pg_class_nt, so having table info
> cached in shared memory does make sense to me (yet not being part of the
> strict definitions of the relcache).

Yeah.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2006-06-26 20:59:07 Re: "Truncated" tuples for tuple hash tables
Previous Message Tom Lane 2006-06-26 20:49:58 Re: [COMMITTERS] pgsql: Change the row constructor syntax (ROW(...))

Browse pgsql-patches by date

  From Date Subject
Next Message Magnus Hagander 2006-06-26 21:01:34 Re: Further patch for VS2005
Previous Message Tom Lane 2006-06-26 20:47:11 Re: Further patch for VS2005