Re: Map forks (WIP)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Map forks (WIP)
Date: 2008-05-19 23:42:15
Message-ID: 11793.1211240535@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Heikki Linnakangas wrote:
>> I'm not completely satisfied with the way this looks, so I'll try a
>> slightly different approach next: Instead of having one SMgrRelation per
>> fork, add an extra ForkNumber argument to all the smgr functions.

> Here's an updated patch using the above approach. Looks much better now,
> IMO. The FSM implementation is still an incomplete mess, so don't bother
> looking into freespace.c or indexfsm.c, but I'm quite happy with the
> forks stuff now.

I took a quick look through this, and concur that this way is better
than one SMgrRelation per fork.

One thing I did *not* like was changing the FSM API to refer to Relation
rather than RelFileNode --- I don't believe that's a good idea at all.
In particular, consider what happens during TRUNCATE or CLUSTER: it's
not very clear how you'll tell the versions of the relation apart.
If you want to push the FSM API up to use SMgrRelation instead of
RelFileNode, that'd be okay, but not Relation. (Essentially the
distinction I'm trying to preserve here is logical vs physical
relation.)

The XLogOpenRelationWithFork stuff needs to be re-thought also,
as again this is blurring the question of what's a logical and
what's a physical relation --- and if forknum isn't part of the
relation ID, that API is wrong either way. I'm not sure about
a good solution in this area, but I wonder if the right answer
might be to make the XLOG replay stuff use SMgrRelations instead
of bogus Relations. IIRC the replay code design predates the
existence of SMgrRelation, so maybe we need a fresh look there.

(On closer look, XLogOpenRelationWithFork seems unused anyway
... maybe this is just a wart left over from the prior version
of the patch? But it seems to me that the replay code is still
doing the wrong thing in using Relation to refer to a physical
rather than logical rel.)

One really trivial thing that grated on me was

+ /*
+ * In a few places we need to loop through 0..MAX_FORKS to discover which
+ * forks exists, so we should try to keep this number small.
+ */
+ #define MAX_FORKS (FSM_FORKNUM + 1)

I think you should either call it MAX_FORK (equal to the last fork
number) or NUM_FORKS (equal to last fork number plus one). As is,
it's just confusing. And the comment is flat out wrong for the
current usage.

BTW, it would probably be a good idea to try to get the fork access
API committed before you work on FSM. Whenever you can break a
big patch into successive sections, it's a good idea, IMHO. I don't
think there's any doubt that we are going to go in this direction,
so I see no objection to committing fork-based API revisions in advance
of having any real use for them.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2008-05-20 00:22:39 Re: libpq object hooks (libpq events)
Previous Message Zoltan Boszormenyi 2008-05-19 19:22:32 Re: [HACKERS] WITH RECURSIVE patch V0.1