Re: Relation forks & FSM rewrite patches

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Relation forks & FSM rewrite patches
Date: 2008-07-12 18:27:57
Message-ID: 22640.1215887277@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Here's an updated version of the "relation forks" patch, and an
> incremental FSM rewrite patch on top of that. The relation forks patch
> is ready for review. The FSM implementation is more work-in-progress
> still, but I'd like to get some review on that as well, with the goal of
> doing more performance testing and committing it after the commit fest.

I looked through the "relation forks" patch, and think it's close to
committable, but I have a few minor gripes.

> The one part that I'm not totally satisfied in the relation forks patch
> is the smgrcreate() function. The question problem is: which piece of
> code decides which forks to create for a relation, and when to create
> them? I settled on the concept that all forks that a relation will need
> are created at once, in one smgrcreate() call. There's no function to
> create additional forks later on.

I think that's an extremely bad idea. You need look no further than
Zdenek's hopes of in-place-upgrade to see a need for adding a fork
to an existing relation; but even without that I can see possibly
wanting to not create a fork right away. I think smgrcreate() should
just create one fork as it's told (ie, same API you gave mdcreate).

> Currently, heap_create() decides which forks to create. That's fine at
> the moment, but will become problematic in the future, as it's
> indexam-specific which forks an index requires. That decision should
> really be done in the indexam. One possibility would be to only create
> the main fork in heap_create(), and let indexam create any additional
> forks it needs in ambuild function.

Problem goes away if you redo smgrcreate API as above.

Other nits:

relpath() is now in danger of buffer overrun: you need to increase
the palloc() sizes. Seems like a #define for the max number of digits
in a ForkNumber wouldn't be out of place (though I sure hope it never
gets past 1 ...).

Also, I strongly suggest *not* appending _0 to the name of the main fork's
file. This'd break on-disk compatibility and people's expectations,
for no particular reason.

Don't like the name NUM_FORKS; it seems to suggest that's the actual
number of forks in existence. MAX_NUM_FORKS would be better.

I think that setting InvalidForkNumber to -1 is unportable: there is
nothing compelling the compiler to represent enum ForkNumber as a signed
type, so the places where you assume a ForkNumber variable can hold
InvalidForkNumber all look like trouble. One solution is to include
InvalidForkNumber in the enum:

enum ForkNumber {
InvalidForkNumber = -1,
MAIN_FORKNUM = 0
};

This would be a bit annoying if someone wrote a switch statement listing
different possible fork numbers, as the compiler would complain if
there's no case for InvalidForkNumber; but I can't see a reason for code
like that to be common.

Another possibility is to set InvalidForkNumber = 0, MAIN_FORKNUM = 1;
which is still a bit of a type cheat if InvalidForkNumber isn't listed
in the enum, but one that's very unlikely to cause trouble. A bigger
problem is it would mess up a lot of your loops.

BTW, don't put a comma at end of enum ForkNumber, I think some compilers
will barf on that.

Last, don't forget to update the sgml docs chapter on database physical
storage.

regards, tom lane

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Jaime Casanova 2008-07-12 19:32:03 Re: Extending grant insert on tables to sequences
Previous Message Tom Lane 2008-07-12 17:32:13 Re: VACUUM Improvements - WIP Patch