Re: heap metapages

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap metapages
Date: 2012-05-21 18:37:01
Message-ID: CA+TgmoaTaW9+OD2V8caMQ21rKdSAVYfFDk8mOXj-wnfNjOAfOQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 21, 2012 at 2:22 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, May 21, 2012 at 12:56 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> At dinner on Friday night at PGCon, the end of the table that included
>> Tom Lane, Stephen Frost, and myself got to talking about the idea of
>> including some kind of metapage in every relation, including heap
>> relations.  At least some index relations already have something like
>> this (cf _bt_initmetapage, _hash_metapinit).  I believe that adding
>> this for all relations, including heaps, would allow us to make
>> improvements in several areas.
>
> The first thing that jumps to mind is: why can't the metapage be
> extended to span multiple pages if necessary?  I've often wondered why
> the visibility map isn't stored within the heap itself...

Well, the idea of a metapage, almost by definition, is that it stores
a small amount of information whose size is pretty much fixed and
which can be reasonably anticipated to always fit in one page. If
you're trying to store some data that can get bigger than that (or
even come close to filling that up), you need a different system.
I'm anticipating that the amount of relation metadata we need to store
will fit into a 512-byte sector with significant room left over,
leaving us with the rest of the block for whatever we'd like to use it
for (e.g. bits of the FSM or VM). If at some point in the future, we
need some kind of relation-level metadata that can grow beyond a
handful of bytes, we can either put it in its own fork, or store one
or more block pointers in the metapage indicating the blocks where
information is stored - but right now I'm not seeing the need for
anything that fancy.
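
To make the layout idea above concrete, here is a minimal sketch in C. It is an illustration only, not code from any patch: every name (SketchRelMeta, SketchMetaPage, the magic value, the individual fields) is hypothetical, and it ignores the standard page header that real PostgreSQL pages carry. It assumes the default 8192-byte block size and shows the fixed-size metadata confined to the first 512-byte sector, with the rest of the block left over for other uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ      8192         /* assumed block size (PostgreSQL default) */
#define SKETCH_SECTOR_SIZE 512          /* sector size mentioned in the text */
#define SKETCH_META_MAGIC  0x4D455441u  /* arbitrary made-up marker value */

/* Hypothetical fixed-size relation metadata; all fields are illustrative. */
typedef struct SketchRelMeta
{
    uint32_t magic;         /* identifies the block as a metapage */
    uint32_t version;       /* layout version of this struct */
    uint64_t flags;         /* placeholder for relation-level flag bits */
    uint8_t  reserved[64];  /* room for future fixed-size fields */
} SketchRelMeta;

/* The metadata must fit in one sector, with room to spare. */
_Static_assert(sizeof(SketchRelMeta) <= SKETCH_SECTOR_SIZE,
               "relation metadata must fit in one sector");

/* One block: the first sector holds metadata, the rest is spare space. */
typedef struct SketchMetaPage
{
    union
    {
        SketchRelMeta meta;
        uint8_t       sector[SKETCH_SECTOR_SIZE];
    } head;
    uint8_t spare[SKETCH_BLCKSZ - SKETCH_SECTOR_SIZE];  /* e.g. bits of FSM or VM */
} SketchMetaPage;

_Static_assert(sizeof(SketchMetaPage) == SKETCH_BLCKSZ,
               "metapage must be exactly one block");

/* Zero the whole block and stamp the metadata header. */
void
sketch_metapage_init(SketchMetaPage *page, uint32_t version)
{
    memset(page, 0, sizeof(*page));
    page->head.meta.magic = SKETCH_META_MAGIC;
    page->head.meta.version = version;
}

/* Check whether a block carries the made-up metapage marker. */
bool
sketch_metapage_is_valid(const SketchMetaPage *page)
{
    return page->head.meta.magic == SKETCH_META_MAGIC;
}

/* Bytes of the block left over after the metadata sector. */
size_t
sketch_metapage_spare_bytes(void)
{
    return sizeof(((SketchMetaPage *) 0)->spare);
}
```

The compile-time assertions capture the constraint in the paragraph above: if the metadata ever outgrew its sector, the build would fail, which is the signal that a separate fork or block pointers would be needed instead.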

Now, that having been said, I don't think there's any particular
reason why we couldn't multiplex all the relation forks onto a single
physical file if we were so inclined. The FSM and VM are small enough
that interleaving them with the actual data probably wouldn't slow
down seq scans materially. But on the other hand I am not sure that
we'd gain much by it in general. I see the value of doing it for
small relations: it saves inodes, potentially quite a lot of inodes if
you're on a system that uses schemas to implement multi-tenancy. But
it's not clear to me that it's worthwhile in general. Sticking all
the FSM stuff in its own relation may allow the OS to lay out those
pages physically closer to each other on disk, whereas interleaving
them with the data blocks would probably give up that advantage, and
it's not clear to me what we'd be getting in exchange.
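
For illustration, one way interleaving could work is sketched below in C. This is purely hypothetical and not a layout anyone in the thread has proposed: it assumes one map block (standing in for FSM or VM data) is placed in front of each fixed-size group of data blocks, and all names are made up. It shows the block-number arithmetic such a scheme would need, and why map blocks end up scattered among the data rather than adjacent to each other.

```c
#include <stdint.h>

/*
 * Hypothetical interleaved layout within a single physical file:
 *
 *   [map 0][data 0 .. N-1][map 1][data N .. 2N-1] ...
 *
 * where N is data_blocks_per_group.
 */
typedef struct SketchInterleave
{
    uint64_t data_blocks_per_group;  /* N; must be greater than zero */
} SketchInterleave;

/* Physical block number of a logical data (heap) block. */
uint64_t
sketch_data_to_physical(const SketchInterleave *il, uint64_t heap_blkno)
{
    uint64_t group = heap_blkno / il->data_blocks_per_group;

    /* One map block precedes every group up to and including this one. */
    return heap_blkno + group + 1;
}

/* Physical block number of the map block for a given group. */
uint64_t
sketch_map_to_physical(const SketchInterleave *il, uint64_t group)
{
    return group * (il->data_blocks_per_group + 1);
}

/* Total physical blocks needed to hold n_data logical data blocks. */
uint64_t
sketch_physical_blocks(const SketchInterleave *il, uint64_t n_data)
{
    uint64_t groups;

    if (n_data == 0)
        return 0;
    groups = (n_data - 1) / il->data_blocks_per_group + 1;  /* ceiling division */
    return n_data + groups;
}
```

With 4 data blocks per group, heap block 4 lands at physical block 6, and consecutive map blocks sit 5 physical blocks apart. That spacing is the trade-off described above: a single file and fewer inodes, at the cost of the map pages no longer being laid out next to each other.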

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
