Re: For the ametures. (related to "Are we losing

From: cbbrowne(at)cbbrowne(dot)com
To: Kevin Brown <kevin(at)sysexperts(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: For the ametures. (related to "Are we losing
Date: 2003-04-18 12:32:25
Message-ID: 20030418123225.4CAD6586AB@cbbrowne.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kevin Brown wrote:
> Tom Lane wrote:
> > Kevin Brown <kevin(at)sysexperts(dot)com> writes:
> > > It occurs to me that we could make it possible to get some of the
> > > performance gains MySQL gets through its naming conventions by
> > > including the type of object in the path of the object.
> >
> > "Performance gains"? Name one.
>
> Instead of tables and their indexes being on the same platter, you'd
> be able to put them on separate platters. Sounds like it would likely
> yield a performance gain to me...
>
> > We have been there and done that. I see no reason to go back.
>
> I'm not proposing that we return to calling the individual files (or
> the database they reside in) by name, only that we include a "type"
> identifier in the path so that objects of different types can be
> located on different spindles if the DBA so desires. As it is right
> now, tables and indexes are all stored in the same directory, and
> moving the indexes to a different spindle is an uncertain operation at
> best (you get to shut down the database in order to move any
> newly-created indexes, and dropping a moved index will not free the
> space occupied by the index as it'll only remove the symlink).

The thing is, this isn't necessarily particularly useful in managing the
partitioning of data across disks.

If I have, defined, /disk1, /disk2, /disk3, /disk4, and /disk5, it is highly
unlikely that my partitioning will be based on the notion of "put indices on
disk1, tables on disk2, and, well, skip the others."

I'm liable to want WAL separate from all the others, for a start, but then
look for what to put on different disks based on selecting particular tables
and indices as candidates.

Your observation about the dropping of a moved index is well taken; that would
point to the idea that the top level "thing" containing each table/index
perhaps should be a directory, with two interesting properties:

- By being a directory, and putting files in it, this allows extensions to be
more clearly tied to the table/index when a file grows towards the
not-uncommon 2GB barrier;

- In order for the linking to physical devices to be kept under control,
particularly if an index gets dropped and recreated, the postmaster needs to
be able to establish the links, suggesting an extension to syntax. At first
blush:

CREATE INDEX FROBOZZ_IDX LOCATION '/disk1/pgindices' on FROBOZZ(ID);

Supposing the OID number was 234231, the postmaster would then create the
symbolic link from $PGDATA/base/234231 to the freshly-created directory
/disk1/pgindices/234231, where the index would reside. (And if the directory
exists, there should be some complaint :-).)

I have made that up out of whole cloth; it _doesn't_ take into consideration
how you would specify the location of implicitly-created indices.

But it seems a useful approach that can be robust, and where it's even
plausible that the postmaster could cope with a request to shift a table or
index to another location. (Which would, quite naturally, put a lock on
access to the object for the duration of the operation.)
--
output = reverse("gro.gultn@" "enworbbc")
http://www.ntlug.org/~cbbrowne/
"The dinosaurs died because they didn't have a space program."
-- Arthur C Clarke

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message cbbrowne 2003-04-18 12:39:45 Re: pg_clog woes with 7.3.2 - Episode 2
Previous Message Kevin Brown 2003-04-18 09:58:19 Re: pg_clog woes with 7.3.2 - Episode 2