Re: Big 7.1 open items

From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jan Wieck <JanWieck(at)yahoo(dot)com>
Cc: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu>
Subject: Re: Big 7.1 open items
Date: 2000-06-16 17:50:23
Message-ID: 3.0.1.32.20000616105023.011dbdb0@mail.pacifier.com
Lists: pgsql-hackers pgsql-patches

At 11:46 AM 6/16/00 -0400, Tom Lane wrote:

>OK, to get back to the point here: so in Oracle, tables can't cross
>tablespace boundaries,

Right, the construct AFAIK is "create table/index foo on tablespace ..."

> but a tablespace itself could span multiple
>disks?

Right.

>Not sure if I like that better or worse than equating a tablespace
>with a directory (so, presumably, all the files within it live on
>one filesystem) and then trying to make tables able to span
>tablespaces. We will need to do one or the other though, if we want
>to have any significant improvement over the current state of affairs
>for large tables.

Oracle's way does a reasonable job of isolating the datamodel
from the details of the physical layout.

Take the OpenACS web toolkit, for instance. We could take
each module's tables and indices and assign them appropriately
to various tablespaces, then provide a separate .sql file with
only "create tablespace" statements in it.

By modifying that one central file, the toolkit installation
could be customized to run anything from a small site (one
disk with everything on it, a la my own personal webserver at
birdnotes.net) to a very large site with many spindles, with
the various index and table structures spread widely across
them.

Given that the OpenACS datamodel is nearly 10K lines long (including
many comments, of course), being able to customize an installation
to such a degree by modifying a single file filled with "create
tablespace" statements would be very attractive.
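
As a rough illustration of the single-central-file idea, here is a
minimal Python sketch that generates the tablespace statements from
one layout map. The tablespace names, the directories, and the
"create tablespace ... location" syntax are all assumptions made up
for the example, not an existing PostgreSQL feature or anything
taken from OpenACS.

```python
# Hypothetical layouts: logical tablespace name -> directory.
# A small site puts everything on one disk; a large site spreads out.
SMALL_SITE = {
    "acs_tables": "/data/pg",
    "acs_indices": "/data/pg",
}
LARGE_SITE = {
    "acs_tables": "/disk1/pg",
    "acs_indices": "/disk2/pg",
}


def tablespace_sql(layout):
    """Return one (hypothetical) create tablespace statement per entry."""
    return [
        f"create tablespace {name} location '{path}';"
        for name, path in sorted(layout.items())
    ]


if __name__ == "__main__":
    # Only this output changes between installations; the datamodel
    # files keep referring to the same logical tablespace names.
    print("\n".join(tablespace_sql(LARGE_SITE)))
```

The datamodel itself would only ever mention the logical names, so
switching from SMALL_SITE to LARGE_SITE touches nothing else.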

>One way is to play the flip-the-path-ordering game some more,
>and access multiple-segment tables with pathnames like this:
>
> .../TABLESPACE/RELATION -- first or only segment
> .../TABLESPACE/N/RELATION -- N'th extension segment
>
>This isn't any harder for md.c to deal with than what we do now,
>but by making the /N subdirectories be symlinks, the dbadmin could
>easily arrange for extension segments to go on different filesystems.

I personally dislike depending on symlinks to move stuff around.
Among other things, a pg_dump/restore (and presumably future
backup tools?) can't recreate the disk layout automatically.

>We'd still want to create some tools to help the dbadmin with slinging
>all these symlinks around, of course.

OK, if symlinks are simply an implementation detail hidden from the
dbadmin, and if the physical structure is kept in the db so it can
be rebuilt automatically if necessary, then I don't mind symlinks.
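
To make "rebuilt automatically" concrete, here is a small Python
sketch, assuming the db hands back a mapping from extension segment
number N to the directory that should hold the N'th segments. The
function name and the shape of that mapping are hypothetical; the
point is only that the links can be regenerated from stored data
rather than maintained by hand.

```python
import os


def rebuild_segment_links(tablespace_dir, segment_targets):
    """Recreate the N -> target symlinks under tablespace_dir.

    segment_targets maps an extension segment number N to the
    directory that should hold the N'th segments (an assumed,
    hypothetical representation of what the db would store).
    """
    os.makedirs(tablespace_dir, exist_ok=True)
    for n, target in sorted(segment_targets.items()):
        os.makedirs(target, exist_ok=True)
        link = os.path.join(tablespace_dir, str(n))
        if os.path.islink(link):
            os.remove(link)  # drop a stale link before recreating it
        # Raises FileExistsError if a real file or directory is in the way.
        os.symlink(target, link)
```

Running it twice with the same mapping leaves the same layout, so a
restore tool could call it unconditionally.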

> But I think it's critical to keep
>the low-level file access protocol simple and reliable, which really
>means minimizing the amount of information the backend needs to know to
>figure out which file to write a page in. With something like the above
>you only need to know the tablespace name (or more likely OID), the
>relation OID (+name or not, depending on outcome of other argument),
>and the offset in the table. No worse than now from the software's
>point of view.

Make the code that creates and otherwise manipulates tablespaces
do the work, while keeping the low-level file access protocol simple.

Yes, this approach sounds very good to me.
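
For illustration, the lookup described above might be sketched in
Python like this. The 1 GB segment size, the function name, and the
use of a byte offset are assumptions for the example; the real code
would live in md.c and is not shown here.

```python
import posixpath

SEGMENT_BYTES = 1 << 30  # assumed size of one segment file (1 GB)


def segment_path(base_dir, tablespace, relation, offset):
    """Map a byte offset in a relation to (file path, offset in file).

    Follows the proposed layout:
      base_dir/TABLESPACE/RELATION    first or only segment
      base_dir/TABLESPACE/N/RELATION  N'th extension segment
    """
    seg, within = divmod(offset, SEGMENT_BYTES)
    if seg == 0:
        path = posixpath.join(base_dir, tablespace, relation)
    else:
        path = posixpath.join(base_dir, tablespace, str(seg), relation)
    return path, within
```

Nothing here needs to know whether the N directory is a symlink or
where it points, which is what keeps the low-level protocol simple.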

- Don Baccus, Portland OR <dhogaza(at)pacifier(dot)com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.
