Tablespaces

From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Tablespaces
Date: 2004-02-26 10:07:49
Message-ID: Pine.LNX.4.58.0402261942440.16702@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-hackers-win32

Hi all,

I've been looking at implementing table spaces for 7.5. Some notes and
implementation details follow.

------

Type of table space:

There are many different table space implementations in relational
database management systems. In my implementation, a table space in
PostgreSQL will be the location of a directory on the file system in
which files backing database objects can be stored. Global tables and
non 'data' objects (WAL, CLOG, config files) will all remain in $PGDATA.
$PGDATA/base will be the default table space.

A given table space will be identified by a unique table space name. I
haven't decided if 'unique' should mean database-wide unique or
cross-database unique. It seems to me that we might run into problems
with CREATE DATABASE ... TEMPLATE = <database with table spaces> if the
uniqueness of table spaces is limited to the database level.

A table space parameter will be added to DDL commands which create
physical database objects (CREATE DATABASE/INDEX/TABLE/SEQUENCE) and to
CREATE SCHEMA. The associated routines, as well as the corresponding DROP
commands will need to be updated. Adding the ability to ALTER <object>
TABLESPACE <name> seems a little painful. Would people use it? Comments?

When an object is created the system will resolve the table space the
object is stored in as follows: if the table space paramater is passed to
the DDL command, then the object is stored in that table space (given
validation of the table space, etc). If it is not passed, the object
inherits its "parent's" table space where the parent/child hierarchy is as
follows: database > schema > table > [index|sequence]. So, if you issued:

create table foo.bar (...);

We would first not that there is no TABLESPACE <name>, then cascade to
the table space for the schema 'foo' (and possibly cascade to the table
space for the database). A database which wasn't created with an explicit
table space will be created under the default table space. This ensures
backward compatibility.

Creating a table space:

A table space is a directory structure. The directory structure is as
follows:

[swm(at)dev /path/to/tblspc]$ ls
OID1/ OID2/

OID1 and OID2 are the OIDs of databases which have created a table space
against this file system location. In this respect, a table space
resembles $PGDATA/base. I thought it useful to keep this kind of
namespace mechanism in place so that administrators do not need to create
hierarchies of names on different partitions if they want multiple
databases to use the same partition.

The actual creation of the table space will be done with:

CREATE TABLE SPACE <name> LOCATION </path/to/tblspc>;

Before creating the table space we must:

1) Check if the directory exists. If it does, create a sub directory as
the OID of the current database.

2) Alternatively, if the directory doesn't exist, attempt to create it,
then the sub directory.

I wonder if a file, such as PG_TBLSPC, should be added to the table space
directory so that, in the case of an existing non-empty directory, we can
attempt to test if the directory is being used for something else and
error out. Seems like:

CREATE TABLESPACE tbl1 LOCATION '/var/'

which will result in something like '/var/123443' is a bad idea. Then
again, the user should know better. Comments?

If everything goes well, we add an entry to pg_tablespace with the table
space location and name (and and OID).

Tying it all together:

The catalogs pg_database, pg_namespace, and pg_class will have a 'tblspc'
field. This will be the OID of the table space the object resides in, or 0
(default table space). Since we can then resolve relid/relname, schema and
database to a tablespace, there aren't too many cases when extra logic
needs to be added to the IO framework. In fact, most of it is taken care
of because of the abstraction of relpath().

The creation of table spaces will need to be recorded in xlog in the same
way that files are in heap_create() with the corresponding delete logic
incase of ABORT.

Postmaster startup:

Ideally, the postmaster at startup should go into each tblspc/databaseoid
directory and check for a postmaster.pid file to see if some other
instance is touching the files we're interested in. This will require a
control file listing tblspc/databaseoid paths and it will need to plug
into WAL in case we die during CREATE TABLESPACE. Comments?

Creating a database

I think that createdb() is going to have to be reworked if pg_tablespace
isn't shared (ie, tablespaces are only database unique). The reason being
that if we create a database which has a table space, pg_tablespace in the
new database will have to be updated and that cannot be done atomically
with the `cp` based mechanism we currently use.

I think I'm going to have to get my hands dirty before I can tell the
extent to which createdb() will need reworking.

pg_dump

Obviously pg_dump will need to be able to dump table spaces. pg_dump
running against <7.5 will DDL commands without a table space parameter and
as such the database's physical layout, when loaded into 7.5, will be the
same as for <7.5.

---

Comments? Questions? Suggestions?

Thanks,

Gavin

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Shridhar Daithankar 2004-02-26 10:38:50 Re: Tablespaces
Previous Message Honza Pazdziora 2004-02-26 08:39:39 Arbitrary collation support for PostgreSQL

Browse pgsql-hackers-win32 by date

  From Date Subject
Next Message Shridhar Daithankar 2004-02-26 10:38:50 Re: Tablespaces
Previous Message Bruce Momjian 2004-02-25 20:01:30 win32 patch