Data archiving/warehousing idea

From: Chris Dunlop <chris(at)onthe(dot)net(dot)au>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Data archiving/warehousing idea
Date: 2007-02-01 02:01:34
Message-ID: 20070201020134.GA7439@onthe.net.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

G'day hackers,

I had some hand-wavy thoughts about some potential gains for
postgres in the data archiving/warehousing area. I'm not able
to do any work myself on this, and don't actually have a
pressing need for it so I'm not "requesting" someone do it, but
I thought it might be worth discussing (if it hasn't been
already - I couldn't find anything in the mail archives, but
that doesn't mean it's not there...)

The main idea is that, there might be space utilisation and
performance advantages if postgres had "hard" read-only tables,
i.e. tables which were guaranteed (by postgres) to never have
their data changed (insert/update/delete).

This could potentially save disk space by allowing "book
keeping" elements in the page/tuple headers to be removed, e.g.
visibility information etc. Also, some indexes could
potentially be packed tighter if we know the data will never
change (of course this is already available using the fillfactor
control).

There could be performance advantages from areas like:

* more efficient disk buffering due to reduced disk space
requirements per above.

* no need to visit tuple store for visibility info
during index scan

* greatly reduced or even completely removed locking. If the
table is guaranteed read-only, there's no need to lock?

* Planner optimisation? E.g. changing the cost of
index and sequential scans for the table due to the previous
points, and there might be table stats which would be very
useful to the planner but which are too expensive to
maintain for changing data.

The idea would be to introduce a statement something like:

ALTER TABLE foo SET ARCHIVE;

This would tell postgres to rewrite the on-disk table to the
"read only" format, rewrite the indexes for maximum packing
and collect stats for the planner etc.

Thoughts? Brickbats?

Cheers,

Chris.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Koichi Suzuki 2007-02-01 02:07:27 Full page writes improvement
Previous Message Florian G. Pflug 2007-02-01 01:46:00 Re: [GENERAL] 8.2.1 Compiling Error