Skip site navigation (1) Skip section navigation (2)

Re: A more general approach (Re: Dataarchiving/warehousing idea)

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
Cc: Chris Dunlop <chris(at)onthe(dot)net(dot)au>, pgsql-hackers(at)postgresql(dot)org,Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: A more general approach (Re: Dataarchiving/warehousing idea)
Date: 2007-02-01 12:51:51
Message-ID: 1170334311.3226.12.camel@localhost.localdomain (view raw or flat)
Thread:
Lists: pgsql-hackers
Ühel kenal päeval, N, 2007-02-01 kell 14:38, kirjutas Hannu Krosing:
> Ühel kenal päeval, N, 2007-02-01 kell 13:24, kirjutas Gavin Sherry:
> 
> > A different approach discussed earlier involves greatly restricting the
> > way in which the table is used. This table could only be written to if an
> > exclusive lock is held; on error or ABORT, the table is truncated.
> > 
> > The problem is that a lot of this looks like a hack and I haven't seen a
> > very clean approach which has gone beyond basic brain dump.
> 
> A more radical variation of the "restricted-use archive table" approach
> is storing all tuple visibility info in a separate file.
> 
> At first it seems to just add overhead, but for lots (most ? ) usecases
> the separately stored visibility should be highly compressible, so for
> example for bulk-loaded tables you could end up with one bit per page
> saying that all tuples on this page are visible.
> 
> Also this could be used to speed up vacuums, as only the visibility
> table needs to be scanned duting phase 1 of vacuum, and so tables with
> localised/moving hotspots can be vacuumed withoutd scanning lots of
> static data.
> 
> Also, storing the whole visibility info, but in a separate heap, lifts
> all restrictions of the "restricted-use archive table" variant. 
> 
> And the compression of visibility info (mostly replacing per-tuple info
> with per-page info) can be carried out by a separate vacuum-like
> process.
> 
> And it has many of the benefits of static/RO tables, like space saving
> and index-only queries. Index-only will of course need to get the
> visibility info from visibility heap, but if it is mostly heavily
> compressed, it will be a lot cheaper than random access to data heap.

For tables with fixed-width tuples it can probably be extended to
support vertical fragmentation as well, to get DWH benefits similar to
http://monetdb.cwi.nl/ .

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




In response to

pgsql-hackers by date

Next:From: Simon RiggsDate: 2007-02-01 13:09:16
Subject: Re: A more general approach (Re: Dataarchiving/warehousing idea)
Previous:From: Hannu KrosingDate: 2007-02-01 12:38:14
Subject: A more general approach (Re: Data archiving/warehousingidea)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group