Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] CLUSTER command

From: Jean-Luc Lachance <jllachan(at)nsd(dot)ca>
To: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: johnnnnnn <john(at)phaedrusdeinus(dot)org>,pgsql-performance(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: [PERFORM] CLUSTER command
Date: 2002-12-13 16:42:25
Message-ID: 3DFA0DF1.DE3B5462@nsd.ca (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-interfacespgsql-performance
Stephan,

Someone commented earlier about the separation/abstraction of the
storage manager.
I agree that it should not be done at the storage level.

Maybe a better idea, would be to create a new pg_partition table that
would have the functionality of an index on the key field and also be
used to point to a file/table ID.

That would be alot more work to code on thet planner though.

If a newly inherited table could also inherite the constraints and
indecies of its parent maybe things would be easier.

JLL


Stephan Szabo wrote:
> 
> On Thu, 12 Dec 2002, johnnnnnn wrote:
> 
> > On Thu, Dec 12, 2002 at 05:39:44PM -0500, Jean-Luc Lachance wrote:
> > > Let's create a new command:
> > >
> > > PARTITION <table> ON <attribute>
> > <snip>
> > > Because the table file is already extended (2G limit) using
> > > different files extension (.N)
> > > how complicated (modifying the code) would it be to have the table
> > > files split according to the cluster key?
> >
> 
> > I think the code changes would be complicated. Just at a 30-second
> > consideration, this would need to touch:
> > - all sql (selects, inserts, updates, deletes)
> > - vacuuming
> > - indexing
> > - statistics gathering
> > - existing clustering
> 
> I think his idea was to treat it similarly to the way that the
> system treats tables >2G with .N files.  The only thing is that
> I believe the code that deals with that wouldn't be particularly
> easy to change to do it though, but I've only taken a cursory look at
> what I think is the place that does that(storage/smgr/md.c). Some sort of
> good partitioning system would be nice though.
> 
> > create table u1 (...);
> > create table u2 (...);
> > create table u3 (...);
> >
> > create view uv as (select "A" as partition_key, ... from u1
> >                    union all
> >                    select "B" as partition_key, ... from u2
> >                    union all
> >                    select "C" as partition_key, ... from u3);
> >
> > That keeps the tables in different files on-disk while still allowing
> > you to query against all of them. You need to index them separately
> > and logic is necessary when changing data.
> 
> Unfortunately, I think that the optimizer isn't going to do what you'd
> hope here and scan only the appropriate table if you were to say
> partition_key='A' and foo='bar'.  I'd love to be shown that I'm wrong, but
> the best I could see hoping for would be that if partition_key was part of
> u1-u3 and there was an index on partition_key,foo that it could use that
> and do minimal work on the other tables.
> 
> In addition, doing something like the above is a nightmare if you don't
> know beforehand what the partitions should be (for example if you know
> there aren't alot of distinct values, but you don't know what they are) or
> for that matter even with 10-15 partitions, writing the rules and such
> would probably be really error prone.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

In response to

pgsql-performance by date

Next:From: Hannu KrosingDate: 2002-12-13 17:00:32
Subject: Re: automated index suggestor -- request for comment
Previous:From: johnnnnnnDate: 2002-12-13 15:37:19
Subject: Re: automated index suggestor -- request for comment

pgsql-interfaces by date

Next:From: Bruce MomjianDate: 2002-12-14 22:58:30
Subject: Re: [GENERAL] PerformPortalClose warning in 7.3
Previous:From: Stephan SzaboDate: 2002-12-13 02:11:50
Subject: Re: [PERFORM] CLUSTER command

pgsql-general by date

Next:From: Thomas BeutinDate: 2002-12-13 16:46:25
Subject: Re: pg_hba.conf authorization question
Previous:From: Stephan SzaboDate: 2002-12-13 16:41:57
Subject: Re: Copy/foreign key contraints

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group