Re: Compression and on-disk sorting

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Compression and on-disk sorting
Date: 2006-05-18 16:07:05
Message-ID: 200605181607.k4IG75Y09778@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Uh, TODO already has:

o %Add a GUC variable to control the tablespace for temporary objects
and sort files

It could start with a random tablespace from a supplied list and
cycle through the list.

Do we need to add to this?

---------------------------------------------------------------------------

Greg Stark wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>
> > Which means we need all the interface bits to be able to tell PostgreSQL
> > where every single temp storage area is. Presumably much of the
> > tablespace mechanism could be used for this, but it's still a bunch of
> > work. And you can't just say "I have 8 spindles", you have to tell
> > PostgreSQL exactly where to put each temporary area (unless you just
> > have it put one on every tablespace you have defined).
>
> Yes, if you have more than one temporary area you definitely need a way to
> tell Postgres where to put them since obviously they won't be in the postgres
> base directory. But I think that's all Postgres really needs to know.
>
> One could imagine a more complex version where Postgres has meta information
> about the bandwidth and seek penalty for each sort area separately. Presumably
> also for each table space. But that's a whole lot more complexity than
> Postgres's current cost model.
>
>
> > > that it should strive to maximize sequential reads within one temp area and
> > > expect switching between temp areas (which represent multiple spindles) to be
> > > better than multiplexing multiple tapes within a single temp area (which
> > > represents a single spindle).
> >
> > Which adds yet more complexity to all the code that uses the temp area.
> > And as others have brought up, you still have to allow for the case when
> > splitting all of this out into multiple files means you end up using
> > substantially more disk space. That further drives up the complexity.
>
> You also have to consider that it won't always be a benefit to spread the sort
> over multiple sort areas. If there's only one sort going on and you can reach
> a 1:1 ratio between tapes and spindles then I think it would be a huge
> benefit. Effectively boosting the sort speed by random_page_cost.
>
> But if you don't have as many spindles as your algorithm needs tapes
> then it's unclear which to multiplex down and whether you gain any benefit
> once you're multiplexing over simply using a single sort area.
>
> And worse, if there are multiple sorts going on in the system then you're not
> going to get sequential access even if you have multiple sort areas available.
> If you have N sort areas and N sorts are going on then you're probably better
> off multiplexing each one down to a single sort area and letting them each
> proceed without interfering with each other rather than having each one hog
> all the sort areas and forcing the OS to do the multiplexing blindly.
>
> > My point is that unless someone shows that there's a non-trivial
> > performance gain here, it's not going to happen.
>
> I think two extreme cases are well worth pursuing:
>
> 1) Use n sort areas for n tapes making everything purely sequential access.
> That would be useful for large DSS systems where large sorts are running
> and i/o bandwidth is high for sequential access. That gives effectively a
> random_page_cost speed boost.
>
> 2) Use the current algorithm unchanged but have each sort use a different sort
> area in some sort of round-robin fashion. That helps the OLTP type
> environment (ignoring for the moment that OLTP environments really ought
> not be doing disk sorts) where people complain about unreliable execution
> times more than slow execution times. If you can provide enough sort areas
> then it would remove one big reason other queries concurrent impact the
> execution time of queries.
>
> --
> greg
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

--
Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Hallgren 2006-05-18 16:07:18 Re: Question about casts
Previous Message Robert Treat 2006-05-18 16:01:24 Re: Google and the Beta Freeze

Browse pgsql-patches by date

  From Date Subject
Next Message Jim C. Nasby 2006-05-18 16:18:51 Re: Compression and on-disk sorting
Previous Message Martijn van Oosterhout 2006-05-18 11:42:01 Re: [PATCH] Compression and on-disk sorting