Re: Multi-pass planner

From: Mischa Sandberg <mischa(dot)sandberg(at)sophos(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-pass planner
Date: 2009-08-20 22:11:16
Message-ID: CB0F73E165CFFA4496D12161D835562CF876ABC3@US-COL-EXCHMBX1.green.sophos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In a federated database engine I built in the mid-90's,
it more or less ran both plans in parallel, to implement fast-first and min-total cost.
The index query in general started returning rows whose oids went into a filter
that discarded them from the serial query once it started to crank things out.

'index' and 'serial' is not exactly what was going on;
the federated engine was joining multiple tables across multiple (sometimes hundreds)
of databases, with really variable transmission times, and tolerance for timed-out db servers.
It had no reliable way to get cost estimates in advance,
since it didn't have access to statistical metadata (no, Postgres wasn't one
of the databases I had to take as input, more's the pity).

'Parallel' is also not quite accurate. It retained plans (queries were heavily repeated)
and did mickey-mouse simulated annealing of past performance, so that if one of two plans
was the faster 90% of the time, then there was a 10% probability it would run both plans in parallel,
just to check whether something had changed.

The code was part of a proprietary product, and was used by Acxiom and Cisco.
But the concept was pretty simple and as described above.

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Greg Stark
> Sent: Thursday, August 20, 2009 10:32 AM
> To: Robert Haas
> Cc: Kevin Grittner; decibel; Pg Hackers
> Subject: Re: [HACKERS] Multi-pass planner
>
> On Thu, Aug 20, 2009 at 6:28 PM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
> > I don't think it's a bad idea, I just think you have to set your
> > expectations pretty low. If the estimates are bad there
> isn't really
> > any plan that will be guaranteed to run quickly.
>
> Actually this is usually Tom's point when this topic comes
> up. Say you're deciding between an index scan and a
> sequential scan. The sequential scan has a total cost of
> 1000..1000 but the index scan has an estimated total cost of
> 1..10000. If you pick the sequential scan you might be
> running 1000x slower than the index scan in the worst case.
> But if you pick the index scan you might be running 10x
> slower than the sequential scan in the worst case. If you
> don't trust the estimate where does that leave you? Making a
> mistake either way is fatal.
>
>
> --
> greg
> http://mit.edu/~gsstark/resume.pdf
>
> --
> Sent via pgsql-hackers mailing list
> (pgsql-hackers(at)postgresql(dot)org) To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Brendan Jurd 2009-08-21 01:14:21 Re: WIP: generalized index constraints
Previous Message Josh Berkus 2009-08-20 21:44:20 Re: converting between netmask formats