Re: [SQL] 7.4 - FK constraint performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [SQL] 7.4 - FK constraint performance
Date: 2004-02-18 17:27:29
Message-ID: 24413.1077125249@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com> writes:
> I did this, and changed the foreign keys to use it, but I haven't managed
> to build a fk case where I could actually detect a change in the plan
> chosen. Since the queries are only a simple scan on the one table I'm
> wondering if it's basically just modifying both costs by the same value
> which means there's no real effect at all.

After fooling with this, I think you are right. The planner is modeling
both cases as a linear slope from zero to estimated-total-cost, and
since the total number of tuples to be returned is the same, taking a
percentage won't change the outcome.

The only way we could improve the situation would be to somehow instruct
the planner that even though we don't know the parameter value to be
used when we are planning, it should expect that that value is *not* in
the table, rather than expecting that it has a distribution similar to
what is in the table. Seems pretty messy.

I have occasionally speculated about postponing planning of
parameterized queries until they are first executed, and then using the
actual parameter values supplied in that first execution for purposes of
estimating costs and selectivity. That would work pretty nicely in this
particular case, but in the real world I think it'd be mighty dangerous;
you could end up optimizing for an outlier case that isn't
representative of the queries you'll see later.

Another interesting line of thought is to let the user supply
representative values to be used for planning purposes. In PREPARE,
you could imagine saying something like

PREPARE myquery (int = 42, text = 'http://www...') AS
SELECT ... where url like $2 ...

and then using the sample value 'http://www...' for purposes of
estimating the LIKE result. Then it'd be on the user's head to pick
good representatives --- but he could make sure that they really were
representative, and not have to worry about luck of the draw from the
first live query.

I'm not sure if we could use such a feature automatically to bias FK
queries in the right direction, but it's something to think about.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2004-02-18 18:06:27 Re: NO WAIT ...
Previous Message Hans-Jürgen Schönig 2004-02-18 17:14:15 NO WAIT ...

Browse pgsql-sql by date

  From Date Subject
Next Message Jeremy Smith 2004-02-18 18:06:32 Re: CHAR(n) always trims trailing spaces in 7.4
Previous Message Richard Huxton 2004-02-18 16:08:22 Re: bytea or blobs?