Re: suggestion to improve planer

From: Ľubomír Varga <luvar(at)plaintext(dot)sk>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: suggestion to improve planer
Date: 2009-09-09 17:34:49
Message-ID: 200909091934.49646.luvar@plaintext.sk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday 09 September 2009 14:11:41 Peter Eisentraut wrote:
> On Thu, 2009-09-03 at 10:35 +0200, Ľubomír Varga wrote:
> > Hi.
> >
> > I hope, that this is right mailing list.
> >
> > SELECT date, value FROM t_event
> > WHERE t_event.id in (SELECT id FROM t_event
> > WHERE date < '2009-08-25'
> > ORDER BY date DESC LIMIT 1)
> > ORDER BY date;
> > cost 6.4
> >
> > SELECT date, value FROM t_event
> > WHERE t_event.id = (SELECT id FROM t_event
> > WHERE date < '2009-08-25'
> > ORDER BY date DESC LIMIT 1)
> > ORDER BY date;
> > cost 6.36..6.37
> >
> >
> > Why that two query dont have equal cost? If it is not problem, try add
> > some planer code to recognize that sublesect HAVE TO return just one row
> > (limit 1) and in plan could be used filter/index scan instead of hash
> > aggregate.
>
> Well, there is always a tradeoff between more planner analysis and more
> complicated and slow planning. Seeing that the cost estimates are close
> enough for practical purposes, it doesn't seem worthwhile to fix
> anything here.
>
> > I have
> > also some complex query examples where cost difference is more visible.
>
> Having real examples where a change might actually improve runtime is
> always more interesting than an academic exercise like the above.

Oka, real world example is attached as txt file. There are 3x2 queries and its
costs on my system/database. System/database is also real, and production.
First set for short time interval and second for long time interval.

Main poblem is, that I want to generate some statistical report and I dont
know for how many "devices" Iam going to generate it for. So if I make some
program, there will be something like:
..."t_device.imei in (" + this.getDevicesImeis() + ")"...
If i have only one device, there could be "=" instead of "in".

PS: Iam very confused about actual costs. I dont have exact query on which I
have seen that "in" for one device in array, is much worse than "=" for one
device. Actual costs are somehow different and "=" select gives about ten
time worse cost.

So probably just ignore my mail and keep planner as is.
My version of postgre: "PostgreSQL 8.2.6 on x86_64-pc-linux-gnu, compiled by
GCC x86_64-pc-linux-gnu-gcc (GCC) 4.1.1 (Gentoo 4.1.1-r3)"

--
Odborník na všetko je zlý odborník. Ja sa snažím byť výnimkou potvrdzujúcou
pravidlo.

Attachment Content-Type Size
realQueries.txt text/plain 5.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2009-09-09 17:39:14 Re: RfD: more powerful "any" types
Previous Message Tom Lane 2009-09-09 17:29:18 Re: RfD: more powerful "any" types