Re: strange query behavior

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tim Jones" <TJones(at)optio(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: strange query behavior
Date: 2006-12-13 23:24:55
Message-ID: 12988.1166052295@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Tim Jones" <TJones(at)optio(dot)com> writes:
> [ explain results ]

As best I can see, the problem is with the estimate of the size of the
inner join: for two keys we have

-> Nested Loop (cost=4.01..9410.49 rows=13 width=145) (actual time=0.227..0.416 rows=30 loops=1)
-> Bitmap Heap Scan on battery (cost=4.01..11.64 rows=2 width=69) (actual time=0.135..0.138 rows=2 loops=1)
Recheck Cond: ((batteryidentifier = 1177470) OR (batteryidentifier = 1177469))
-> BitmapOr (cost=4.01..4.01 rows=2 width=0) (actual time=0.106..0.106 rows=0 loops=1)
-> Bitmap Index Scan on ix_battery_id (cost=0.00..2.00 rows=1 width=0) (actual time=0.091..0.091 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177470)
-> Bitmap Index Scan on ix_battery_id (cost=0.00..2.00 rows=1 width=0) (actual time=0.011..0.011 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177469)
-> Index Scan using ix_obresults_bat on observationresults (cost=0.00..4682.40 rows=1362 width=76) (actual time=0.047..0.091 rows=15 loops=2)
Index Cond: ("outer".batteryidentifier = observationresults.batteryidentifier)

but for one key we have

-> Nested Loop (cost=0.00..4653.67 rows=1348 width=145) (actual time=0.166..0.215 rows=9 loops=1)
-> Index Scan using ix_battery_id on battery (cost=0.00..5.81 rows=1 width=69) (actual time=0.079..0.082 rows=1 loops=1)
Index Cond: (batteryidentifier = 1177470)
-> Index Scan using ix_obresults_bat on observationresults (cost=0.00..4634.38 rows=1348 width=76) (actual time=0.079..0.102 rows=9 loops=1)
Index Cond: (1177470 = batteryidentifier)

The large rowcount estimate makes it back off to a non-nestloop
plan for the outer joins, and in this situation that's a loser.

I'm actually not sure why they're not both too high --- with the
rowcount estimate of 1362 for the inner scan in the first example, you'd
expect about twice that for the join result. But the immediate problem
is that in the case where it knows exactly what batteryidentifier is
being probed for, it's still off by more than a factor of 100 on the
rowcount estimate for observationresults. How many rows in
observationresults, and may we see the pg_stats entry for
observationresults.batteryidentifier?

It's likely that the answer for you will be "raise the statistics target
for observationresults and re-ANALYZE", but I'd like to gather more info
about what's going wrong first.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2006-12-13 23:27:27 Re: Insertion to temp table deteriorating over time
Previous Message Steven Flatt 2006-12-13 23:17:41 Re: Insertion to temp table deteriorating over time