Re: Re: 512, 600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bryce Nesbitt <bryce2(at)obviously(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, "Jorge Montero" <jorge_montero(at)homedecorators(dot)com>
Subject: Re: Re: 512, 600ms query becomes 7500ms... but why? Postgres 8.3 query planner quirk?
Date: 2010-02-11 03:52:49
Message-ID: 28370.1265860369@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Bryce Nesbitt <bryce2(at)obviously(dot)com> writes:
> The query plans are now attached (sorry I did not start there: many
> lists reject attachments). Or you can click on "text" at the query
> planner analysis site http://explain.depesz.com/s/qYq

At least some of the problem is the terrible quality of the rowcount
estimates in the IN subquery, as you extracted here:

> Nested Loop (cost=0.00..23393.15 rows=23 width=4) (actual time=0.077..15.637 rows=4003 loops=1)
> -> Index Scan using words_word on words (cost=0.00..5.47 rows=1 width=4) (actual time=0.049..0.051 rows=1 loops=1)
> Index Cond: ((word)::text = 'insider'::text)
> -> Index Scan using article_words_wc on article_words (cost=0.00..23234.38 rows=12264 width=8) (actual time=0.020..7.237 rows=4003 loops=1)
> Index Cond: (article_words.word_key = words.word_key)
> Total runtime: 19.776 ms

Given that it estimated 1 row out of "words" (quite correctly) and 12264
rows out of each scan on article_words, you'd think that the join size
estimate would be 12264, which would be off by "only" a factor of 3 from
the true result. Instead it's 23, off by a factor of 200 :-(.

Running a roughly similar test case here, I see that 8.4 gives
significantly saner estimates, which I think is because of this patch:
http://archives.postgresql.org/pgsql-committers/2008-10/msg00191.php

At the time I didn't want to risk back-patching it, because there
were a lot of other changes in the same general area in 8.4. But
it would be interesting to see what happens with your example if
you patch 8.3 similarly. (Note: I think only the first diff hunk
is relevant to 8.3.)

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message jesper 2010-02-11 10:05:17 Re: perf problem with huge table
Previous Message Scott Marlowe 2010-02-11 03:46:52 Re: How exactly PostgreSQL allocates memory for its needs?