Re: Optimizing a huge_table/tiny_table join

From: <kynn(at)panix(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimizing a huge_table/tiny_table join
Date: 2006-05-25 16:31:04
Message-ID: 200605251631.k4PGV4O28645@panix3.panix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On 5/24/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> <kynn(at)panix(dot)com> writes:
> > Limit (cost=19676.75..21327.99 rows=6000 width=84)
> > -> Hash Join (cost=19676.75..1062244.81 rows=3788315 width=84)
> > Hash Cond: (upper(("outer".id)::text) upper(("inner".id)::text))
> > -> Seq Scan on huge_table h (cost= 0.00..51292.43 rows=2525543 width=46)
> > -> Hash (cost=19676.00..19676.00 rows=300 width=38)
> > -> Seq Scan on tiny_table t (cost=0.00..19676.00 rows=300 width=38)
>
> Um, if huge_table is so much bigger than tiny_table, why are the cost
> estimates for seqscanning them only about 2.5x different? There's
> something wacko about your statistics, methinks.

Well, they're not my statistics; they're explain's. You mean there's
a bug in explain? I agree that it makes no sense that the costs don't
differ as much as one would expect, but you can see right there the
numbers of rows for the two tables. At any rate, how would one go
about finding an explanation for these strange stats?

More bewildering still (and infuriating as hell--because it means that
all of my work for yesterday has been wasted) is that I can no longer
reproduce the best query plan I posted earlier, even though the tables
have not changed at all. (Hence I can't post the explain analyze for
the best query plan, which Josh Drake asked for.) No matter what
value I use for LIMIT, the query planner now insists on sequentially
scanning huge_table and ignoring the available index. (If I turn off
enable_seqscan, I get the second worst query plan I posted yesterday.)

Anyway, I take it that there is no way to bypass the optimizer and
instruct PostgreSQL exactly how one wants the search performed?

Thanks!

kj

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Chris Mair 2006-05-25 16:35:28 Re: lowering priority automatically at connection
Previous Message Tom Lane 2006-05-25 16:26:43 Re: lowering priority automatically at connection