Re: Different Choices For Index/Sequential Scan With And Without A Join In 7.2

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark kirkwood <markir(at)slingshot(dot)co(dot)nz>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Different Choices For Index/Sequential Scan With And Without A Join In 7.2
Date: 2001-08-26 15:44:54
Message-ID: 18604.998840694@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Mark kirkwood <markir(at)slingshot(dot)co(dot)nz> writes:
> Note that 'f1' = '2000-01-26' corrosponds to 'd0key' = 270 in the table
> 'dim0';

What do you mean by "corresponds to"? Is there a one-to-one mapping
between distinct values of fact0.d0key and distinct values of dim0.f1?
Or do you just mean that the values play corresponding roles in these
two queries?

> I used cpu_tuple_cost = 0.4, but everything else was fairly standard.

?? You're claiming that the CPU time involved in processing a single
tuple is 40% as large as the time to fetch a page from disk. Unless
you're running a high-end RAID array attached to an ENIAC, I don't
believe it. This adjustment almost certainly will produce silly results.

> It guess I was expecting the value that made the first query change
> from index to seq scan to be "close" to the value that made the second
> query use a sequential scan...

Um, are you considering the effects of statistical density of the
values? I see no particular reason to assume that a range of nine days
in a date column should be equally as selective as a range of nine
counts in an integer key column. It all depends on what fraction of the
table entries actually fall within those ranges.

Have you looked at the ANALYZE statistics for the tables? (You have
done an ANALYZE on them, I hope.) Try
select * from pg_stats where tablename = 'fact';
The user documentation about 7.2 statistics is nonexistent as yet, but
you can read src/include/catalog/pg_statistic.h for info.

If the tables are large and have irregular distributions, you might find
that increasing the statistics target value for the key columns helps
the optimizer to produce good plan choices. See ALTER TABLE SET
STATISTICS. I'd be interested to hear about it if so --- the current
default target of 10 was picked "out of the air" and might well be
off-base.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2001-08-26 15:51:23 Re: "AND", "OR" and Materialize :((((
Previous Message Tom Lane 2001-08-26 15:24:34 Re: [SQL] undocumented setval()