Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fernando Schapachnik <fpscha(at)via-net-works(dot)net(dot)ar>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Neverending query on 6.5.2 over Solaris 2.5.1
Date: 1999-10-23 23:07:19
Message-ID: 2359.940720039@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fernando Schapachnik <fpscha(at)ns1(dot)via-net-works(dot)net(dot)ar> writes:
>> It'd be useful to double-check my theory that the system is
>> misestimating the selectivity of the WHERE (u.activa) clause.
>> You could try this:
>> SELECT count(*) FROM usarios WHERE activa;

> 10571

>> EXPLAIN SELECT count(*) FROM usarios WHERE activa;
>> and see how far off the row count estimate in the EXPLAIN is
>> from reality.

> NOTICE: QUERY PLAN:

> Aggregate (cost=498.84 rows=1 width=4)
> -> Seq Scan on usuarios (cost=498.84 rows=1 width=4)

Well, it's sure confused about the selectivity of WHERE activa,
all right.

I tried to duplicate this here, by duplicating the table definition you
sent and filling it with some junk data --- about 1800 rows, 1500 of
which had activa = 't'. I found that after loading the table and
running a plain "vacuum", the system indeed estimated one row out, just
as you show above. But after "vacuum analyze", it estimated 1360 rows
out, which is a lot closer to reality (and would make a big difference
in the plan selected for a join).

Now I know you said you did a "vacuum analyze" on the table, but
I am wondering if maybe you got confused about what you did.
Please try it again just to make sure.

The only other explanation I can think of is that I am not running this
test on a pristine 6.5.2 release, but on a recent CVS update from the
REL6_5 branch. I don't see any indication that anything has been
changed in the selectivity code since 6.5 in that branch, but maybe I
missed something. You might need to update to almost-6.5.3. (I am not
sure if there is a beta-test tarball for 6.5.3 or not; if not, you could
pull the sources from the CVS server, or wait for 6.5.3 which should be
out very soon.)

BTW, current sources (7.0-to-be) get the estimate spot-on after "vacuum
analyze", though without it they are not much better than 6.5. The
current system is estimating 1% of the rows will match, because it's
treating the WHERE condition like "WHERE activa = 't'" and the default
estimate for "=" selectivity is 1% in the absence of VACUUM ANALYZE
statistics. Probably we ought to special-case boolean columns to
default to a 50% estimate if no statistics are available...

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hiroshi Inoue 1999-10-23 23:48:38 System indexes are never unique indexes( was RE: [HACKERS] mdnblocks is an amazing time sink in huge relations)
Previous Message Tom Lane 1999-10-23 22:28:34 Re: [HACKERS] Path-length follies