From: | Mark Kirkwood <markir(at)paradise(dot)net(dot)nz> |
---|---|
To: | chris smith <dmagick(at)gmail(dot)com> |
Cc: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Query using SeqScan instead of IndexScan |
Date: | 2006-04-02 04:31:34 |
Message-ID: | 442F53A6.9090707@paradise.net.nz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
chris smith wrote:
> <rant>
> It'd be nice if the database developers agreed on what terms meant.
>
> http://dev.mysql.com/doc/refman/5.1/en/myisam-index-statistics.html
>
> The SHOW INDEX statement displays a cardinality value based on N/S,
> where N is the number of rows in the table and S is the average value
> group size. That ratio yields an approximate number of value groups in
> the table.
> </rant>
>
> A work colleague found that information a few weeks ago so that's
> where my misunderstanding came from - if I'm reading that right they
> use n_distinct as their "cardinality" basis.. then again I could be
> reading that completely wrong too.
>
Yeah that's right - e.g using the same table in postgres and mysql:
pgsql> SELECT attname,n_distinct,correlation
FROM pg_stats
WHERE tablename='fact0'
AND attname LIKE 'd%key';
attname | n_distinct | correlation
---------+------------+-------------
d0key | 10000 | -0.0211169
d1key | 100 | 0.124012
d2key | 10 | 0.998393
(3 rows)
mysql> SHOW INDEX FROM fact0
-> ;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| fact0 | 1 | fact0_d0key | 1 | d0key | A
| 10000 | NULL | NULL | | BTREE | |
| fact0 | 1 | fact0_d1key | 1 | d1key | A
| 100 | NULL | NULL | | BTREE | |
| fact0 | 1 | fact0_d2key | 1 | d2key | A
| 10 | NULL | NULL | | BTREE | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)
It is a bit confusing - '(distinct) cardinality' might be a better
heading for their 'cardinality' column!
On the correlation business - I don't think Mysql calculates it (or if
it does, its not displayed).
> I believe postgres (because it's a lot more standards compliant).. but
> sheesh - what a difference!
>
Well yes - however, to be fair to the Mysql guys, AFAICS the capture and
display of index stats (and any other optimizer related data) is not
part of any standard.
Cheers
Mark
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2006-04-02 04:46:09 | Re: Query using SeqScan instead of IndexScan |
Previous Message | Alvaro Herrera | 2006-04-02 04:26:45 | Re: Query using SeqScan instead of IndexScan |