| From: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com> |
|---|---|
| To: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
| Cc: | David Geier <geidav(dot)pg(at)gmail(dot)com>, Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>, Tatsuya Kawata <kawatatatsuya0913(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash-based MCV matching for large IN-lists |
| Date: | 2026-03-02 21:37:01 |
| Message-ID: | CAN4CZFO3Y25iCqqP_zS1ipgbrBXvAkxeLK2hPuamddyW9ouAzQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello!
+ if (vardata.isunique && vardata.rel && vardata.rel->tuples >= 1.0)
+ {
+ s2 = 1.0 / vardata.rel->tuples;
+ if (HeapTupleIsValid(vardata.statsTuple))
+ {
+ Form_pg_statistic stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+ if (isInequality)
+ s2 = 1.0 - s2 - stats->stanullfrac;
+ }
+ }
Isn't there's a corner case where this if order returns an incorrect
estimate/regression?
See the following test:
CREATE TABLE test AS SELECT generate_series(1, 1000) AS id;
CREATE UNIQUE INDEX ON test(id);
-- no ANALYZE
EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Actual: rows=1
-- Expected: rows=997
ANALYZE test;
EXPLAIN SELECT * FROM test WHERE id <> ALL(ARRAY[1, 2, 3]);
-- Correct: rows=997
DROP TABLE test;
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Zsolt Parragi | 2026-03-02 21:58:30 | Re: Refactor handling of "-only" options in pg_dump, pg_restore |
| Previous Message | Jeff Davis | 2026-03-02 21:34:23 | Re: [19] CREATE SUBSCRIPTION ... SERVER |