| From: | Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> |
|---|---|
| To: | Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>, David Geier <geidav(dot)pg(at)gmail(dot)com>, Chengpeng Yan <chengpeng_yan(at)outlook(dot)com>, Tatsuya Kawata <kawatatatsuya0913(at)gmail(dot)com> |
| Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Hash-based MCV matching for large IN-lists |
| Date: | 2026-03-02 09:35:15 |
| Message-ID: | 03aa576f-13bb-4ed1-bee4-0ee69ad8813c@tantorlabs.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
I've addressed the previously mentioned issues in v7 patches.
I also retested the hash-based MCV path using bytea as the data type.
```
CREATE TABLE t (val bytea);
INSERT INTO t SELECT int4send(i) FROM generate_series(1, 10000) AS i,
generate_series(1, 50);
ALTER TABLE t ALTER COLUMN val SET STATISTICS 10000;
ANALYZE t;
SELECT string_agg(format('int4send(%s)', v), ',') FROM
generate_series(1, 10000) AS gs(v) \gset
EXPLAIN (SUMMARY) SELECT * FROM t WHERE val =
ANY(ARRAY[:string_agg]::bytea[]);
```
Planning Time Speedup
default_statistics_target | Before (ms) | After (ms) | Speedup (x)
--------------------------------------------------------------------
100 | 0.984 | 0.697 | 1.41
500 | 1.260 | 0.984 | 1.28
1000 | 4.183 | 1.825 | 2.29
2500 | 64.715 | 1.298 | 49.86
5000 | 251.619 | 4.751 | 52.96
7500 | 562.775 | 2.895 | 194.40
10000 | 998.330 | 3.561 | 280.36
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC,
https://tantorlabs.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v7-0003-Use-hash-based-MCV-matching-for-ScalarArrayOpExpr.patch | text/x-patch | 17.5 KB |
| v7-0002-Use-O-1-selectivity-formula-for-eqsel-neqsel-IN-A.patch | text/x-patch | 6.4 KB |
| v7-0001-Reduce-planning-time-for-large-NOT-IN-lists-conta.patch | text/x-patch | 4.9 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jakub Wartak | 2026-03-02 10:16:40 | Re: pg_buffercache: Add per-relation summary stats |
| Previous Message | Chao Li | 2026-03-02 09:32:55 | Question: rebuilding frontend tools after libpgfeutils.a changes? |