From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com>, pgsql-admin <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: pg_stats.correlation rule of thumb for re-clustering a table? |
Date: | 2025-09-12 20:59:07 |
Message-ID: | 2abf54a409ea1a3aac778040cf2035d2b9230e22.camel@cybertec.at |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
On Fri, 2025-09-12 at 10:46 -0400, Ron Johnson wrote:
> Purely OLTP tables (that are only accessed randomly) can of course live with 0% correlation,
> but lots of tables are mixed-use, and so benefit from physical ordering on a carefully chosen field..
>
> SELECT abs(correlation)::numeric(3,2) as correlation
> FROM pg_stats
> WHERE schemaname = 'foo' AND tablename = 'bar'
> AND attname = 'blarge';
> correlation
> -------------
> 0.84
> (1 row)
>
> Obviously 84% is no need to worry, but what about 60% or 40%? Currently, I use 60%, but would like to do better.
Either the difference is gradual, so that it there is no real cut-off point,
or there is a sudden plan change at some point that depends on the query the
data and the parameter settings. I don't think it is possible to give reliable
numbers that cover all cases.
I suggest that you run a series of benchmarks with a copy of the table with
different correlation values and come up with numbers that are meaningful
for your individual case.
Yours,
Laurenz Albe
From | Date | Subject | |
---|---|---|---|
Next Message | OMPRAKASH SAHU | 2025-09-16 07:18:14 | Replication using patroni on different ubuntu os versions |
Previous Message | Ron Johnson | 2025-09-12 14:46:30 | pg_stats.correlation rule of thumb for re-clustering a table? |