Re: Multidimensional Histograms

From:	Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To:	Alexander Cheshev <alex(dot)cheshev(at)gmail(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Teodor Sigaev <teodor(at)postgrespro(dot)ru>
Subject:	Re: Multidimensional Histograms
Date:	2024-01-10 15:49:48
Message-ID:	2ba586f7-ae89-4255-bb6e-37af0808b413@postgrespro.ru
Lists:	pgsql-hackers

On 8/1/2024 16:21, Alexander Cheshev wrote:
> Hi Andrei,
>
>> Maybe my wording needed to be more precise. I didn't implement
>> multidimensional histograms before, so I don't know how expensive they
>> are. I meant that for dependency statistics over about six columns, we
>> have a lot of combinations to compute.
>
> Equi-Depth Histogram in a 6 dimensional case requires 6 times more
> iterations. Postgres already uses Equi-Depth Histogram. Even if you
> increase the number of buckets from 100 to 1000 then there will be no
> overhead as the time complexity of Equi-Depth Histogram has no
> dependence on the number of buckets. So, no overhead at all!

Maybe. For three columns, we have 9 combinations (passes) for building
dependency statistics and 4 combinations for ndistinct; for six
columns, we have 186 and 57 combinations, respectively.
Even keeping in mind that a dependency is just one number per combination,
building the dependency statistics is still a massive amount of work. So,
in the multicolumn case, something like a histogram may be more efficient.
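
A quick way to sanity-check these counts is the sketch below. It assumes
that a dependency is counted once per (column subset, implied column) pair
for every subset of at least two columns, and that ndistinct is counted
once per such subset. The function names are illustrative only and are not
taken from the PostgreSQL source.

```python
from itertools import combinations


def ndistinct_combinations(ncols):
    """Count column subsets of size >= 2 (one ndistinct estimate each)."""
    return sum(
        1
        for k in range(2, ncols + 1)
        for _ in combinations(range(ncols), k)
    )


def dependency_combinations(ncols):
    """Count (subset, implied column) pairs for subsets of size >= 2.

    Assumption: within a subset of k columns, each of the k columns can
    play the role of the implied column, so a subset contributes k passes.
    """
    return sum(
        k
        for k in range(2, ncols + 1)
        for _ in combinations(range(ncols), k)
    )


for ncols in (3, 6):
    print(ncols, dependency_combinations(ncols), ndistinct_combinations(ncols))
# prints:
# 3 9 4
# 6 186 57
```

Under the same assumptions the closed forms are n * 2^(n-1) - n for
dependencies and 2^n - n - 1 for ndistinct, so both grow exponentially
with the number of columns.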

--
regards,
Andrei Lepikhov
Postgres Professional

In response to

Re: Multidimensional Histograms at 2024-01-08 09:21:34 from Alexander Cheshev
