Re: [HACKERS] PATCH: multivariate histograms and MCV lists

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Mark Dilger <hornschnorter(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date: 2019-03-26 13:37:33
Message-ID: CAEZATCV=W+9LKu=BVbBAgOn3GChGcm=Qdhgco3e3aDxkRqfcVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 26 Mar 2019 at 11:59, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Mon, 25 Mar 2019 at 23:36, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >
> > Attached is an updated patch...
>
> I just looked through the latest set of changes and I have a couple of
> additional review comments:
>

I just spotted another issue while reading the code:

It's possible to build an MCV list with more than
STATS_MCVLIST_MAX_ITEMS = 8192 items, which then causes an error when
the code tries to read it back in:

create temp table foo(a int, b int);
insert into foo select x,x from generate_series(1,10000) g(x);
insert into foo select x,x from generate_series(1,10000) g(x);
alter table foo alter column a set statistics 10000;
alter table foo alter column b set statistics 10000;
create statistics s (mcv) on a,b from foo;
analyse foo;
select * from foo where a=1 and b=1;

ERROR: invalid length (10000) item array in MCVList

So this needs to be checked when building the MCV list.

In fact, the stats targets for table columns can be as large as 10000
(a hard-coded constant in tablecmds.c, which is pretty ugly, but
that's a different matter), so I think STATS_MCVLIST_MAX_ITEMS
probably ought to match that.

There are also a couple of comments that refer to the 8k limit, which
would need updating, if you change it.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-03-26 14:03:35 Re: Misleading errors with column references in default expressions and partition bounds
Previous Message Peter Eisentraut 2019-03-26 13:33:32 Re: [HACKERS] generated columns