Re: [HACKERS] PATCH: multivariate histograms and MCV lists

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Mark Dilger <hornschnorter(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date: 2019-01-17 01:45:20
Message-ID: CAKJS1f-05u7vQY=aCYv2Ffs9a23wt2hwUPvqnofT6kxVLJAAfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 17 Jan 2019 at 14:19, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> > 12. Should we be referencing the source from the docs?
> >
> > See <function>mcv_clauselist_selectivity</function>
> > in <filename>src/backend/statistics/mcv.c</filename> for details.
> >
> > hmm. I see it's not the first going by: git grep -E "\w+\.c\<"
> > gt
> Hmm, that does not return anything to me - do you actually see any
> references to .c files in the sgml docs? I agree that probably is not a
> good idea, so I'll remove that.

Yeah, I see quite a few. I shouldn't have escaped the <

> > 18. In dependencies_clauselist_selectivity() there seem to be a new
> > bug introduced. We do:
> >
> > /* mark this one as done, so we don't touch it again. */
> > *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
> >
> > but the bms_is_member() check that skipped these has been removed.
> >
> > It might be easier to document if we just always do:
> >
> > if (bms_is_member(listidx, *estimatedclauses))
> > continue;
> >
> > at the start of both loops. list_attnums can just be left unset for
> > the originally already estimatedclauses.
> >
> It's probably not as clear as it should be, but if the clause is already
> estimated (or incompatible), then the list_attnums[] entry will be
> InvalidAttrNumber. Which is what we check in the second loop.

hmm. what about the items that should be skipped when you do the
*estimatedclauses = bms_add_member(*estimatedclauses, listidx); in the
2nd loop. You'll need to either also do list_attnums[listidx] =
InvalidAttrNumber; for them, or put back the bms_is_member() check,
no? I admit to not having debugged it to find an actual bug, it just
looks suspicious.

> > 25. Does statext_is_compatible_clause_internal)_ need to skip over
> RelabelTypes?
> >
> I believe it does, based on what I've observed during development. Why
> do you think it's not necessary?

The other way around. I thought it was necessary, but the code does not do it.

> > 26. In statext_is_compatible_clause_internal() you mention: /* We only
> > support plain Vars for now */, but I see nothing that ensures that
> > only Vars are allowed in the is_opclause() condition.
> >
> > /* see if it actually has the right */
> > ok = (NumRelids((Node *) expr) == 1) &&
> > (is_pseudo_constant_clause(lsecond(expr->args)) ||
> > (varonleft = false,
> > is_pseudo_constant_clause(linitial(expr->args))));
> >
> > the above would allow var+var == const through.
> >
> But then we call statext_is_compatible_clause_internal on it again, and
> that only allows Vars and "Var op Const" expressions. Maybe there's a
> way around that?

True, I missed that. Drop that one.

> > 33. "ndimentions"? There's no field in the struct by that name. I'd
> > assume it's the same size as the isnull array above it?
> >
> > Datum *values; /* variable-length (ndimensions) */
> >
> Yes, that's the case.

If it relates to the ndimensions field from the struct below, maybe
it's worth crafting that into the comment somehow.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2019-01-17 02:12:50 Re: draft patch for strtof()
Previous Message Tomas Vondra 2019-01-17 01:19:16 Re: [HACKERS] PATCH: multivariate histograms and MCV lists