From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Choosing values for multivariate MCV lists |
Date: | 2019-06-24 13:54:01 |
Message-ID: | CAEZATCU5O1y09w0u4BpV5sY=X+KGomuwP5vvjdp8QFWTW+VDTQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, 24 Jun 2019 at 00:42, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On Sun, Jun 23, 2019 at 10:23:19PM +0200, Tomas Vondra wrote:
> >On Sun, Jun 23, 2019 at 08:48:26PM +0100, Dean Rasheed wrote:
> >>On Sat, 22 Jun 2019 at 15:10, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >>>One annoying thing I noticed is that the base_frequency tends to end up
> >>>being 0, most likely due to getting too small. It's a bit strange, though,
> >>>because with statistic target set to 10k the smallest frequency for a
> >>>single column is 1/3e6, so for 2 columns it'd be ~1/9e12 (which I think is
> >>>something the float8 can represent).
> >>>
> >>
> >>Yeah, it should be impossible for the base frequency to underflow to
> >>0. However, it looks like the problem is with mcv_list_items()'s use
> >>of %f to convert to text, which is pretty ugly.
> >>
> >
> >Yeah, I realized that too, eventually. One way to fix that would be
> >adding %.15f to the sprintf() call, but that just adds ugliness. It's
> >probably time to rewrite the function to build the tuple from datums,
> >instead of relying on BuildTupleFromCStrings.
> >
>
> OK, attached is a patch doing this. It's pretty simple, and it does
> resolve the issue with frequency precision.
>
> There's one issue with the signature, though - currently the function
> returns null flags as bool array, but values are returned as simple
> text value (formatted in array-like way, but still just a text).
>
> In the attached patch I've reworked both to proper arrays, but obviously
> that'd require a CATVERSION bump - and there's not much appetite for that
> past beta2, I suppose. So I'll just undo this bit.
>
Hmm, I didn't spot that the old code was using a single text value
rather than a text array. That's clearly broken, especially since it
wasn't even necessarily constructing a valid textual representation of
an array (e.g., if an individual value's textual representation
included the array markers "{" or "}").

IMO fixing this to return a text array is worth doing, even though it
means a catversion bump.

Regards,
Dean