Re: Choosing values for multivariate MCV lists

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Choosing values for multivariate MCV lists
Date: 2019-07-01 11:02:28
Message-ID: CAEZATCWBXQ=P0oTEOauLYKWW_s2di0ZvTZGwZ_6_KaHwOrJgQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 29 Jun 2019 at 14:01, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> >>>>>However, it looks like the problem is with mcv_list_items()'s use
> >>>>>of %f to convert to text, which is pretty ugly.
> >>>>
> >>>There's one issue with the signature, though - currently the function
> >>>returns null flags as bool array, but values are returned as simple
> >>>text value (formatted in array-like way, but still just a text).
> >>>
> >>IMO fixing this to return a text array is worth doing, even though it
> >>means a catversion bump.
> >>
> Attached is a cleaned-up version of that patch. The main difference is
> that instead of using construct_md_array() this uses ArrayBuildState to
> construct the arrays, which is much easier. The docs don't need any
> update because those were already using text[] for the parameter, the
> code was inconsistent with it.
>

Cool, this looks a lot neater and fixes the issues discussed with both
floating point values no longer being converted to text, and proper
text arrays for values.

One minor nitpick -- it doesn't seem to be necessary to build the
arrays *outfuncs and *fmgrinfo. You may as well just fetch the
individual column output function information at the point where it's
used (in the "if (!item->isnull[i])" block) rather than building those
arrays.

> This does require catversion bump, but as annoying as it is, I think
> it's worth it (and there's also the thread discussing the serialization
> issues). Barring objections, I'll get it committed later next week, once
> I get back from PostgresLondon.
>
> As I mentioned before, if we don't want any additional catversion bumps,
> it's possible to pass the arrays through output functions - that would
> allow us keeping the text output (but correct, unlike what we have now).
>

I think this is a clear bug fix, so I'd vote for fixing it properly
now, with a catversion bump.

Regards,
Dean

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-07-01 11:03:03 Re: How to estimate the shared memory size required for parallel scan?
Previous Message Jamison, Kirk 2019-07-01 10:55:49 RE: [PATCH] Speedup truncates of relation forks