Re: planner support functions: handle GROUP BY estimates ?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: planner support functions: handle GROUP BY estimates ?
Date: 2020-01-14 21:45:21
Message-ID: 20200114214521.lojkdad36rzmhj3y@development
Lists: pgsql-hackers

On Tue, Jan 14, 2020 at 04:21:57PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On Tue, Jan 14, 2020 at 03:12:21PM -0500, Tom Lane wrote:
>>> cc'ing Tomas in case he has any thoughts about it.
>
>> Well, I certainly do have thoughts about this - it's pretty much exactly what
>> I proposed yesterday in this thread:
>> https://www.postgresql.org/message-id/flat/20200113230008(dot)g67iyk4cs3xbnjju(at)development
>> The third part of that patch series is exactly about supporting extended
>> statistics on expressions, much the way you described here. The current
>> status of the WIP patch is that grammar + ANALYZE mostly works, but
>> there is no support in the planner. It's obviously still very hackish.
>
>Cool. We should probably take the discussion to that thread, then.
>
>> I'm also wondering if we could/should 100% rely on extended statistics,
>> because those are really meant to track correlations between columns,
>
>Yeah, it seems likely to me that the infrastructure for this would be
>somewhat different --- the user-facing syntax could be basically the
>same, but ultimately we want to generate entries in pg_statistic not
>pg_statistic_ext_data. Or at least entries that look the same as what
>you could find in pg_statistic.
>

Yeah. I think we could invent a new type of statistics "expressions"
which would simply build these per-column stats. So for example

CREATE STATISTICS s (expressions) ON (a*b), sqrt(c) FROM t;

would build per-column stats stored in pg_statistic, while

CREATE STATISTICS s (mcv) ON (a*b), sqrt(c) FROM t;

would build the multi-column MCV list on expressions.
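
To make the distinction between the two kinds concrete, here is a toy
sketch in Python. It is not PostgreSQL code: the table contents, the
function names and the shape of the summaries are all made up for
illustration, and real ANALYZE works on a sample and stores far more
than this. It only shows that "expressions" would summarize each
expression independently, while "mcv" would summarize combinations of
expression values.

```python
import math
from collections import Counter

# Toy rows of a hypothetical table t(a, b, c); the data is invented.
rows = [(1, 2, 4), (2, 1, 4), (1, 2, 9), (3, 3, 9), (2, 1, 4), (1, 2, 4)]

# The two expressions from the examples above: (a*b) and sqrt(c).
expressions = [
    lambda a, b, c: a * b,
    lambda a, b, c: math.sqrt(c),
]


def per_expression_stats(rows, expressions):
    """One independent summary per expression (the "expressions" kind)."""
    stats = []
    for expr in expressions:
        counts = Counter(expr(*row) for row in rows)
        stats.append({
            "ndistinct": len(counts),
            # (value, frequency) pairs, most common first
            "mcv": [(v, n / len(rows)) for v, n in counts.most_common()],
        })
    return stats


def multi_column_mcv(rows, expressions):
    """One list of value combinations with joint frequencies (the "mcv" kind)."""
    counts = Counter(tuple(expr(*row) for expr in expressions) for row in rows)
    return [(combo, n / len(rows)) for combo, n in counts.most_common()]


if __name__ == "__main__":
    for s in per_expression_stats(rows, expressions):
        print(s)
    print(multi_column_mcv(rows, expressions))
```

On this toy data each expression has 2 distinct values, so multiplying
the per-expression estimates would suggest 4 groups for a GROUP BY on
both expressions, while the joint list contains only 3 combinations.
That gap is the kind of correlation the multi-column statistics are
meant to capture.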

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
