Re: planner support functions: handle GROUP BY estimates ?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: planner support functions: handle GROUP BY estimates ?
Date: 2020-11-16 17:24:41
Message-ID: c4c19131-fc64-d53f-9586-ccbeeff4b717@enterprisedb.com
Lists: pgsql-hackers

On 1/15/20 12:44 AM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On Tue, Jan 14, 2020 at 05:37:53PM -0500, Tom Lane wrote:
>>> I wonder just how messy it would be to add a column to pg_statistic_ext
>>> whose type is the composite type "pg_statistic", and drop the required
>>> data into that. We've not yet used any composite types in the system
>>> catalogs, AFAIR, but since pg_statistic_ext isn't a bootstrap catalog
>>> it seems like we might be able to get away with it.
>
> [ I meant pg_statistic_ext_data, obviously ]
>
>> I don't know, but feels a bit awkward to store this type of stats into
>> pg_statistic_ext, which was meant for multi-column stats. Maybe it'd
>> work fine, not sure.
>
> If we wanted to allow a single statistics object to contain data for
> multiple expressions, we'd actually need that to be array-of-pg_statistic
> not just pg_statistic. Seems do-able, but on the other hand we could
> just prohibit having more than one output column in the "query" for this
> type of extended statistic. Either way, this seems far less invasive
> than either a new catalog or a new relation relkind (to say nothing of
> needing both, which is where you seemed to be headed).
>

I've started looking at statistics on expressions too, mostly because it
seems the extended stats improvements (as discussed in [1]) need that.

The "stash pg_statistic records into pg_statistic_ext_data" approach
seems simple, but it's not clear to me how to make it work, so I'd
appreciate some guidance.

1) We don't have any composite types in any catalog yet, and naive
attempts to just use something like

pg_statistic stxdexprs[1];

did not work. So I suppose this will require changes to genbki.pl, but
honestly, my Perl-fu is non-existent :-(

2) Won't it be an issue that pg_statistic contains pseudo-types? That
is, this does not work, for example:

test=# create table t (a pg_statistic[]);
ERROR: column "stavalues1" has pseudo-type anyarray

and it seems unlikely just using this in a catalog would make it work.
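
The columns behind that error can be listed straight from the catalogs.
This is only a sketch, assuming an unmodified pg_statistic; it should
show the stavaluesN columns, which are declared as anyarray:

```sql
-- List pg_statistic columns whose declared type is a pseudo-type
-- (pg_type.typtype = 'p'); these are what block using pg_statistic
-- as a column type.
SELECT a.attname, t.typname
  FROM pg_attribute a
  JOIN pg_type t ON t.oid = a.atttypid
 WHERE a.attrelid = 'pg_catalog.pg_statistic'::regclass
   AND a.attnum > 0
   AND NOT a.attisdropped
   AND t.typtype = 'p'
 ORDER BY a.attnum;
```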

regards

[1]
https://www.postgresql.org/message-id/ad7891d2-e90c-b446-9fe2-7419143847d7%40enterprisedb.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
