Re: Additional improvements to extended statistics

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Additional improvements to extended statistics
Date: 2020-03-24 21:05:01
Message-ID: CA+hUKGJ2AWQFVR5fOHjxLZqRiTGyUk90EJpqHM5-BdeMBfCR2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 15, 2020 at 3:23 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On Sun, Mar 15, 2020 at 02:48:02PM +1300, Thomas Munro wrote:
> >Stimulated by some bad plans involving JSON, I found my way to your
> >WIP stats-on-expressions patch in this thread. Do I understand
> >correctly that it will eventually also support single expressions,
> >like CREATE STATISTICS t_distinct_abc (ndistinct) ON
> >(my_jsonb_column->>'abc') FROM t? It looks like that would solve
> >problems that otherwise require a generated column or an expression
> >index just to get ndistinct.
>
> Yes, I think that's generally the plan. I was also thinking about
> inventing some sort of special JSON statistics (e.g. extracting paths
> from the JSONB and computing frequencies, or something like that). But
> stats on expressions are one of the things I'd like to do in PG14.

Interesting idea. If you had simple single-expression statistics, I
suppose a cave-person version of this would be to write a
script/stored procedure that extracts the distinct set of JSON paths
and does CREATE STATISTICS for expressions to access each path. That
said, I suspect that in many cases there's a small set of a paths and
a human designer would know what to do. I didn't manage to try your
WIP stats-on-expressions patch due to bitrot and unfinished parts, but
I am hoping it just needs to remove the "if (numcols < 2)
ereport(ERROR ...)" check to get a very very useful thing.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Maryam Farrukh 2020-03-24 21:25:46 PostgreSQL proposal of Google Summer of Code
Previous Message Tom Lane 2020-03-24 20:29:51 Re: Missing errcode() in ereport