Re: Multivariate MCV list vs. statistics target

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Multivariate MCV list vs. statistics target
Date: 2019-06-20 22:12:50
Message-ID: 20190620221250.jk62m4j7kr77qkzg@development
Lists: pgsql-hackers

On Thu, Jun 20, 2019 at 08:08:44AM +0100, Dean Rasheed wrote:
>On Tue, 18 Jun 2019 at 22:34, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> One slightly inconvenient thing I realized while playing with the
>> address data set is that it's somewhat difficult to set the desired size
>> of the multi-column MCV list.
>>
>> At the moment, we simply use the maximum statistics target of the
>> attributes the MCV list is built on. But that does not allow keeping the
>> default size for per-column stats while only increasing the size of
>> multi-column MCV lists.
>>
>> So I'm thinking we should allow tweaking the statistics target for
>> extended stats, and serialize it in the pg_statistic_ext catalog. Any
>> opinions why that would be a bad idea?
>>
>
>Seems reasonable to me. This might not be the only option we'll ever
>want to add though, so perhaps a "stxoptions text[]" column along the
>lines of a relation's reloptions would be the way to go.
>

I don't know - I kinda dislike the idea of stashing stuff like this into
text[] arrays unless there's a clear need for such flexibility (i.e. a
vision of having more such options), which I'm not sure is the case here.
And we kinda have a precedent in pg_attribute.attstattarget, so I'd use
the same approach here.

>> I suppose it should be part of the CREATE STATISTICS command, but I'm
>> not sure what'd be the best syntax. We might also have something more
>> similar to ALTER COLUMN, but perhaps
>>
>> ALTER STATISTICS s SET STATISTICS 1000;
>>
>> looks a bit too weird.
>>
>
>Yes it does look a bit weird, but that's the natural generalisation of
>what we have for per-column statistics, so it's probably preferable to
>do that rather than invent some other syntax that wouldn't be so
>consistent.
>

Yeah, I agree.
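
For comparison, here is a rough sketch of the two forms side by side. The
table t, its columns a and b, and the statistics object s are made-up names
for illustration, and the ALTER STATISTICS form is only the syntax proposed
in this thread, not something that is implemented at this point.

```sql
-- hypothetical table and extended statistics object, for illustration only
CREATE TABLE t (a int, b int);
CREATE STATISTICS s (mcv) ON a, b FROM t;

-- existing per-column syntax: raise the statistics target of one column
-- (this is what ends up in pg_attribute.attstattarget)
ALTER TABLE t ALTER COLUMN a SET STATISTICS 1000;

-- proposed generalisation for extended statistics (syntax under discussion)
ALTER STATISTICS s SET STATISTICS 1000;
```

With something like the second form, the per-column targets could stay at
the default while only the multi-column MCV list gets larger.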

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
