Re: extended stats not friendly towards ANALYZE with subset of columns

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: extended stats not friendly towards ANALYZE with subset of columns
Date: 2017-04-05 06:55:18
Message-ID: 20170405065518.GC2702846@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 28, 2017 at 10:30:03PM +1300, David Rowley wrote:
> I'm just reviewing Tomas' code for the dependencies part of the stats
> when I saw something that looked a bit unusual.
>
> I tested with:
>
> CREATE TABLE ab1 (a INTEGER, b INTEGER);
> ALTER TABLE ab1 ALTER a SET STATISTICS 0;
> INSERT INTO ab1 SELECT a, a%23 FROM generate_series(1, 1000) a;
> CREATE STATISTICS ab1_a_b_stats ON (a, b) FROM ab1;
> ANALYZE ab1;
>
> And got:
>
> ERROR: extended statistics could not be collected for column "a" of
> relation public.ab1
> HINT: Consider ALTER TABLE "public"."ab1" ALTER "a" SET STATISTICS -1
>
> I don't think the error is useful here, as it means nothing gets done.
> Probably better to just not (re)build those stats.
>
> Another option would be to check for extended stats before deciding
> which rows to ANALYZE, then still gathering the columns required for
> MV stats, but I think if the user asks for a subset of columns to be
> analyzed, and that causes a column to be missing for an extended
> statistics, that it would be pretty surprising if we rebuild the
> extended stats.
>
> Perhaps the SET STATISTIC 0 for a column still needs to gather data
> for extended statistics, though. Perhaps a debate should ensue about
> how that should work exactly.
>
> I've attached a patch which fixes the problem above, but it does
> nothing to change the analyze behaviour for 0 statistics columns.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Álvaro,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2017-04-05 06:56:56 Re: [pgsql-www] Small issue in online devel documentation build
Previous Message Noah Misch 2017-04-05 06:49:41 Re: [COMMITTERS] pgsql: Sync pg_dump and pg_dumpall output