Re: multivariate statistics v14

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: multivariate statistics v14
Date: 2016-03-09 15:02:59
Message-ID: 1457535779.24545.71.camel@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

thanks for the feedback. Attached is v14 of the patch series, fixing
most of the points you've raised.

On Wed, 2016-03-09 at 09:22 -0300, Alvaro Herrera wrote:
> Hi,
>
> I gave a very quick skim to patch 0002. Not a real review yet. But
> there are a few trivial points to fix:
>
> * You still have empty sections in the SGML docs (such as the EXAMPLES).
> I suppose the syntax is now firm enough that we can get some. (I looked
> at the other patches to see whether it was filled in, but couldn't find
> any additional text there.)

Yes, that's one of the items I plan to work on next. Until now the
regression tests were a sufficient source of examples, but it's time to
do the SGML piece.

>
> * check_object_ownership() needs to be filled in

Done.

I've added pg_statistics_ownercheck, which also required adding OID of
the owner to the catalog. Initially the plan was to use the same owner
as for the table, but now that we've switched to CREATE STATISTICS
partially because it will allow multi-table stats, that does not make
sense (multiple tables with different owners).

This probably means we also need an 'ALTER STATISTICS ... OWNER TO'
command, which does not exist at this point.

>
> * Since you're adding a new object type, please add a case to cover it
> in the object_address.sql pg_regress test.

Done.

Apparently there was a bunch of missing pieces in objectaddress.c, so
this adds them too.

>
> * in analyze.c (and elsewhere), please put new #include lines sorted.

Done.

I've also significantly reduced the excessive list of includes in
statscmds.c. I expect the headers to require a bit more love, especially
in the subsequent patches (MCV, histograms etc.).

>
> * I think the AT_PASS_ADD_STATS is a leftover which should be removed.

Yeah. Now that we've invented CREATE TABLE, all the changes to
tablecmds.c were just unnecessary leftovers. Removed.

>
> * The XXX comment in get_relation_info should probably be handled
> differently (namely, in a way that makes the syscache not contain OIDs
> of dropped stats)

I believe that was actually an obsolete comment. Removed.

>
> * The README.dependencies has a lot of TODOs. Do we need to get them
> done during the first cut? If not, I suggest creating a new section
> "Future work" in the file.

Right. Most of those TODOs are future work, or rather ideas (more or
less crazy). The one thing I definitely want to address now is support
for dependencies with multiple columns on the left side, because that
requires changes to serialized format. I might also look at handling IS
NULL clauses, but that may wait.

>
> * Please put the common.h header in src/include. Make sure not to
> include "postgres.h" in it -- our policy is that postgres.h goes at the
> top of every .c file and never in any .h file. Also please find a
> better name for it; even mvstats_common.h would be a lot more
> convincing. However:
>
> * ISTM that the code in common.c properly belongs in
> src/backend/catalog/pg_mvstats.c instead (or more properly
> catalog/pg_mv_statistics.c), which probably means the common.h file
> should be named something else; perhaps some of it could become
> pg_mv_statistic_fn.h, while the rest continues to be
> src/include/utils/mvstats_common.h? Not sure.

Hmmm, not sure either. The idea was that the "common.h" is pretty much
just a private header with stuff that's not very useful anywhere else.

No changes here, for now.

>
> * The version check in psql/describe.c uses 90500; should probably be
> updated to 90600.

Fixed.

>
> * _copyCreateStatsStmt is missing if_not_exists

Fixed.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-teach-pull_-varno-varattno-_walker-about-RestrictInf.patch text/x-patch 1.4 KB
0002-shared-infrastructure-and-functional-dependencies.patch text/x-patch 110.7 KB
0003-clause-reduction-using-functional-dependencies.patch text/x-patch 47.7 KB
0004-multivariate-MCV-lists.patch text/x-patch 112.8 KB
0005-multivariate-histograms.patch text/x-patch 136.6 KB
0006-multi-statistics-estimation.patch text/x-patch 98.3 KB
0007-multivariate-ndistinct-coefficients.patch text/x-patch 28.5 KB
0008-change-how-we-apply-selectivity-to-number-of-groups-.patch text/x-patch 1.4 KB
0009-fixup-of-regression-tests-plans-changes-by-group-by-.patch text/x-patch 5.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2016-03-09 15:08:37 Re: Crash with old Windows on new CPU
Previous Message Alvaro Herrera 2016-03-09 15:02:08 Re: More stable query plans via more predictable column statistics