Re: Multivariate MCV list vs. statistics target

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: k(dot)jamison(at)jp(dot)fujitsu(dot)com
Cc: tomas(dot)vondra(at)2ndquadrant(dot)com, dean(dot)a(dot)rasheed(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Multivariate MCV list vs. statistics target
Date: 2019-08-01 08:29:20
Message-ID: 20190801.172920.37913541.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Thu, 1 Aug 2019 00:15:48 +0000, "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com> wrote in <D09B13F772D2274BB348A310EE3027C6518F94(at)g01jpexmbkw24>
> The patch does not apply anymore.
> In addition to the whitespace detected,
> kindly rebase the patch as there were changes from recent commits
> in the following files:
> src/backend/commands/analyze.c
> src/bin/pg_dump/pg_dump.c
> src/bin/pg_dump/t/002_pg_dump.pl
> src/test/regress/expected/stats_ext.out
> src/test/regress/sql/stats_ext.sql

The patch finally failed only for stats_ext.out, where 14ef15a222
is hitting. (for b2a3d706b8)

I looked through this patch and have some comments.

+++ b/src/backend/commands/statscmds.c
+#include "access/heapam.h"
..
+#include "utils/fmgroids.h"

These don't seem needed.

+ Assert(stmt->missing_ok);

Perhaps we shouldn't Assert on this condition. Isn't it better we
just "elog(ERROR" here?

+ DeconstructQualifiedName(stmt->defnames, &schemaname, &statname);

Maybe we don't need detailed analysis that the function emits on
error. Couldn't we use NameListToString() instead? That reduces
the number of ereport()s and considerably simplify the code
around.

+ oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(stxoid));
+
+ /* Must be owner of the existing statistics object */
+ if (!pg_statistics_object_ownercheck(stxoid, GetUserId()))

This repeats the SearchSysCache twice in a quite short
duration. I suppose it'd be better that ACL (and validity) checks
done directly using oldtup.

+ /* replace the stxstattarget column */
+ repl_repl[Anum_pg_statistic_ext_stxstattarget - 1] = true;
+ repl_val[Anum_pg_statistic_ext_stxstattarget - 1] = Int32GetDatum(newtarget)

We usually do this kind of work using SearchSysCacheCopyN(),
which simplifies the code around, too.

+++ b/src/backend/statistics/mcv.c
> * Maximum number of MCV items to store, based on the attribute with the
> * largest stats target (and the number of groups we have available).
> */
- nitems = stats[0]->attr->attstattarget;
- for (i = 1; i < numattrs; i++)
- {
- if (stats[i]->attr->attstattarget > nitems)
- nitems = stats[i]->attr->attstattarget;
- }
+ nitems = stattarget;

Maybe you forgot to modify the comment.

check_xact_readonly() returns false for this command. As the
result it emits a somewhat pointless error message.

=# alter statistics s1 set statistics 0;
ERROR: cannot assign TransactionIds during recovery

+++ b/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.h
> i_stxname = PQfnumber(res, "stxname");
> i_stxnamespace = PQfnumber(res, "stxnamespace");
> i_rolname = PQfnumber(res, "rolname");
+ i_stattarget = PQfnumber(res, "stxstattarget");

I'm not sure whether it is the convention here, but variable name
is different from column name only for the added line.

+++ b/src/bin/psql/tab-complete.c
- COMPLETE_WITH("OWNER TO", "RENAME TO", "SET SCHEMA");
+ COMPLETE_WITH("OWNER TO", "RENAME TO", "SET SCHEMA", "SET STATISTICS");

ALTER STATISTICS s2 SET STATISTICS<tab> is suggested with
ext-stats names, but it's the place for target value.

+++ b/src/include/nodes/nodes.h
T_CallStmt,
+ T_AlterStatsStmt,

I think it should be immediately below T_CreateStatsStmt.

+++ b/src/include/nodes/parsenodes.h
+ bool missing_ok; /* skip error if statistics object is missing */

Should be very trivial, but many bool members especially
missing_ok have a comment having "?" at the end.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-08-01 08:34:19 Re: range_agg
Previous Message Thomas Munro 2019-08-01 08:28:50 Re: Use relative rpath if possible