From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru> |
Subject: | Re: WIP: Covering + unique indexes. |
Date: | 2017-03-30 19:11:38 |
Message-ID: | 20170330191138.zawsxgsxnt2qy2as@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2017-03-30 18:26:05 +0300, Teodor Sigaev wrote:
> Any objection from reviewers to push both patches?
> diff --git a/contrib/bloom/blutils.c b/contrib/bloom/blutils.c
> index f2eda67..59029b9 100644
> --- a/contrib/bloom/blutils.c
> +++ b/contrib/bloom/blutils.c
> @@ -120,6 +120,7 @@ blhandler(PG_FUNCTION_ARGS)
> amroutine->amclusterable = false;
> amroutine->ampredlocks = false;
> amroutine->amcanparallel = false;
> + amroutine->amcaninclude = false;
That name doesn't strike me as very descriptive.
> + <term><literal>INCLUDE</literal></term>
> + <listitem>
> + <para>
> + An optional <literal>INCLUDE</> clause allows a list of columns to be
> + specified which will be included in the non-key portion of the index.
> + Columns which are part of this clause cannot also exist in the
> + key columns portion of the index, and vice versa. The
> + <literal>INCLUDE</> columns exist solely to allow more queries to benefit
> + from <firstterm>index-only scans</> by including certain columns in the
> + index, the value of which would otherwise have to be obtained by reading
> + the table's heap. Having these columns in the <literal>INCLUDE</> clause
> + in some cases allows <productname>PostgreSQL</> to skip the heap read
> + completely. This also allows <literal>UNIQUE</> indexes to be defined on
> + one set of columns, which can include another set of columns in the
> + <literal>INCLUDE</> clause, on which the uniqueness is not enforced.
> + It's the same with other constraints (PRIMARY KEY and EXCLUDE). This can
> + also can be used for non-unique indexes as any columns which are not required
> + for the searching or ordering of records can be used in the
> + <literal>INCLUDE</> clause, which can slightly reduce the size of the index.
> + Currently, only the B-tree access method supports this feature.
> + Expressions as included columns are not supported since they cannot be used
> + in index-only scans.
> + </para>
> + </listitem>
> + </varlistentry>
This could use some polishing.
> +/*
> + * Reform index tuple. Truncate nonkey (INCLUDE) attributes.
> + */
> +IndexTuple
> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
> +{
> + TupleDesc itupdesc = RelationGetDescr(idxrel);
> + Datum values[INDEX_MAX_KEYS];
> + bool isnull[INDEX_MAX_KEYS];
> + IndexTuple newitup;
> + int indnatts = IndexRelationGetNumberOfAttributes(idxrel);
> + int indnkeyatts = IndexRelationGetNumberOfKeyAttributes(idxrel);
> +
> + Assert(indnatts <= INDEX_MAX_KEYS);
> + Assert(indnkeyatts > 0);
> + Assert(indnkeyatts < indnatts);
> +
> + index_deform_tuple(olditup, itupdesc, values, isnull);
> +
> + /* form new tuple that will contain only key attributes */
> + itupdesc->natts = indnkeyatts;
> + newitup = index_form_tuple(itupdesc, values, isnull);
> + newitup->t_tid = olditup->t_tid;
> +
> + itupdesc->natts = indnatts;
Uh, isn't this a *seriously* bad idea? If index_form_tuple errors out,
this'll corrupt the tuple descriptor.
Maybe also rename the function to index_build_key_tuple()?
> * Construct a string describing the contents of an index entry, in the
> * form "(key_name, ...)=(key_value, ...)". This is currently used
> - * for building unique-constraint and exclusion-constraint error messages.
> + * for building unique-constraint and exclusion-constraint error messages,
> + * so only key columns of index are checked and printed.
s/index/the index/
> @@ -368,7 +370,7 @@ systable_beginscan(Relation heapRelation,
> {
> int j;
>
> - for (j = 0; j < irel->rd_index->indnatts; j++)
> + for (j = 0; j < IndexRelationGetNumberOfAttributes(irel); j++)
> {
> if (key[i].sk_attno == irel->rd_index->indkey.values[j])
> {
> @@ -376,7 +378,7 @@ systable_beginscan(Relation heapRelation,
> break;
> }
> }
> - if (j == irel->rd_index->indnatts)
> + if (j == IndexRelationGetNumberOfAttributes(irel))
> elog(ERROR, "column is not in index");
> }
Not that it matters overly much, but why are we doing this for all
attributes, rather than just key attributes?
> --- a/src/backend/bootstrap/bootstrap.c
> +++ b/src/backend/bootstrap/bootstrap.c
> @@ -600,7 +600,7 @@ boot_openrel(char *relname)
> relname, (int) ATTRIBUTE_FIXED_PART_SIZE);
>
> boot_reldesc = heap_openrv(makeRangeVar(NULL, relname, -1), NoLock);
> - numattr = boot_reldesc->rd_rel->relnatts;
> + numattr = RelationGetNumberOfAttributes(boot_reldesc);
> for (i = 0; i < numattr; i++)
> {
> if (attrtypes[i] == NULL)
That seems a bit unrelated.
> @@ -2086,7 +2086,8 @@ StoreRelCheck(Relation rel, char *ccname, Node *expr,
> is_validated,
> RelationGetRelid(rel), /* relation */
> attNos, /* attrs in the constraint */
> - keycount, /* # attrs in the constraint */
> + keycount, /* # key attrs in the constraint */
> + keycount, /* # total attrs in the constraint */
> InvalidOid, /* not a domain constraint */
> InvalidOid, /* no associated index */
> InvalidOid, /* Foreign key fields */
It doesn't quite seem right to me to store this both in pg_index and
pg_constraint.
> @@ -340,14 +341,27 @@ DefineIndex(Oid relationId,
> numberOfAttributes = list_length(stmt->indexParams);
> - if (numberOfAttributes <= 0)
> - ereport(ERROR,
> - (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
> - errmsg("must specify at least one column")));
> +
Huh, why's that check gone?
>
> +opt_c_include: INCLUDE '(' columnList ')' { $$ = $3; }
> + | /* EMPTY */ { $$ = NIL; }
> + ;
> +opt_include: INCLUDE '(' index_including_params ')' { $$ = $3; }
> + | /* EMPTY */ { $$ = NIL; }
> + ;
> +
> +index_including_params: index_elem { $$ = list_make1($1); }
> + | index_including_params ',' index_elem { $$ = lappend($1, $3); }
> + ;
> +
Why do we have multiple different definitions of this?
> @@ -1979,6 +2017,48 @@ transformIndexConstraint(Constraint *constraint, CreateStmtContext *cxt)
> index->indexParams = lappend(index->indexParams, iparam);
> }
>
> + /* Here is some code duplication. But we do need it. */
Aha?
> + foreach(lc, constraint->including)
> + {
> + char *key = strVal(lfirst(lc));
> + bool found = false;
> + ColumnDef *column = NULL;
> + ListCell *columns;
> + IndexElem *iparam;
> +
> + foreach(columns, cxt->columns)
> + {
> + column = (ColumnDef *) lfirst(columns);
> + Assert(IsA(column, ColumnDef));
> + if (strcmp(column->colname, key) == 0)
> + {
> + found = true;
> + break;
> + }
> + }
> +
> + /*
> + * In the ALTER TABLE case, don't complain about index keys not
> + * created in the command; they may well exist already. DefineIndex
> + * will complain about them if not, and will also take care of marking
> + * them NOT NULL.
> + */
Uh. Why should they be marked as NOT NULL? ISTM the comment has been
copied here without adjustments.
> @@ -1275,6 +1275,21 @@ pg_get_indexdef_worker(Oid indexrelid, int colno,
> Oid keycoltype;
> Oid keycolcollation;
>
> + /*
> + * attrsOnly flag is used for building unique-constraint and
> + * exclusion-constraint error messages. Included attrs are
> + * meaningless there, so do not include them in the message.
> + */
> + if (attrsOnly && keyno >= idxrec->indnkeyatts)
> + break;
Sounds like the parameter should be renamed then.
> +Included attributes in B-tree indexes
> +-------------------------------------
> +
> +Since 10.0 there is an optional INCLUDE clause, that allows to add
10.0 isn't right, since that's the "patch" version now.
> +a portion of non-key attributes to index. They exist to allow more queries
> +to benefit from index-only scans. We never use included attributes in
> +ScanKeys, neither for search nor for inserts. That allows us to include
> +into B-tree any datatypes, even those which don't have suitable opclass.
> +Included columns only stored in regular items on leaf pages. All inner
> +keys and high keys are truncated and contain only key attributes.
> +That helps to reduce the size of index.
s/index/the index/
> @@ -537,6 +542,28 @@ _bt_buildadd(BTWriteState *wstate, BTPageState *state, IndexTuple itup)
> ItemIdSetUnused(ii); /* redundant */
> ((PageHeader) opage)->pd_lower -= sizeof(ItemIdData);
>
> + if (indnkeyatts != indnatts && P_ISLEAF(opageop))
> + {
> + /*
> + * It's essential to truncate High key here.
> + * The purpose is not just to save more space on this particular page,
> + * but to keep whole b-tree structure consistent. Subsequent insertions
> + * assume that hikey is already truncated, and so they should not
> + * worry about it, when copying the high key into the parent page
> + * as a downlink.
s/should/need/
> + * NOTE It is not crutial for reliability in present,
s/crutial/crucial/
> + * but maybe it will be that in the future.
> + */
"it's essential" ... "it is not crutial" -- that's contradictory.
> + keytup = index_truncate_tuple(wstate->index, oitup);
The code in _bt_split previously claimed that it's the only place doing
truncation...
> + /* delete "wrong" high key, insert keytup as P_HIKEY. */
> + PageIndexTupleDelete(opage, P_HIKEY);
> + if (!_bt_pgaddtup(opage, IndexTupleSize(keytup), keytup, P_HIKEY))
> + elog(ERROR, "failed to rewrite compressed item in index \"%s\"",
> + RelationGetRelationName(wstate->index));
Hm...
- Andres
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2017-03-30 19:27:06 | Re: WIP: Covering + unique indexes. |
Previous Message | Thomas Munro | 2017-03-30 19:02:35 | Re: strange parallel query behavior after OOM crashes |