Re: [GSoC2014] Patch ALTER TABLE ... SET LOGGED

From: Christoph Berg <cb(at)df7cb(dot)de>
To: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [GSoC2014] Patch ALTER TABLE ... SET LOGGED
Date: 2014-07-08 19:53:00
Message-ID: 20140708195300.GD10133@msg.df7cb.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

here's my review for this patch.

Generally, the patch addresses a real need, tables often only turn
out to be desired as unlogged if there are performance problems in
practice, and the other way round changing an unlogged table to logged
is way more involved manually than it could be with this patch. So
yes, we want the feature.

I've tried the patch, and it works as desired, including tab
completion in psql. There are a few issues to be resolved, though.

Re: Fabrízio de Royes Mello 2014-06-26 <CAFcNs+pyV0PBjLo97OSyqV1yQOC7s+JGvWXc8UnY5jSRDwy45A(at)mail(dot)gmail(dot)com>
> As part of GSoC2014 and with agreement of my mentor and reviewer (Stephen
> Frost) I've send a complement of the first patch to add the capability to
> change a regular table to unlogged.

(Fwiw, I believe this direction is the more interesting one.)

> With this patch we finish the main goals of the GSoC2014 and now we'll work
> in the additional goals.

... that being the non-WAL-logging with wal_level=minimal, or more?

> diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
> index 69a1e14..424f2e9 100644
> --- a/doc/src/sgml/ref/alter_table.sgml
> +++ b/doc/src/sgml/ref/alter_table.sgml
> @@ -58,6 +58,7 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
> ENABLE REPLICA RULE <replaceable class="PARAMETER">rewrite_rule_name</replaceable>
> ENABLE ALWAYS RULE <replaceable class="PARAMETER">rewrite_rule_name</replaceable>
> CLUSTER ON <replaceable class="PARAMETER">index_name</replaceable>
> + SET {LOGGED | UNLOGGED}
> SET WITHOUT CLUSTER
> SET WITH OIDS
> SET WITHOUT OIDS

This must not be between the two CLUSTER lines. I think the best spot
would just be one line down, before SET WITH OIDS.

(While we are at it, SET TABLESPACE should probably be moved up to the
other SET options...)

> @@ -432,6 +433,20 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
> </varlistentry>
>
> <varlistentry>
> + <term><literal>SET {LOGGED | UNLOGGED}</literal></term>
> + <listitem>
> + <para>
> + This form change the table persistence type from unlogged to permanent or

This grammar bug pops up consistently: This form *changes*...

There's trailing whitespace.

> + from unlogged to permanent by rewriting the entire contents of the table
> + and associated indexes into new disk files.
> + </para>
> + <para>
> + Changing the table persistence type acquires an <literal>ACCESS EXCLUSIVE</literal> lock.
> + </para>

I'd rewrite that and add a reference:

... from unlogged to permanent (see <xref linkend="SQL-CREATETABLE-UNLOGGED">).
</para>
<para>
Changing the table persistence type acquires an <literal>ACCESS EXCLUSIVE</literal> lock
and rewrites the table contents and associated indexes into new disk files.
</para>

> diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
> index 60d387a..9dfdca2 100644
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -125,18 +125,19 @@ static List *on_commits = NIL;
> * a pass determined by subcommand type.
> */
>
> -#define AT_PASS_UNSET -1 /* UNSET will cause ERROR */
> -#define AT_PASS_DROP 0 /* DROP (all flavors) */
> -#define AT_PASS_ALTER_TYPE 1 /* ALTER COLUMN TYPE */
> -#define AT_PASS_OLD_INDEX 2 /* re-add existing indexes */
> -#define AT_PASS_OLD_CONSTR 3 /* re-add existing constraints */
> -#define AT_PASS_COL_ATTRS 4 /* set other column attributes */
> +#define AT_PASS_UNSET -1 /* UNSET will cause ERROR */
> +#define AT_PASS_DROP 0 /* DROP (all flavors) */
> +#define AT_PASS_ALTER_TYPE 1 /* ALTER COLUMN TYPE */
> +#define AT_PASS_OLD_INDEX 2 /* re-add existing indexes */
> +#define AT_PASS_OLD_CONSTR 3 /* re-add existing constraints */
> +#define AT_PASS_COL_ATTRS 4 /* set other column attributes */
> /* We could support a RENAME COLUMN pass here, but not currently used */
> -#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
> -#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
> -#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
> -#define AT_PASS_MISC 8 /* other stuff */
> -#define AT_NUM_PASSES 9
> +#define AT_PASS_ADD_COL 5 /* ADD COLUMN */
> +#define AT_PASS_ADD_INDEX 6 /* ADD indexes */
> +#define AT_PASS_ADD_CONSTR 7 /* ADD constraints, defaults */
> +#define AT_PASS_MISC 8 /* other stuff */
> +#define AT_PASS_SET_LOGGED_UNLOGGED 9 /* SET LOGGED and UNLOGGED */
> +#define AT_NUM_PASSES 10

This unnecessarily rewrites all the tabs, but see below.

> @@ -376,6 +377,7 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMOD
> static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
> char *cmd, List **wqueue, LOCKMODE lockmode,
> bool rewrite);
> +static void ATPostAlterSetLoggedUnlogged(Oid relid);

(See below)

> static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
> static void TryReuseForeignKey(Oid oldId, Constraint *con);
> static void change_owner_fix_column_acls(Oid relationOid,
> @@ -402,6 +404,13 @@ static void ATExecAddOf(Relation rel, const TypeName *ofTypename, LOCKMODE lockm
> static void ATExecDropOf(Relation rel, LOCKMODE lockmode);
> static void ATExecReplicaIdentity(Relation rel, ReplicaIdentityStmt *stmt, LOCKMODE lockmode);
> static void ATExecGenericOptions(Relation rel, List *options);
> +static void ATPrepSetLogged(Relation rel);
> +static void ATPrepSetUnLogged(Relation rel);
> +static void ATExecSetLogged(Relation rel);
> +static void ATExecSetUnLogged(Relation rel);
> +
> +static void AlterTableSetLoggedCheckForeignConstraints(Relation rel);
> +static void AlterTableSetLoggedOrUnlogged(Relation rel, bool toLogged);

All that should be ordered like in the docs, i.e. after
ATExecDropCluster. (... and the SetTableSpace stuff is already here ...)

> + case AT_SetLogged:
> + case AT_SetUnLogged:
> + ATSimplePermissions(rel, ATT_TABLE);
> + if (cmd->subtype == AT_SetLogged)
> + ATPrepSetLogged(rel); /* SET LOGGED */
> + else
> + ATPrepSetUnLogged(rel); /* SET UNLOGGED */
> + pass = AT_PASS_SET_LOGGED_UNLOGGED;
> + break;

I'm wondering if you shouldn't make a single ATPrepSetLogged function
that takes and additional toLogged argument. Or alternatively get rid
of the if() by putting the code also into case AT_SetLogged.

> @@ -3307,6 +3327,9 @@ ATRewriteCatalogs(List **wqueue, LOCKMODE lockmode)
> ATPostAlterTypeCleanup(wqueue, tab, lockmode);
>
> relation_close(rel, NoLock);
> +
> + if (pass == AT_PASS_SET_LOGGED_UNLOGGED)
> + ATPostAlterSetLoggedUnlogged(RelationGetRelid(rel));

This must be done before relation_close() which releases all locks.

Moreover, I think you can get rid of that extra PASS here.
AT_PASS_ALTER_TYPE has its own pass because you can alter several
columns in a single ALTER TABLE statement, but you can have only one
SET (UN)LOGGED, so you can to the cluster_rel() directly in
AlterTableSetLoggedOrUnlogged() (unless cluster_rel() is too intrusive
and would interfere with other ALTER TABLE operations in this command,
no idea).

> }
> }
>
> @@ -3526,6 +3549,12 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
> case AT_GenericOptions:
> ATExecGenericOptions(rel, (List *) cmd->def);
> break;
> + case AT_SetLogged:
> + ATExecSetLogged(rel);
> + break;
> + case AT_SetUnLogged:
> + ATExecSetUnLogged(rel);
> + break;

I'd replace ATExecSetLogged and ATExecSetUnLogged directly with
AlterTableSetLoggedOrUnlogged(rel, true/false). The 1-line wrappers
don't buy you anything.

> +static void
> +AlterTableSetLoggedCheckForeignConstraints(Relation rel)
[...]

I can't comment on the quality of this function, but it seems to be
doing its job.

> +/*
> + * ALTER TABLE <name> SET UNLOGGED
> + *
> + * Change the table persistence type from permanent to unlogged by
> + * rewriting the entire contents of the table and associated indexes
> + * into new disk files.
> + *
> + * The ATPrepSetUnLogged function check all precondictions to perform
> + * the operation:
> + * - check if the target table is permanent
> + */
> +static void
> +ATPrepSetUnLogged(Relation rel)
> +{
> + /* check if is an permanent relation */
> + if (rel->rd_rel->relpersistence != RELPERSISTENCE_PERMANENT)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> + errmsg("table %s is not permanent",
> + RelationGetRelationName(rel))));
> +}

Here's the big gotcha: Just like SET LOGGED must check for outgoing
FKs to unlogged tables, SET UNLOGGED must check for incoming FKs from
permanent tables. This is missing.

> +/*
> + * The AlterTableChangeCatalogToLoggedOrUnlogged function perform the
> + * catalog changes, i.e. update pg_class.relpersistence to 'p' or 'u'
> + */
> +static void
> +AlterTableChangeCatalogToLoggedOrUnlogged(Relation rel, Relation relrelation, bool toLogged)
> +{
> + HeapTuple tuple;
> + Form_pg_class pg_class_form;
> +
> + tuple = SearchSysCacheCopy1(RELOID,
> + ObjectIdGetDatum(RelationGetRelid(rel)));
> + if (!HeapTupleIsValid(tuple))
> + elog(ERROR, "cache lookup failed for relation %u",
> + RelationGetRelid(rel));
> +
> + pg_class_form = (Form_pg_class) GETSTRUCT(tuple);
> +
> + Assert(pg_class_form->relpersistence ==
> + ((toLogged) ? RELPERSISTENCE_UNLOGGED : RELPERSISTENCE_PERMANENT));
> +
> + pg_class_form->relpersistence = toLogged ?
> + RELPERSISTENCE_PERMANENT : RELPERSISTENCE_UNLOGGED;
> +
> + simple_heap_update(relrelation, &tuple->t_self, tuple);
> +
> + /* keep catalog indexes current */
> + CatalogUpdateIndexes(relrelation, tuple);
> +
> + heap_freetuple(tuple);
> +}

Again I can't comment on the low-level catalog stuff - though I'd
remove some of those blank lines.

> +/*
> + * The AlterTableSetLoggedOrUnlogged function contains the main logic
> + * of the operation, changing the catalog for main heap, toast and indexes
> + */
> +static void
> +AlterTableSetLoggedOrUnlogged(Relation rel, bool toLogged)
> +{
> + Oid relid;
> + Relation indexRelation;
> + ScanKeyData skey;
> + SysScanDesc scan;
> + HeapTuple indexTuple;
> + Relation relrel;
> +
> + relid = RelationGetRelid(rel);
> +
> + /* open pg_class to update relpersistence */
> + relrel = heap_open(RelationRelationId, RowExclusiveLock);
> +
> + /* main heap */
> + AlterTableChangeCatalogToLoggedOrUnlogged(rel, relrel, toLogged);
> +
> + /* toast heap, if any */
> + if (OidIsValid(rel->rd_rel->reltoastrelid))
> + {
> + Relation toastrel;
> +
> + toastrel = heap_open(rel->rd_rel->reltoastrelid, AccessShareLock);
> + AlterTableChangeCatalogToLoggedOrUnlogged(toastrel, relrel, toLogged);
> + heap_close(toastrel, AccessShareLock);

The comment on heap_open() suggests that you could directly invoke
relation_open() because you know this is a toast table, similarly for
index_open(). (I can't say which is better style.)

> + }
> +
> + /* Prepare to scan pg_index for entries having indrelid = this rel. */
> + indexRelation = heap_open(IndexRelationId, AccessShareLock);
> + ScanKeyInit(&skey,
> + Anum_pg_index_indrelid,
> + BTEqualStrategyNumber, F_OIDEQ,
> + relid);
> +
> + scan = systable_beginscan(indexRelation, IndexIndrelidIndexId, true,
> + NULL, 1, &skey);
> +
> + while (HeapTupleIsValid(indexTuple = systable_getnext(scan)))
> + {
> + Form_pg_index index = (Form_pg_index) GETSTRUCT(indexTuple);
> + Relation indxrel = index_open(index->indexrelid, AccessShareLock);
> +
> + AlterTableChangeCatalogToLoggedOrUnlogged(indxrel, relrel, toLogged);
> +
> + index_close(indxrel, AccessShareLock);
> + }

You forgot the TOAST index.

> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index 605c9b4..a784d73 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y

> @@ -2201,6 +2201,20 @@ alter_table_cmd:
> n->def = $3;
> $$ = (Node *)n;
> }
> + /* ALTER TABLE <name> SET LOGGED */
> + | SET LOGGED
> + {
> + AlterTableCmd *n = makeNode(AlterTableCmd);
> + n->subtype = AT_SetLogged;
> + $$ = (Node *)n;
> + }
> + /* ALTER TABLE <name> SET UNLOGGED */
> + | SET UNLOGGED
> + {
> + AlterTableCmd *n = makeNode(AlterTableCmd);
> + n->subtype = AT_SetUnLogged;
> + $$ = (Node *)n;
> + }

Also move up to match the docs/other code ordering.

> index ff126eb..dc9f8fa 100644
> --- a/src/include/nodes/parsenodes.h
> +++ b/src/include/nodes/parsenodes.h
> @@ -1331,7 +1331,9 @@ typedef enum AlterTableType
> AT_AddOf, /* OF <type_name> */
> AT_DropOf, /* NOT OF */
> AT_ReplicaIdentity, /* REPLICA IDENTITY */
> - AT_GenericOptions /* OPTIONS (...) */
> + AT_GenericOptions, /* OPTIONS (...) */
> + AT_SetLogged, /* SET LOGGED */
> + AT_SetUnLogged /* SET UNLOGGED */

Likewise.

> --- a/src/test/regress/expected/alter_table.out
> +++ b/src/test/regress/expected/alter_table.out

Please add test cases for incoming FKs, TOAST table relpersistence,
and TOAST index relpersistence.

Thanks for working on this feature, I'm looking forward to it!

Christoph
--
cb(at)df7cb(dot)de | http://www.df7cb.de/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2014-07-08 19:53:32 Re: tweaking NTUP_PER_BUCKET
Previous Message Tom Lane 2014-07-08 19:47:13 Re: LEFT JOINs not optimized away when not needed