Re: wal_dump output on CREATE DATABASE

From: Jean-Christophe Arnu <jcarnu(at)gmail(dot)com>
To: peter(dot)eisentraut(at)2ndquadrant(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: wal_dump output on CREATE DATABASE
Date: 2018-11-05 14:37:06
Message-ID: CAHZmTm3=8v+H75UGUO0R0N3arA520H0srS_hGVV8btnTUDKkkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le dim. 4 nov. 2018 à 18:01, Jean-Christophe Arnu <jcarnu(at)gmail(dot)com> a
écrit :

> Le ven. 2 nov. 2018 à 08:37, Peter Eisentraut <
> peter(dot)eisentraut(at)2ndquadrant(dot)com> a écrit :
>
>> On 26/10/2018 15:53, Jean-Christophe Arnu wrote:
>> > Exemple on CREATE DATABASE (without defining a template database) :
>> > rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn:
>> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1/1663 to 16384/1663
>> >
>> > It comes out (to me) it may be more consistent to use the same template
>> > than the other operations in pg_waldump.
>> > I propose to swap the parameters positions for the copy dir operation
>> > output.
>> >
>> > You'll find a patch file included which does the switching job :
>> > rmgr: Database len (rec/tot): 42/ 42, tx: 568, lsn:
>> > 0/01865790, prev 0/01865720, desc: CREATE copy dir 1663/1 to 1663/16384
>>
>> I see your point. I suspect this was mainly implemented that way
>> because that's how the fields occur in the underlying
>> xl_dbase_create_rec structure. (But that structure also has the target
>> location first, so it's not entirely consistent that way either.) We
>> could switch the fields around in the output, but I wonder whether we
>> couldn't make the whole thing a bit more readable like this:
>>
>> desc: CREATE copy dir ts=1663 db=1 to ts=1663 db=16384
>>
>> or maybe like this
>>
>> desc: CREATE copy dir (ts/db) 1663/1 to 1663/16384
>>
>
>
> Thank you Peter for your review and proposal .
> I like the last one which gives the fields semantics in a concise way.
> Pushing the idea a bit farther we could think of applying that description
> to any other operation that involves the ts/db/oid fields. What do you
> think ?
>
> Example in nbtdesc.c we could add the "(ts/db/oid)" information to help
> the DBA to identify the objects:
> case XLOG_BTREE_REUSE_PAGE:
> {
> xl_btree_reuse_page *xlrec = (xl_btree_reuse_page *) rec;
>
> appendStringInfo(buf, "rel (ts/db/rel) %u/%u/%u;
> latestRemovedXid %u",
> xlrec->node.spcNode, xlrec->node.dbNode,
> xlrec->node.relNode,
> xlrec->latestRemovedXid);
> break;
> }
>
> This may help DBAs to better identify the objects related to the messages.
>
> Any thought/comments/suggestions ?
>
> ~~~ Side story
> Well to be clear, my first proposal is born while i was writting a side
> script to textually identify which objects were involved into pg_waldump
> operations (if that objects already exists at script run time). I'm
> wondering whether it could be interesting to incllde such a "textual
> decoding" into pg_waldump or not (as a command line switch). Anyway, my
> script would be available as proof of concept first. We have time to
> discuss that last point in another thread.
> ~~~
>
> Thank you
>
>>
I've dug a little more in that way and spotted different locations in the
code where such semantics might be useful too.
Sometimes just like in XLOG_HEAP_TRUNCATE (heapdesc.c), there's a bunch of
rels and descendants that are held by a single db. Adding the database id
may be useful in that case. Decoding tablespace for each object may be
interesting (but not mandatory IMHO) for each rel. That information does
not seem to be included in the structure, but as newcomer I assume there's
a convenient way to retrieve that information easily.

Another operation that might be eligible to end user message improvements
is the one handled by xact_desc_commit() function (xactdesc.c) . Each time
a COMMIT (XLOG_XACT_COMMIT or XLOG_XACT_COMMIT_PREPARED) occurs the
folllowing snippets is output :

desc: COMMIT 2018-11-05 15:11:03.087546 CET; rels: base/16384/16385
base/16384/16388 base/16384/16390;

in that case, file path is output using relpathperm macro which ends up
callin gthe GetRelationPath() function. In that last function, we can have
the dbNode, spcNode (tablespace) and relNode Oid (include/common/relpath.h )

The question is now to know if it would be interesting to have a consistent
way to represent all objects and their hierarchy :
BTW, we could have db/ts/rel or ts/db/rel ?
What about the "base/ts/rel" from XLOG_XACT_COMMIT/COMMIT_PREPARED ? Why is
it different from the other representation (it must serve a purpose I
assume) ?
How do we represent multiples rels such as XLOG_HEAP_TRUNCATE ? My proposal
(db/rels) dboid/ reloid1 reloid2 reloid3 ... reloidN (as TRUNCATE only
deals with one DB, but no tablespace is defined, this may be another point
?)

Well, if you could give me some directions, I would appreciate !

As ever, any thought, comments are more than welcomed.

--
Jean-Christophe Arnu

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-11-05 14:51:06 Re: zheap: a new storage format for PostgreSQL
Previous Message Peter Eisentraut 2018-11-05 14:08:33 Re: Reduce maintenance burden of alternative output files with \if \quit