Re: logical changeset generation v6.7

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: logical changeset generation v6.7
Date: 2013-12-05 16:45:51
Message-ID: 20131205164551.GD3866@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2013-12-05 22:03:51 +0900, Kyotaro HORIGUCHI wrote:
> > > - You assined HeapTupleGetOid(tuple) into relid to read in
> > > several points but no modification. Nevertheless, you left
> > > HeapTupleGetOid not replaced there. I think 'relid' just for
> > > repeated reading has far small merit compared to demerit of
> > > lowering readability. You'd be better to make them uniform in
> > > either way.
> >
> > It's primarily to get the line lengths halfway under control.
>
> Mm. I'm afraid I couldn't catch your words, do you mean that
> IsSystemClass() or isTempNamespace() could change the NULL bitmap
> in the tuple?

Huh? No. I just meant that the source code lines are longer if you use
"HeapTupleGetOid(tuple)" instead of just "relid". Anway, that patch has
since been committed...

> > > ===== 0002:
> > >
> > > - You are identifying the wal_level with the expr 'wal_level >=
> > > WAL_LEVEL_LOGICAL' but it seems somewhat improper.
> >
> > Hm. Why?
>
> It actually does no harm and somewhat trifling so I don't assert
> you should fix it.
>
> The reason for the comment is the greater values for wal_level
> are undefined at the moment, so strictly saying, such values
> should be handled as invalid ones.

Note that other checks for wal_level are written the same way. Consider
how much bigger this patch would be if every usage of wal_level would
need to get changed because a new level had been added.

> > > - RelationIsAccessibleInLogicalDecoding and
> > > RelationIsLogicallyLogged are identical in code. Together with
> > > the name ambiguity, this is quite confising and cause of
> > > future misuse between these macros, I suppose. Plus the names
> > > seem too long.
> >
> > Hm, don't think they are equivalent, rather the contrary. Note one
> > returns false if IsCatalogRelation(), the other true.
>
> Oops, I'm sorry. I understand they are not same. Then I have
> other questions. The name for the first one
> 'RelationIsAccessibleInLogicalDecoding' doesn't seem representing
> what its comment reads.
>
> | /* True if we need to log enough information to have access via
> | decoding snapshot. */
>
> Making the macro name for this comment directly, I suppose it
> would be something like 'NeedsAdditionalInfoInLogicalDecoding' or
> more directly 'LogicalDeodingNeedsCids' or so..

The comment talks about logging enough information that it is accessible
- just as the name.

> > > - In heap_insert, the information conveyed on rdata[3] seems to
> > > be better to be in rdata[2] because of the scarecity of the
> > > additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
> > > seems to be needed. Also is in heap_multi_insert and
> > > heap_upate.
> >
> > Could you explain a bit more what you mean by that? The reason it's a
> > separate rdata entry is that otherwise a full page write will remove the
> > information.
>
> Sorry, I missed the comment 'so that an eventual FPW doesn't
> remove the tuple's data'. Although given the necessity of removal
> prevention, rewriting rdata[].buffer which is required by design
> (correct?) with InvalidBuffer seems a bit outrageous for me and
> obfuscating the objective of it.

Well, it's added in a separate rdata element. Just as in dozens of other
places.

> > > - In heap_delete, at the point adding replica identity, same to
> > > the aboves, rdata[3] could be moved into rdata[2] making new
> > > type like 'xl_heap_replident'.
> >
> > Hm. I don't think that'd be a good idea, because we'd then need special
> > case decoding code for deletes because the wal format would be different
> > for inserts/updates and deletes.
>
> Hmm. Although one common xl_heap_replident is impractical,
> splitting a logcally single entity into two or more XLogRecDatas
> also seems not to be a good idea.

That's done everywhere. There's basically two reasons to use separate
rdata elements:
* the buffers are different
* the data pointer is different

The rdata chain elements don't survive in the WAL.

> Considering above, looking heap_xlog_insert(), you marked on
> xlrec.flags with XLOG_HEAP_CONTAINS_NEW_TUPLE to signal decoder
> that the record should have tuple data not being removed by fpw.
> This is the same for the update record. So the redoer(?) also can
> distinguish whether the update record contains extra tuple data
> or not.

But it doesn't know the length of the individual records, so knowing
there are two doesn't help.

> On the other hand, the update record after patched is longer by
> sizeof(uint16) regardless of whether 'tuple data' is attached or
> not. I don't know the value of the size in WAL stream, but the
> smaller would be better maybe.

I think that'd make things too complicated without too much gain in
comparison to the savings.

> # Of course, it doesn't matter if the object for OLD can
> # naturally be missing as English.

Well, I think the context makes it clear enough.

> I'm not a native English speaker as you know:-)

Neither am I ;).

> undefining XLOG_HEAP_CONTAINS_OLD and use separte macros type 1
> | if (xlrec->flags & XLOG_HEAP_CONTAINS_OLD_KEY ||
> | xlrec->flags & XLOG_HEAP_CONTAINS_OLD_TUPLE)
> | {
> (I belive this should be optimized by the compiler:-)

It's not about efficiency, imo the other variant looks clearer. And will
continue to work if we add the option to selectively log columns or
such.

> if (RelationIsAccessibleInLogicalDecoding(relation))
> + {
> | rdata_heap_new_cid(&rdata[0], relation, heaptup);
> + xlrec.flags |= XLOG_HEAP_CONTAINS_NEW_CID;
> + }
> + else
> + rdata_void(&rdata[0])
> + rdata[0].next = &(rdata[1]);
>
> xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
> xlrec.target.node = relation->rd_node;
> xlrec.target.tid = heaptup->t_self;
> | rdata[1].data = (char *) &xlrec;
>
> If you don't agree with this, I don't say no more about this.

It's a lot more complex than that. This throws of all kind of size
calculations. and it makes the redo functions more complex - and they
are much more often executed for !CONTAINS_NEW_CID.

> > > - The macro name INDEX_ATTR_BITMAP_KEY should be
> > > INDEX_ATTR_BITMAP_FKEY. And INDEX_ATTR_BITMAP_IDENTITY_KEY
> > > should be INDEX_ATTR_BITMAP_REPLID_KEY or something.
> >
> > But INDEX_ATTR_BITMAP_KEY isn't just about foreign keys... But I agree
> > that INDEX_ATTR_BITMAP_IDENTITY_KEY should be renamed.
>
> Thank you, please remember to do that.

I am not sure it's a good idea anymore, it doesn't really seem to
increase clarity...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2013-12-05 16:51:49 Re: Extension Templates S03E11
Previous Message Peter Eisentraut 2013-12-05 16:39:29 Re: Proof of concept: standalone backend with full FE/BE protocol