Re: WAL format and API changes (9.5)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: WAL format and API changes (9.5)
Date: 2014-10-30 16:02:07
Message-ID: 20141030160207.GA12193@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-10-29 10:24:20 +0200, Heikki Linnakangas wrote:
> On 10/06/2014 02:29 PM, Andres Freund wrote:
> >On 2014-10-06 14:19:39 +0300, Heikki Linnakangas wrote:
> >>Barring objections, I'll commit this, and then continue benchmarking the
> >>second patch with the WAL format and API changes.
> >
> >I'd like to have a look at it beforehand.
>
> Ping? Here's an rebased patch. I'd like to proceed with this.

Doing so.

> >I've not yet really looked,
> >but on a quick readthrough XLogInsertRecData() staying in xlog.c doesn't
> >make me happy...
>
> Ok.. Can you elaborate?

To me the split between xloginsert.c doing some of the record assembly,
and xlog.c doing the lower level part of the assembly is just wrong.

And it leads to things like the already complicated logic to retry after
detecting missing fpws is now split across two files seems to confirm
that. What happens right now is that XLogInsert() (with some helper
routines) assembles the record. Then hands that off to
XLogInsertRecData(). Which sometimes returns InvalidXLogRecPtr, and
returns back to XLogInsert(), which reverses *some* of its work and then
retries. Brr.

Similary the 'page_writes_omitted' logic doesn't make me particularly
happy. Previously we retried when there actually was a page affected by
the different RedoRecPtr. Now we do it as soon as our copy of RedoRecPtr
is out of date? Won't that quite often spuriously trigger retries? Am I
missing something?
Arguably this doesn't happen often enough to matter, but it's still
something that we should explicitly discuss.

The implementation of the split seems to change the meaning of
TRACE_POSTGRESQL_XLOG_INSERT - it's now counting retries. I don't
particularly care about those tracepoints, but I see no simplification
due to changing its meaning.

Aside: In C11 function local statics become a bit more expensive - they
essentially require atomics and/or spinlocks on the first few calls.

Next thing: The patch doesn't actually compile. Misses an #include
storage/proc.h for MyPgXact.

I'm not a big fan of the naming for the new split. We have
* XLogInsert() which is the external interface
* XLogRecordAssemble() which builds the rdata chain that will be
inserted
* XLogInsertRecData() which actually inserts the data into the xl buffers.

To me that's pretty inconsistent.

I'm also somewhat confused about the new split of the headers. I'm not
adverse to splitting them, but it's getting a bit hard to understand:
* xlog.h - generic mishmash
* xlogdefs.h - Three typedefs and some #defines
* xlog_fn.h - SQL functions
* xlog_internal.h - Pretty random collection of stuff
* xlogutils.h - "Utilities for replaying WAL records"
* xlogreader.h - "generic XLog reading facility"
* xloginsert.h - "Functions for generating WAL records"
* xlogrecord.h - "Definitions for the WAL record format."

Isn't that a bit too much? Pretty much the only files that have a
recognizable separation to me are xlog_fn.h and xlogreader.h.

You've removed the
- *
- * NB: this routine feels free to scribble on the XLogRecData structs,
- * though not on the data they reference. This is OK since the
XLogRecData
- * structs are always just temporaries in the calling code.
comment, but we still do, no?

While not this patches blame, I see currently we finish the critical
section in XLogInsert/XLogInsertRecData pretty early:
END_CRIT_SECTION();

/*
* Update shared LogwrtRqst.Write, if we crossed page boundary.
*/
if (StartPos / XLOG_BLCKSZ != EndPos / XLOG_BLCKSZ)
{
SpinLockAcquire(&XLogCtl->info_lck);
/* advance global request to include new block(s) */
if (XLogCtl->LogwrtRqst.Write < EndPos)
XLogCtl->LogwrtRqst.Write = EndPos;
/* update local result copy while I have the chance */
LogwrtResult = XLogCtl->LogwrtResult;
SpinLockRelease(&XLogCtl->info_lck);
}
I don't think this is particularly critical, but it still looks wrong to me.

Hm. I think that's what I see for now.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-10-30 17:27:25 Re: TAP test breakage on MacOS X
Previous Message Andres Freund 2014-10-30 15:54:25 Re: TAP test breakage on MacOS X