Re: WAL format and API changes (9.5)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: WAL format and API changes (9.5)
Date: 2014-11-13 13:33:44
Message-ID: 5464B338.8070805@vmware.com
Lists: pgsql-hackers

On 11/11/2014 04:42 PM, Amit Kapila wrote:
> I have done some performance testing of this patch using attached
> script and data is as below:
>
> ...
>
> It seems to me that there is a regression of (4 ~ 8%) for small records,
> refer two short fields tests.

Thanks for the testing!

Here's a new version, with big changes again to the record format. Have
a look at xlogrecord.h for the details, but in a nutshell:

1. The overall format is now: XLogRecord, per-block headers, header for
main data portion, per-block data, main data.

2. I removed the xl_len field from XLogRecord and rearranged the fields, to
shrink the XLogRecord struct from 32 to 24 bytes. (Instead, there's a
new 2- or 5-byte header for the "main data", after the block headers.)

3. No alignment padding. (The data chunks are copied to aligned buffers
at replay, so redo functions can still assume aligned access.)
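
As a rough illustration of point 2, here is a minimal sketch of how a 2- or
5-byte header for the main data could be written and read back without any
alignment padding. The function names, the TOY_ID_* marker values and the
"one length byte if it fits, else four" rule are assumptions made up for this
example; xlogrecord.h in the patch is the authoritative definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker bytes; the real values are defined in xlogrecord.h. */
#define TOY_ID_DATA_SHORT 255
#define TOY_ID_DATA_LONG  254

/*
 * Write a main-data header at dst and return its size in bytes:
 * 2 bytes (id + 1-byte length) if the length fits in one byte,
 * otherwise 5 bytes (id + 4-byte length).
 */
size_t
toy_put_main_data_header(unsigned char *dst, uint32_t data_len)
{
    assert(dst != NULL);

    if (data_len <= UINT8_MAX)
    {
        dst[0] = TOY_ID_DATA_SHORT;
        dst[1] = (unsigned char) data_len;
        return 2;
    }

    dst[0] = TOY_ID_DATA_LONG;
    /* The length sits at an unaligned offset, so copy it with memcpy. */
    memcpy(dst + 1, &data_len, sizeof(data_len));
    return 5;
}

/*
 * Read a header written by toy_put_main_data_header. Stores the payload
 * length in *data_len and returns the header size, or 0 for an unknown id.
 */
size_t
toy_get_main_data_header(const unsigned char *src, uint32_t *data_len)
{
    assert(src != NULL && data_len != NULL);

    if (src[0] == TOY_ID_DATA_SHORT)
    {
        *data_len = src[1];
        return 2;
    }
    if (src[0] == TOY_ID_DATA_LONG)
    {
        memcpy(data_len, src + 1, sizeof(*data_len));
        return 5;
    }
    return 0;
}
```

With a layout like this, small records pay only two bytes for the main-data
length, which is where the savings for short records would come from.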

In quick testing, this new WAL format is somewhat more compact than the
9.4 format. That also seems to have more than bought back the
performance regression I saw earlier. Here are results from my laptop,
using the wal-update-testsuite.sh script:

master:

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 two short fields, no change             |     396982984 | 7.73713994026184
 two short fields, no change             |     398531152 | 7.72360110282898
 two short fields, no change             |     397228552 | 7.90237998962402
 two short fields, one changed           |     437108464 | 8.03014206886292
 two short fields, one changed           |     438368456 | 8.17672896385193
 two short fields, one changed           |     437105232 | 7.89896702766418
 two short fields, both changed          |     437100544 | 7.98763203620911
 two short fields, both changed          |     437107032 |  8.0971851348877
 two short fields, both changed          |     437105368 |  8.1279079914093
 one short and one long field, no change |      76552752 | 2.47367906570435
 one short and one long field, no change |      76043608 | 2.54243588447571
 one short and one long field, no change |      76042576 |  2.6014678478241
 ten tiny fields, all changed            |     477221488 | 9.41646003723145
 ten tiny fields, all changed            |     477224080 | 9.37260103225708
 ten tiny fields, all changed            |     477220944 | 9.41951704025269
 hundred tiny fields, all changed        |     180889992 | 4.72576093673706
 hundred tiny fields, all changed        |     180348224 | 4.50496411323547
 hundred tiny fields, all changed        |     181347504 | 4.78004717826843
 hundred tiny fields, half changed       |     180379760 | 4.53589606285095
 hundred tiny fields, half changed       |     181773832 | 4.85075807571411
 hundred tiny fields, half changed       |     180348160 | 4.65349197387695
 hundred tiny fields, half nulled        |     100114832 | 3.70726609230042
 hundred tiny fields, half nulled        |     100116840 | 3.88224697113037
 hundred tiny fields, half nulled        |     100118848 | 4.00612688064575
 9 short and 1 long, short changed       |     108140640 | 2.63146805763245
 9 short and 1 long, short changed       |     108508784 | 2.76349496841431
 9 short and 1 long, short changed       |     108137144 | 2.79056811332703
(27 rows)

wal-format-and-api-changes-9.patch:

                testname                 | wal_generated |     duration
-----------------------------------------+---------------+------------------
 two short fields, no change             |     356865216 | 6.81889986991882
 two short fields, no change             |     356871304 |  7.0333080291748
 two short fields, no change             |     356869520 | 6.62423706054688
 two short fields, one changed           |     356867824 | 7.09969711303711
 two short fields, one changed           |     356866480 | 7.07576990127563
 two short fields, one changed           |     357987080 | 7.25394797325134
 two short fields, both changed          |     396996096 | 7.13484597206116
 two short fields, both changed          |     396990184 | 7.08063006401062
 two short fields, both changed          |     396987192 | 7.04641604423523
 one short and one long field, no change |      70858376 |  2.2726149559021
 one short and one long field, no change |      68024232 | 2.21982789039612
 one short and one long field, no change |      69258192 |  2.4696249961853
 ten tiny fields, all changed            |     396987896 | 8.25723004341125
 ten tiny fields, all changed            |     396983768 | 8.24221706390381
 ten tiny fields, all changed            |     397012600 | 8.60816693305969
 hundred tiny fields, all changed        |     172327416 | 4.57576704025269
 hundred tiny fields, all changed        |     174669320 | 4.52080512046814
 hundred tiny fields, all changed        |     172696944 | 4.65672993659973
 hundred tiny fields, half changed       |     172323720 | 4.57278800010681
 hundred tiny fields, half changed       |     172330232 | 4.63164114952087
 hundred tiny fields, half changed       |     172326864 | 4.74219608306885
 hundred tiny fields, half nulled        |      85597408 | 3.78670310974121
 hundred tiny fields, half nulled        |      84742808 | 3.82968688011169
 hundred tiny fields, half nulled        |      84066936 | 3.86192607879639
 9 short and 1 long, short changed       |     100113080 | 2.54274320602417
 9 short and 1 long, short changed       |     100119440 |  2.4966151714325
 9 short and 1 long, short changed       |     100115960 | 2.63230085372925
(27 rows)
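
To put a number on "somewhat more compact", the wal_generated columns of the
two runs can be compared row by row. A minimal sketch of that arithmetic
follows; the helper name toy_pct_reduction is made up for this example and is
not part of the test script.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Percentage by which the patched run generated less WAL than master.
 * A negative result would mean the patched run generated more.
 */
double
toy_pct_reduction(uint64_t master_bytes, uint64_t patched_bytes)
{
    assert(master_bytes > 0);

    return 100.0 * ((double) master_bytes - (double) patched_bytes)
        / (double) master_bytes;
}
```

For example, toy_pct_reduction(396982984, 356865216), taken from the first
"two short fields, no change" row of each run, comes out at roughly 10%.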

Aside from the WAL record format changes, this patch adds the "decoded
WAL record" infrastructure that we talked about with Andres. XLogReader
now has a new function, DecodeXLogRecord, which parses the block headers
etc. from the WAL record, and copies the data chunks to aligned buffers.
The redo routines are passed a pointer to the XLogReaderState, instead
of the plain XLogRecord, and the redo routines can use macros and
functions defined in xlogreader.h to access the already-decoded WAL record.
The new WAL record format is difficult to parse in a piecemeal fashion,
so it really needs this separate decoding pass to be efficient.
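
The idea of a separate decoding pass can be sketched roughly as follows.
This is a toy model, not the patch's code: ToyReaderState, ToyPayload,
toy_decode_record and toy_redo are hypothetical names, and the "record" here
is just a length byte followed by a payload. The real interface is
XLogReaderState plus the macros and functions in xlogreader.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the reader state handed to redo routines. */
typedef struct ToyReaderState
{
    void       *main_data;      /* malloc'd copy, hence suitably aligned */
    uint32_t    main_data_len;
} ToyReaderState;

/* Hypothetical payload that a redo routine wants to read as a struct. */
typedef struct ToyPayload
{
    uint32_t    blkno;
    uint16_t    offnum;
} ToyPayload;

/*
 * Decoding pass: raw points at a 1-byte length followed by that many
 * payload bytes, at an arbitrary (possibly unaligned) address. The payload
 * is copied into an aligned buffer owned by the state. The state must
 * start out with main_data == NULL. Returns 0 on success, -1 on failure.
 */
int
toy_decode_record(ToyReaderState *state, const unsigned char *raw)
{
    uint32_t    len = raw[0];
    void       *buf = malloc(len > 0 ? len : 1);

    if (buf == NULL)
        return -1;

    memcpy(buf, raw + 1, len);

    free(state->main_data);     /* drop the previously decoded record */
    state->main_data = buf;
    state->main_data_len = len;
    return 0;
}

/*
 * Redo routine: it receives the reader state rather than the raw record,
 * and can access the decoded payload through an ordinary struct pointer
 * because the decoding pass already copied it to aligned memory.
 */
uint32_t
toy_redo(const ToyReaderState *state)
{
    const ToyPayload *payload;

    assert(state->main_data_len >= sizeof(ToyPayload));
    payload = (const ToyPayload *) state->main_data;

    return payload->blkno + payload->offnum;
}
```

The point of the split is that the unaligned on-disk layout is parsed once,
in one place, and every redo routine afterwards works on aligned copies.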

Thoughts on this new WAL record format? I've attached the xlogrecord.h
file here separately for easy reading, if you want to take a quick look
at just that without applying the whole patch.

- Heikki

Attachment                             Content-Type      Size
wal-format-and-api-changes-9.patch.gz  application/gzip  107.8 KB
xlogrecord.h                           text/x-chdr       6.1 KB