Re: pg_basebackup and wal streaming

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup and wal streaming
Date: 2011-02-20 20:37:42
Message-ID: m2d3mmh27d.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> Better late than never (or?), here's the final cleanup of
> pg_streamrecv for moving into the main distribution, per discussion
> back in late dec or early jan. It also includes the "stream logs in
> parallel to backup" part that was not completed on pg_basebackup.

And that's something I've been so interested in! It's only fair game
that I spend time reviewing after my insisting for having it :)

The programs (pg_basebackup and pg_receivexlog) both work as expected,
and show in the pg_stat_replication system view.

pg_basebackup -x option should be revised so that it's easier to make
the difference between streaming WAL while the base backup is ongoing
and fetching them at the end, taking the risk to corrupt the whole
backup as soon as wal_keep_segments is undersized.

-x, --xlog[=stream] include required WAL files in backup

It could be --xlog=stream|fetch or something that reads better.

The sent patch includes a binary called pg_receivexlog\(X\), but Magnus
told me on IRC that this is fixed already in his branch (a missing $ at
several places in the Makefile).

Now, on to the code itself.

I wonder if the segment_callback() routine would better be a while{}
loop rather than a recursive construct. Also, it looks like a lib
function but it's doing exit(1)…

Unfortunately I can't comment (or won't risk learning enough details
tonight to try to be smart here) on FindStreamingStart() implementation,
that seems crucial.

> Other than that, the only changes to pg_basebackup are the moving of a
> couple of functions into streamutil.c to make them usable from both,
> and the progress format output fix Fujii-san mentioned recently.

Check.

> Should be complete except for Win32 support (needs thread/fork thing
> for the background streaming feature. Shouldn't be too hard, and I
> guess that falls on me anyway..) and the reference documentation.

Yeah, StartLogStreamer() is still using fork() at the moment, I guess
you will have to change that prior to commit. Maybe you can reuse the
code found in src/bin/pg_dump/pg_backup_archiver.c, spawn_restore.

I can't wait to see the new streaming replication setup docs using that
integrated tool. Even if baring another step here, we still need to
rely on wal_keep_segments (for switching from the base backup to the
live standby), the critical window is so much reduced… and it's now
possible to prepare the standby using a single integrated command line.

Will the server refrain from recycling a WAL file when all receivers
sent_location are not known to be past the positions contained in it?
If that's the case, the documentation should talk about pg_receivexlog
as an alternative to archiving, relying on libpq. It that's not the
case, is there a good reason for that not being the case? (even if
that's not on this patch to fix that).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-02-20 21:09:11 Re: using a lot of maintenance_work_mem
Previous Message Tom Lane 2011-02-20 19:56:00 Re: Fix for fuzzystrmatch