Re: [PATCH] Make pg_basebackup configure and start standby [Review]

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Make pg_basebackup configure and start standby [Review]
Date: 2012-11-18 19:24:25
Message-ID: 50A935E9.6090602@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

2012-11-18 17:20 keltezéssel, Magnus Hagander írta:
> On Tue, Oct 23, 2012 at 5:08 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>> On Oct 23, 2012 4:52 PM, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com> wrote:
>>> Boszormenyi Zoltan escribió:
>>>
>>>>>> Also, the check for conflict between -R and -x/-X is now removed.
>>>> The documentation for option -R has changed to reflect this and
>>>> there is no different error code 2 now: it would make the behaviour
>>>> inconsistent between -Fp and -Ft.
>>>>
>>>>>>> The PQconninfo patch is also attached but didn't change since the
>>>>>>> last mail.
>>> Magnus,
>>>
>>> This patch is all yours to handle. I'm guessing nothing will happen
>>> until pgconf.eu is done and over, but hopefully you can share a few
>>> beers with Zoltan over the whole subject (and maybe with Peter about the
>>> PQconninfo stuff?)
>>>
>>> I'm not closing this just yet, but if you're not able to handle this
>>> soon, maybe it'd be better to punt it to the November commitfest.
>> It's on my to do list for when I get back, but correct, won't get to it
>> until after the conference.
> I finally got around to looking at this patch now. Sorry about the way
> too long delay.

No problem, thanks for looking at it.

> A few thoughts:
>
> First, on the libpq patch:
>
> I'm not sure I like the for_replication flag to PQconninfo(). It seems
> we're it's a quite limited API, and not very future proof. What's to
> say that an app would only be interested in filtering based on
> for_replication? One idea might be to have a bitmap field there, and
> assign *all* conninfo options to a category. We could then have
> categories for NORMAL and REPLICATION. But we could also for example
> have a category for PASSWORD (or similar), so that you could get with
> and without those. Passing in a 32-bit integer would allow us to have
> 32 different categories, and we could then use a bitmask to pick
> things out.
>
> It might sound a bit like overengineering, but it's also an API and
> it's a PITA to change it in the future if more needs show up..

You are right, I will change this accordingly.

> Second, I wonder if we really need to add the code for requiressl=1,
> or if we should just remove it. The spelling requiressl=1 was
> deprecated back in 7.4 - which has obviously been out of support for a
> long time now.

This needs opinions from more people, I am not the one to decide it.
The code would be definitely cleaner without processing this extra
non-1:1 mapping.

> Third, in fillPGconn. If mapping has both conninfoValue and connvalue,
> it does a free() on the old value in memberaddr, but if it doesn't it
> just overwrites memberaddr with strdup(). Shouldn't those two paths be
> the same, meaning shouldn't the if (*memberaddr) free(*memberaddr);
> check be outside the if block?

Yes, and set it to NULL too. Otherwise there might be a case when
the free() leaves a stale pointer value if the extra mapping fails
the strcmp() check. This is all unnecessary if the extra mapping
for requiressl=1 is removed, the code would be straight.

> Fourth, I'm not sure I like the "memberaddr" term. It's not wrong, but
> it threw me off a couple of times while reading it. It's not all that
> important, and I'm not sure about another idea for it though - maybe
> just "connmember"?

I am not attached to this variable name, I will change it.

> Then, about the pg_basebackup patch:
>
> What's the reason for the () around 512 for TARCHUNKSZ?

It's simply a habit, to not forget it for more complex macros.

> We have a lot of hardcoded entries of the 512 for tar in that file. We
> should either keep using 512 as a constant, or change all those to
> *also* use the #define. Right now, the code will obviously break if
> you change the #define (for example, it compares to 512, but then uses
> hdrleft = TARCHUNKSZ - tarhdrsz; to do calculation).

Yes, I left 5 pieces of the hardcoded value of 512, because
I (maybe erroneously) distinguished between a file header
and file "chunks" inside a TAR file, they are all 512.

Is it okay to change every hardcoded 512 to TARCHUNKSZ,
maybe adding a comment to the #define that it must not
be modified ever?

> The name choice of "basebackup" for the bool in ReceiveTarFile() is
> unfortunate, since we use the term base backup to refer to the
> complete thing, not just the main tablespace. Probably
> "basetablespcae" instead. And it should then be assigned at the top of
> the function (to the result of PQgetisnull()), and used in the main
> if() branch as well.

Will change it.

> Should we really print the status message even if not in verbose mode?
> We only print the "base backup complete" messages in verbose mode, for
> example.

OK.

> It seems wrong to free() recoveryconf in ReceiveTarFile(). It's
> allocated globally at the beginning. While that loop should only be
> called once (since only one tablespace can be the main one), it's a
> confusing location for the free.
>
> The whole tar writing part of the code needs a lot more comments. It's
> entirely unclear what the code does there. Why does the recovery.conf
> writing code need to be split up in multiple locations inside
> ReceiveTarFile(), for example? It either needs to be simplified, or
> explained why it can't be simplified in comments.
>
> _tarCreateHeader() is really badly named, since it specifically
> creates a tar header for recovery.conf only. Either that needs to be a
> parameter, or it needs to have a name that indicates this is the only
> thing it does. The former is probably better.
>
> Much of the tar stuff is very similar (I haven't looked to see if it's
> identical) to the stuff in backend/replication/basebackup.c. Perhaps
> we should have a src/port/tarutil.c?

I copied the tar stuff from bin/pg_dump/pg_backup_tar.c,
so there are at least two copies of this already.
I will look into unifying them.

> escape_string() - already exists as escape_quotes() in initdb, AFAICT.
> We should either move it to src/port/, or at least copy it so it's
> exactly the same.

OK. I can copy it, too. ;-)

> CreateRecoveryConf() should just use PQExpBuffer (in libpq), I think -
> that does away with a lot of code. We already use this from e.g.
> pg_dump, so there's a precedent for using internal code from libpq in
> frontends.

OK.

> Again, my apologies for this review taking so long. I will try to be
> more attentive to the next round :S

No problem. I will try to update the patches according to
your comments as soon as possible.

Thanks and best regards,
Zoltán Böszörményi

>
> --
> Magnus Hagander
> Me: http://www.hagander.net/
> Work: http://www.redpill-linpro.com/
>
>

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-11-18 19:57:51 Avoiding overflow in timeout-related calculations
Previous Message Jeff Davis 2012-11-18 17:13:19 Re: Do we need so many hint bits?