Re: patch proposal

From: Venkata B Nagothi <nag1010(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch proposal
Date: 2017-01-27 08:19:04
Message-ID: CAEyp7J9C2t60nS74FFQnRw9CLxxKUAN=Za1RNrjTLH4xvCUqrQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi David,

On Tue, Jan 24, 2017 at 9:22 AM, David Steele <david(at)pgmasters(dot)net> wrote:

> Hi Venkata,
>
> On 11/8/16 5:47 PM, Venkata B Nagothi wrote:
> > Attached is the 2nd version of the patch with some enhancements.
>
> Here's my review of the patch.
>

Thank you very much for reviewing the patch.

1) There are a number of whitespace error when applying the patch:
>
> $ git apply ../other/recoveryTargetIncomplete.patch-version2
> ../other/recoveryTargetIncomplete.patch-version2:158: indent with spaces.
> else
> ../other/recoveryTargetIncomplete.patch-version2:160: space before tab
> in indent.
> ereport(LOG,
> ../other/recoveryTargetIncomplete.patch-version2:166: indent with spaces.
> /*
> ../other/recoveryTargetIncomplete.patch-version2:167: indent with spaces.
> * This is the position where we can
> choose to shutdown, pause
> ../other/recoveryTargetIncomplete.patch-version2:168: indent with spaces.
> * or promote at the end-of-the-wal if the
> intended recovery
> warning: squelched 10 whitespace errors
> warning: 15 lines add whitespace errors.
>
> The build did succeed after applying the patch.
>

Sorry, my bad. The attached patch should fix those errors.

> 2) There is no documentation. Serious changes to recovery are going to
> need to be documented very carefully. It will also help during review
> to determine you intent.
>

Attached patches include the documentation as well.

>
> 3) There are no tests. I believe that
> src/test/recovery/t/006_logical_decoding.pl would be the best place to
> insert your tests.
>

I will be adding the tests in src/test/recovery/t/003_recovery_targets.pl.
My tests would look more or less similar to recovery_target_action. I do
not see any of the tests added for the parameter recovery_target_action ?
Anyways, i will work on adding tests for recovery_target_incomplete.

> 4) I'm not convinced that fatal errors by default are the way to go.
> It's a pretty big departure from what we have now.
>

I have changed the code to generate ERRORs instead of FATALs. Which is in
the patch recoveryStartPoint.patch

> Also, I don't think it's very intuitive how recovery_target_incomplete
> works. For instance, if I set recovery_target_incomplete=promote and
> set recovery_target_time, but pick a backup after the specified time,
> then there will be an error "recovery_target_time %s is older..." rather
> than a promotion.
>

Yes, that is correct and that is the expected behaviour. Let me explain -

a) When you use the parameter recovery_target_time and pick-up a backup
taken after the specified target time, then, recovery will not proceed
further and ideally it should not proceed further.
This is not an incomplete recovery situation either. This is a
situation where recovery process cannot start at all. With this patch,
"recovery_target_time" (similarly recovery_target_xid and
recovery_target_lsn) first checks if the specified target time is later
than the backup's time stamp if yes, then the recovery will proceed a-head
or else it would generate an error,
which is what has happened in your case. The ideal way to deal with
this situation to change the recovery_target_time to a much later time
(which is later than the backup's time)
or use the parameter "recovery_target=immediate" to complete the
recovery and start the database or choose the correct backup (which has a
time stamp prior to the recovery_target_time).

b) recovery_target_incomplete has no effect in the above situation as this
parameter only deals with incomplete recovery. A situation where-in
recovery process stops mid-way (Example : due to missing or corrupt
WALs ) without reaching the intended recovery target (XID, Time and LSN).

c) In real-time environment when a DBA is required to perform PITR to a
particular recovery target (say, recovery_target_time),
it is the DBA who knows till where the database needs to be recovered
and what backup needs to be used for the same. The first thing DBA would do
is, choose the
appropriate backup which is taken prior to the recovery target. This is
the basic first step needed.

It seems to me that there needs to be a separate setting to raise a
> fatal error.
>
> 5) There are really two patches here. One deals with notifying the user
> that replay to their recovery target is not possible (implemented here
> with fatal errors) and the other allows options when their recovery
> target appears to be possible but never arrives.

As far as I can see, at this point, nobody has really signed on to this
> approach. I believe you will need a consensus before you can proceed
> with a patch this disruptive.
>
> A better approach may be to pull the first part out and raise warnings
> instead. This will allow you to write tests and make sure that your
> logic is correct without major behavioral changes to Postgres.
>

This patch is to ensure that the recovery starts only if the specified
recovery target is legitimate and if not, recovery should error out without
replaying any WALs (This was the agreed approach).
"Warnings" may not help much. I have split the patch into two different
pieces. One is to determine if the recovery can start at all and other
patch deals with the incomplete recovery situation.

Please let me know if you have any further comments.

Regards,
Venkata B N

Database Consultant

Attachment Content-Type Size
recoveryStartPoint.patch application/octet-stream 6.9 KB
recoveryTargetIncomplete.patch-version3 application/octet-stream 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-01-27 08:33:57 Re: Radix tree for character conversion
Previous Message Kyotaro HORIGUCHI 2017-01-27 07:36:03 Re: BUG: pg_stat_statements query normalization issues with combined queries