Re: Hot Standby, release candidate?

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hot Standby, release candidate?
Date: 2009-12-14 11:07:32
Message-ID: 1260788852.1955.580.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the further review, much appreciated.

On Mon, 2009-12-14 at 11:54 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Enclose latest version of Hot Standby. This is the "basic" patch, with
> > no must-fix issues and no known bugs. Further additions will follow
> > after commit, primarily around usability, which will include additional
> > control functions for use in testing. Various thoughts discussed here
> > http://wiki.postgresql.org/wiki/Hot_Standby_TODO
>
> I still consider it highly important to be able to start standby from a
> shutdown checkpoint. If you remove it, at the very least put it back on
> the TODO.

Happy to put it back on TODO, but I'm not likely to do this without a
good reason. IMHO your arguments as to why that is useful were not
convincing, especially when it introduces further complications and
requirements for tests.

> But as it is, StandbyRecoverPreparedTransactions() is dead code, and the
> changes to PrescanPreparedTransactions() are not needed either.
>
> > Patch now includes a set of regression tests that can be run against a
> > standby server using "make standbycheck"
>
> Nice!
>
> > (requires setup, see src/test/regress/standby_schedule).
>
> Any chance of automating that?

Future, yes.

I view standbycheck as similar to installcheck - it requires some setup
before it can run, so allows you to test an existing server. I see the
same need here.

> > Complete with full docs and comments.
> >
> > Barring resolving a few points and subject to even more testing, this is
> > the version I expect to commit to CVS on Wednesday. I would appreciate
> > further objective testing before commit, if possible.
>
> * Please remove any spurious whitespace. "git diff --color" makes them
> stand out like a sore thumb, in red. (pgindent will fix them but always
> better to fix them before committing, IMO).

What is your definition of spurious whitespace? I removed all
additions/deletions of individual lines. git diff colours many things
red, and it seems like a waste of time to spend hours fiddling with
spaces manually if there is a utility that does this anyway.

> * vacuum_defer_cleanup_age is very hard to tune. You'll need an estimate
> on your transaction rate to begin with. Do we really want this setting
> in its current form? Does it make sense as PGC_USERSET, as if one
> backend uses a lower setting than others, that's the one that really
> determines when transactions are killed in the standby? I think this
> should be discussed and implemented as a separate patch.

All the vacuum_*_age parameters have this characteristic, yet they
exist.

I would like to provide simple features for conflict avoidance in the
first alpha release. If we find that nobody found it useful then it can
be removed easily enough.

It's a USERSET now, but it could also be other things. This patch isn't
the end of discussion, for many people it will be the start.

> * Are you planning to remove the recovery_connections setting before
> release? The documentation makes it sound like it's a temporary hack
> that we're not really sure is needed at all. That's not very comforting.

It has been requested and I agree, so its there. Saying it might be
removed in future is no more than we do elsewhere and AFAIK we all hope
it will be. Not sure why that is or isn't comforting.

> * You removed this comment from KnownAssignedXidsInit:
>
> - /*
> - * XXX: We should check that we don't exceed maxKnownAssignedXids.
> - * Even though the hash table might hold a few more entries than that,
> - * we use fixed-size arrays of that size elsewhere and expected all
> - * entries in the hash table to fit.
> - */
>
> but AFAICS you didn't address the issue. It's referring to the 'xids'
> array in TransactionIdIsInProgress(), which KnownAssignedXidsGet() fills
> in without checking that it fits.

I have ensured that they are always the same size, by definition, so no
need to check.

> * LockAcquireExtended needs a function comment. Or at least something
> explaining what report_memory_error does. And perhaps the argument
> should be named "reportMemoryError" to be in line with the other arguments.

OK

> * We tend to not add remarks about authors in code (comments in standby.c).

OK

> * This optimization in GetConflictingVirtualXIDs():
>
> > + /*
> > + * If we don't know the TransactionId that created the conflict, set
> > + * it to latestCompletedXid which is the latest possible value.
> > + */
> > + if (!TransactionIdIsValid(limitXmin))
> > + limitXmin = ShmemVariableCache->latestCompletedXid;
> > +
>
> needs a lot more explanation. It took me a very long time to figure out
> why using latest completed xid is safe.

OK. Took me a long time as well.

> * Are you going to leave the trace_recovery GUC in?

For now, at least. I have a later proposal around that to follow.

> * Can you merge with CVS HEAD, please? There's some merge conflicts.

Huh? I did. And tested patch against a CVS checkout before submitting.
Can you explain further?

> > Last remaining points
> >
> > * NonTransactionalInvalidation logging has been removed following
> > review, but AFAICS that means VACUUM FULL doesn't work correctly on
> > catalog tables, which regrettably will be the only ones still standing
> > even after we apply VFI patch. Did I misunderstand the original intent?
> > Was it just buggy somehow? Or is this hoping VF goes completely, which
> > seems unlikely in this release. Just noticed this, decided better to get
> > stuff out there now.
>
> I removed it in the hope that VF is gone before beta.

OK

> > * Are we OK with using hash indexes in standby plans, even when we know
> > the indexes are stale and could return incorrect results?
>
> It doesn't seem any more wrong than using hash indexes right after
> recovery, crash recovery or otherwise. It's certainly broken, but I
> don't see much value in a partial fix. The bottom line is that hash
> indexes should be WAL-logged.

I know that's your thought; I'm just checking its everyone else's as
well. We go to great lengths elsewhere in the patch to avoid queries
returning incorrect results and there is a loss of capability as a
result. I don't want Hash index users to view this as a feature. I don't
feel too strongly, but it can be argued both ways, at least.

--
Simon Riggs www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2009-12-14 11:07:44 Re: Hot Standby, deferred conflict resolution for cleanup records (v2)
Previous Message Fred Janon 2009-12-14 11:04:07 pgAdmin III: timestamp displayed in what time zone?