From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Check the number of potential synchronous standbys
"张文杰" <757634191(at)qq(dot)com> writes:
> When the number of potential synchronous standbys is smaller than num_sync, such as 'FIRST 3 (1,2)' or 'ANY 4 (1,2,3)' in synchronous_standby_names, the processes will wait for synchronous replication forever.
> Obviously, this is not expected. I think returning false with an error message may be better. Attached is a patch that implements this simple check.
Well, it's not *that* simple; this patch rejects cases like "ANY 2(*)"
which need to be accepted. That causes the src/test/recovery tests
to fail (you should have tried check-world).
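To make the "ANY 2(*)" problem concrete, here is a minimal Python sketch. It is not the PostgreSQL implementation and not the submitted patch: the function names are hypothetical, and the regex covers only a simplified "[FIRST|ANY] num_sync (names)" form of the setting. It contrasts a plain count comparison with one that lets a list containing "*" through.

```python
import re

# Hypothetical, simplified grammar: "[FIRST|ANY] num_sync (name, name, ...)".
# The real GUC parser accepts more forms; this is only an illustration.
_PATTERN = re.compile(
    r"^\s*(?:(FIRST|ANY)\s+)?(\d+)\s*\((.*)\)\s*$", re.IGNORECASE
)


def parse_sync_names(value):
    """Return (method, num_sync, names) for a simplified setting string."""
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"unsupported syntax: {value!r}")
    # A bare "num_sync (list)" is treated as FIRST.
    method = (match.group(1) or "FIRST").upper()
    num_sync = int(match.group(2))
    names = [n.strip() for n in match.group(3).split(",") if n.strip()]
    return method, num_sync, names


def naive_check(value):
    """Overly strict rule: reject whenever num_sync exceeds the name count."""
    _, num_sync, names = parse_sync_names(value)
    return num_sync <= len(names)


def wildcard_aware_check(value):
    """Accept any list containing "*", since "*" may match many standbys."""
    _, num_sync, names = parse_sync_names(value)
    return "*" in names or num_sync <= len(names)
```

Under these assumptions, naive_check("ANY 2(*)") rejects the setting while wildcard_aware_check("ANY 2(*)") accepts it. Both still reject 'FIRST 3 (1,2)'.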
I also observe that there's a test case in 007_sync_rep.pl which is
actually exercising the case you want to reject:
# Check that sync_state of each standby is determined correctly
# when num_sync exceeds the number of names of potential sync standbys
# specified in synchronous_standby_names.
'num_sync exceeds the num of potential sync standbys',
So it can't be said that nobody thought about this at all.
Now, I'm not convinced that this represents a useful use-case as-is.
However, because we can't know how many standbys may match "*",
it's clear that the code has to do something other than just
abort when the situation happens. Conceivably we could fail at
runtime (not GUC parse time) if the number of required standbys
exceeds the number available, rather than waiting indefinitely.
However, if standbys can come online dynamically, a wait involving
"*" might be satisfiable after a while even if it isn't immediately.
On the whole, given the fuzziness around "*", I'm not sure that
it's easy to make this much better.
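The runtime distinction discussed above can be sketched in Python as follows. This is purely illustrative: classify_wait and its three return labels are hypothetical, standby names are compared by exact string match for simplicity, and nothing here reflects how the server's walsender code is actually structured.

```python
def classify_wait(num_sync, names, connected):
    """Classify a synchronous-replication wait (illustrative only).

    num_sync  -- number of synchronous standbys required
    names     -- standby names listed in the setting; may include "*"
    connected -- names of standbys connected right now
    """
    wildcard = "*" in names
    # Connected standbys that the list could select as synchronous.
    matching = {s for s in connected if wildcard or s in names}

    if len(matching) >= num_sync:
        return "satisfied"
    if not wildcard and num_sync > len(set(names)):
        # No wildcard and too few listed names: no future connection helps.
        return "never"
    # More matching standbys could still come online later.
    return "maybe_later"
```

With these assumptions, classify_wait(3, ["1", "2"], ["1", "2"]) is the only kind of case that could safely be failed outright, whereas classify_wait(2, ["*"], ["a"]) has to keep waiting because a second standby may still connect.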
regards, tom lane