Re: remove spurious CREATE INDEX CONCURRENTLY wait

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, James Coleman <jtc331(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: remove spurious CREATE INDEX CONCURRENTLY wait
Date: 2020-11-16 18:24:46
Message-ID: 20201116182446.qcg3o6szo2zookyr@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Fri, Nov 13, 2020 at 09:25:40AM +0900, Michael Paquier wrote:
> On Thu, Nov 12, 2020 at 04:36:32PM +0100, Dmitry Dolgov wrote:
> > Interesting enough, similar discussion happened about vaccumFlags before
> > with the same conclusion that theoretically it's fine to update without
> > holding the lock, but this assumption could change one day and it's
> > better to avoid such risks. Having said that I believe it makes sense to
> > continue with locking. Are there any other opinions? I'll try to
> > benchmark it in the meantime.
>
> Thanks for planning some benchmarking for this specific patch. I have
> to admit that the possibility of switching vacuumFlags to use atomics
> is very appealing in the long term, with or without considering this
> patch, even if we had better be sure that this patch has no actual
> effect on concurrency first if atomics are not used in worst-case
> scenarios.

I've tried first to test scenarios where GetSnapshotData produces
significant lock contention and "reindex concurrently" implementation
with locks interferes with it. The idea I had is to create a test
function that constantly calls GetSnapshotData (perf indeed shows
significant portion of time spent on contended lock), and clash it with
a stream of "reindex concurrently" of an empty relation (which still
reaches safe_index check). I guess it could be considered as an
artificial extreme case. Measuring GetSnapshotData (or rather the
surrounding wrapper, to distinguish calls from the test function from
everything else) latency without reindex, with reindex and locks, with
reindex without locks should produce different "modes" and comparing
them we can make some conclusions.

Latency histograms without reindex (nanoseconds):

nsecs : count distribution
512 -> 1023 : 0 | |
1024 -> 2047 : 10001209 |****************************************|
2048 -> 4095 : 76936 | |
4096 -> 8191 : 1468 | |
8192 -> 16383 : 98 | |
16384 -> 32767 : 39 | |
32768 -> 65535 : 6 | |

The same with reindex without locks:

nsecs : count distribution
512 -> 1023 : 0 | |
1024 -> 2047 : 111345 | |
2048 -> 4095 : 6997627 |****************************************|
4096 -> 8191 : 18575 | |
8192 -> 16383 : 586 | |
16384 -> 32767 : 312 | |
32768 -> 65535 : 18 | |

The same with reindex with locks:

nsecs : count distribution
512 -> 1023 : 0 | |
1024 -> 2047 : 59438 | |
2048 -> 4095 : 6901187 |****************************************|
4096 -> 8191 : 18584 | |
8192 -> 16383 : 581 | |
16384 -> 32767 : 280 | |
32768 -> 65535 : 84 | |

Looks like with reindex without locks is indeed faster (there are mode
samples in lower time section), but not particularly significant to the
whole distribution, especially taking into account extremity of the
test.

I'll take a look at benchmarking of switching vacuumFlags to use
atomics, but as it's probably a bit off topic I'm going to attach
another version of the patch with locks and suggested changes. To which
I have one question:

> Michael Paquier <michael(at)paquier(dot)xyz> writes:

> I think that this should be in its own routine, and that we had better
> document that this should be called just after starting a transaction,
> with an assertion enforcing that.

I'm not sure which exactly assertion condition do you mean?

Attachment Content-Type Size
v4-0001-Rename-vaccumFlags-to-statusFlags.patch text/x-diff 18.1 KB
v4-0002-Avoid-spurious-CREATE-INDEX-CONCURRENTLY-waits.patch text/x-diff 8.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2020-11-16 18:36:51 Re: remove spurious CREATE INDEX CONCURRENTLY wait
Previous Message Tomas Vondra 2020-11-16 17:24:41 Re: planner support functions: handle GROUP BY estimates ?