Re: SimpleLruTruncate() mutual exclusion

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SimpleLruTruncate() mutual exclusion
Date: 2019-11-22 15:32:22
Message-ID: 20191122153222.jeb2jsezhso36obu@localhost
Lists: pgsql-hackers

> On Sun, Nov 17, 2019 at 10:14:26PM -0800, Noah Misch wrote:
>
> Though I did reproduce this bug, I'm motivated by the abstract problem more
> than any particular way to reproduce it. Commit 996d273 inspired me; by
> removing a GetCurrentTransactionId(), it allowed the global xmin to advance at
> times it previously could not. That subtly changed the concurrency
> possibilities. I think safe, parallel SimpleLruTruncate() is difficult to
> maintain and helps too rarely to justify such maintenance. That's why I
> propose eliminating the concurrency.

Sure, I see the point and the possibility for the issue itself, but of
course it's easier to reason about an issue I can reproduce :)

> I wonder about performance in a database with millions of small relations,
> particularly considering my intent to back-patch this. In such databases,
> vac_update_datfrozenxid() can be a major part of the VACUUM's cost. Two
> things work in our favor. First, vac_update_datfrozenxid() runs once per
> VACUUM command, not once per relation. Second, Autovacuum has this logic:
>
>     * ... we skip
>     * this if (1) we found no work to do and (2) we skipped at least one
>     * table due to concurrent autovacuum activity. In that case, the other
>     * worker has already done it, or will do so when it finishes.
>     */
>    if (did_vacuum || !found_concurrent_worker)
>        vac_update_datfrozenxid();
>
> That makes me relatively unworried. I did consider some alternatives:

Btw, I've performed a few experiments with parallel vacuuming of 10^4
small tables that were receiving some small inserts; the results look
like this:

# with patch
# funclatency -u bin/postgres:vac_update_datfrozenxid

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 3        |***                                     |
      1024 -> 2047       : 38       |****************************************|
      2048 -> 4095       : 15       |***************                         |
      4096 -> 8191       : 15       |***************                         |
      8192 -> 16383      : 2        |**                                      |

# without patch
# funclatency -u bin/postgres:vac_update_datfrozenxid

     usecs               : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 5        |****                                    |
      1024 -> 2047       : 49       |****************************************|
      2048 -> 4095       : 11       |********                                |
      4096 -> 8191       : 5        |****                                    |
      8192 -> 16383      : 1        |                                        |

In general it seems that latency with the patch tends to be a bit
higher, but I don't think it's significant.
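
For anyone who wants to put rough numbers on the two histograms, here is
a minimal sketch. The bucket counts are copied from the funclatency
output above; using each bucket's midpoint as its representative latency
is my own assumption and is coarse for power-of-two buckets. The helper
names are illustrative only and come from neither funclatency nor
PostgreSQL.

```python
# Bucket lower bounds (usecs) mapped to call counts, copied from the
# funclatency output. Each bucket covers [low, 2 * low - 1].
WITH_PATCH = {512: 3, 1024: 38, 2048: 15, 4096: 15, 8192: 2}
WITHOUT_PATCH = {512: 5, 1024: 49, 2048: 11, 4096: 5, 8192: 1}


def total_calls(hist):
    """Number of vac_update_datfrozenxid() calls recorded in a histogram."""
    return sum(hist.values())


def midpoint_mean(hist):
    """Estimate mean latency, assuming every call sits at its bucket midpoint."""
    weighted = sum(
        count * (low + (2 * low - 1)) / 2 for low, count in hist.items()
    )
    return weighted / total_calls(hist)


def share_at_or_above(hist, threshold):
    """Fraction of calls that landed in buckets starting at or above threshold."""
    slow = sum(count for low, count in hist.items() if low >= threshold)
    return slow / total_calls(hist)


if __name__ == "__main__":
    for label, hist in (("with patch", WITH_PATCH),
                        ("without patch", WITHOUT_PATCH)):
        print(
            f"{label}: calls={total_calls(hist)}, "
            f"midpoint mean ~{midpoint_mean(hist):.0f} usecs, "
            f"share >= 2048 usecs: {share_at_or_above(hist, 2048):.2f}"
        )
```

Each run recorded only about seventy calls, so these figures are an
indication at best and not a statistically solid comparison.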
