Re: Temporary tables versus wraparound... again

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Noah Misch <noah(at)leadboat(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Temporary tables versus wraparound... again
Date: 2023-04-17 21:40:07
Message-ID: CAM-w4HPSygr7AL+CmmkOFLx4NO41ZAfLoSK4ZG0CDO=SrqvPEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 5 Apr 2023 at 13:42, Andres Freund <andres(at)anarazel(dot)de> wrote:

>
> Somehow it doesn't feel right to use vac_update_relstats() in
> heapam_handler.c.
>
> I also don't like that your patch references
> heapam_relation_nontransactional_truncate in AddNewRelationTuple() - we
> shouldn't add more comments piercing tableam than necessary.

I'm really puzzled because this does look like it was in the last
patch on the mailing list archive. But it's definitely not the code I
have here. I guess I did some cleanup that I never posted, so sorry.

I've attached patches using GetOldestNonRemovableTransactinId() and it
seems to have fixed the race condition here. At least I can't
reproduce it any more.

> Not if you determine a relation specific xmin, and the relation is not a
> shared relation.
>
> ISTM that the problem here really is that you're relying on RecentXmin, rather
> than computing something more accurate. Why not use
> GetOldestNonRemovableTransactionId(rel) - It's a bit more expensive, but I
> don't think it'll matter compared to the cost of truncating the relation.

I am a bit nervous about the overhead here because if your transaction
touched *any* temporary tables then this gets called for *every*
temporary table with ON COMMIT DELETE. That could be a lot and it's
not obvious to users that having temporary tables will impose an
overhead even if they're not actually using them.

So I went ahead and used GetOldestNonRemovableTransactionId and tried
to do some profiling. But this is on a cassert enabled build with -O0
so it's not serious profiling. I can repeat it on a real build if it
matters. But it's been a long time since I've read gprof output. This
is for -F PreCommit_on_commit_actions so the percentages are as a
percent of just the precommit cleanup:

index % time self children called name
0.00 0.00 10102/10102 CommitTransaction (1051)
[1] 100.0 0.01 31.47 10102 PreCommit_on_commit_actions [1]
0.01 31.43 10100/10100 heap_truncate [2]
0.00 0.03 1005050/1005260 lappend_oid [325]
-----------------------------------------------
0.01 31.43 10100/10100 PreCommit_on_commit_actions [1]
[2] 99.9 0.01 31.43 10100 heap_truncate [2]
0.09 27.30 1005050/1005050 heap_truncate_one_rel [3]
0.20 3.57 1005050/6087120 table_open <cycle 1> [465]
0.01 0.22 1005050/6045137 table_close [48]
0.00 0.03 1005050/1017744 lappend [322]
0.01 0.00 10100/10100 heap_truncate_check_FKs [425]
-----------------------------------------------
0.09 27.30 1005050/1005050 heap_truncate [2]
[3] 87.0 0.09 27.30 1005050 heap_truncate_one_rel [3]
0.02 12.23 1005050/1005050 RelationTruncateIndexes [5]
0.06 10.08 1005050/1005050 ResetVacStats [7]
0.03 4.89 1005050/1005050
table_relation_nontransactional_truncate [12]

I think this is saying that more than half the time is being spent
just checking for indexes. There were no indexes on these temporary
tables. Does not having any indexes cause the relcache treat it as a
cache miss every time?

0.06 10.08 1005050/1005050 heap_truncate_one_rel [3]
[7] 32.2 0.06 10.08 1005050 ResetVacStats [7]
0.02 3.83 1005050/1005250 SearchSysCacheCopy [16]
0.20 3.57 1005050/6087120 table_open <cycle 1> [465]
0.01 2.02 1005050/1005050 heap_inplace_update [35]
0.01 0.22 1005050/6045137 table_close [48]
0.00 0.20 1005050/1005150
GetOldestNonRemovableTransactionId [143]
0.00 0.01 1005050/1005150 GetOurOldestMultiXactId [421]
0.00 0.00 1005050/1008750 ObjectIdGetDatum [816]

I guess this means GetOldestNonRemovableTransactionId is not the main
cost in ResetVacStats though I don't understand why the syscache would
be so slow.

I think there's a facility for calculating the Horizons and then
reusing them for a while but I don't see how to use that here. It
would be appropriate I think.

>
> > Honestly I'm glad I wrote the test because it was hard to know whether
> > my code was doing anything at all without it (and it wasn't in the
> > first cut...) But I don't think there's much value in having it be in
> > the regression suite. We don't generally write tests to ensure that a
> > specific internal implementation behaves in the specific way it was
> > written to.
>
> To me it seems important to test that your change actually does what it
> intends to. Possibly the test needs to be relaxed some, but I do think we want
> tests for the change.
>
> Greetings,
>
> Andres Freund

--
greg

Attachment Content-Type Size
v9-0003-Add-test-for-truncating-temp-tables-advancing-rel.patch text/x-patch 6.1 KB
v9-0001-Add-warnings-when-old-temporary-tables-are-found-.patch text/x-patch 9.0 KB
v9-0002-Update-relfrozenxmin-when-truncating-temp-tables.patch text/x-patch 7.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-04-17 21:44:10 Re: Direct I/O
Previous Message Nathan Bossart 2023-04-17 21:21:41 Re: Clean up hba.c of code freeing regexps