Re: Refactoring the checkpointer's fsync request queue

From: Shawn Debnath <sdn(at)amazon(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring the checkpointer's fsync request queue
Date: 2019-01-31 05:59:38
Message-ID: 20190131055938.GA52540@f01898859afd.ant.amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I (finally) got a chance to go through these patches and they look
great. Thank you for working on this! Few comments:

- I do not see SmgrFileTag being defined or used like you mentioned in
your first email. I see RelFileNode still being used. Is this planned
for the future?
- Would be great to add a set of Tests for SimpleVector.

> For the 0001 patch, I'll probably want to reconsider the naming a it
> ("simple -> "specialized", "generic", ...?)

I think the name SimpleVector is fine, fits with the SimpleHash theme.
If the goal is to shorten it, perhaps PG prefix would suffice?

> 4. The protocol for forgetting relations etc is slightly different:
> if a file is found to be missing, AbsortFsyncRequests() and then probe
> to see if the segment number disappeared from the set (instead of
> cancel flags), though I need to test this case.

Can you explain this part a bit more? I am likely missing something in
the patch.

> I couldn't resist the urge to try porting pg_qsort() to this style.
> It seems to be about twice as fast as the original at sorting integers
> on my machine with -O2. I suppose people aren't going to be too
> enthusiastic about yet another copy of qsort in the tree, but maybe
> this approach (with a bit more work) could replace the Perl code-gen
> for tuple sorting. Then the net number of copies wouldn't go up, but
> this could be used for more things too, and it fits with the style of
> simplehash.h and simplevector.h. Thoughts?

+1 for avoiding duplicate code. Would it be acceptable to migrate the
rest of the usages to this model over time perhaps? Love to move this
patch forward.

I wonder if it might be better to introduce two different functions
catering to the two different use cases for forcing an immediate sync:

- sync a relation
smgrimmedsyncrel(SMgrRelation, ForkNumber)
- sync a specific segment
smgrimmedsyncseg(SMgrRelation, ForkNumber, SegmentNumber)

This will avoid having to specify InvalidSegmentNumber for majority of
the callers today.

--
Shawn Debnath
Amazon Web Services (AWS)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arseny Sher 2019-01-31 06:21:59 Too rigorous assert in reorderbuffer.c
Previous Message Amit Kapila 2019-01-31 05:41:35 Re: WIP: Avoid creation of the free space map for small tables