Re: WIP: Avoid creation of the free space map for small tables

From: John Naylor <jcnaylor(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Avoid creation of the free space map for small tables
Date: 2018-11-26 10:16:50
Message-ID: CAJVSVGXLk6DqXGgH3KVUJYO0=eMBYkSBDPPuee-YvAHRf5D5Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/24/18, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Nov 23, 2018 at 11:56 AM John Naylor <jcnaylor(at)gmail(dot)com> wrote:
>> On 11/16/18, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> >
>> > 3.
>> > static int fsm_set_and_search(Relation rel, FSMAddress addr, uint16
>> > slot,
>> > - uint8 newValue, uint8 minValue);
>> > + uint8 newValue, uint8 minValue);
>> >
>> > This appears to be a spurious change.
>>
>> 2 and 3 are separated into 0001.
>>
>
> Is the point 3 change related to pgindent? I think even if you want
> these, then don't prepare other patches on top of this, keep it
> entirely separate.

There are some places in the codebase that have spaces after tabs for
no apparent reason (not to line up function parameters or pointer
declarations). pgindent hasn't been able to fix this. If you wish, you
are of course free to apply 0001 separately at any time.

>> Also, we don't quite have a consensus on the threshold value, but I
>> have set it to 4 pages for v8. If this is still considered too
>> expensive (and basic tests show it shouldn't be), I suspect it'd be
>> better to interleave the available block numbers as described a couple
>> days ago than lower the threshold further.
>>
>
> Can you please repeat the copy test you have done above with
> fillfactor as 20 and 30?

I did two kinds of tests. The first had a fill-factor of 10 [1], the
second had the default storage, but I prevented the backend from
caching the target block [2], to fully exercise the free space code.
Would you like me to repeat the first one with 20 and 30? And do you
think it is useful enough to test the copying of 4 blocks and not
smaller numbers?

> Few more comments:
> -------------------------------
> 1. I think we can add some test(s) to test the new functionality, may
> be something on the lines of what Robert has originally provided as an
> example of this behavior [1].

Maybe the SQL script attached to [3] (which I probably based on
Robert's report) can be cleaned up into a regression test.

> 3.
> GetPageWithFreeSpace(Relation rel, Size spaceNeeded)
> {
> ..
> + if (target_block == InvalidBlockNumber &&
> + rel->rd_rel->relkind == RELKIND_RELATION)
> + {
> + nblocks = RelationGetNumberOfBlocks(rel);
> +
> + if (nblocks > HEAP_FSM_CREATION_THRESHOLD)
> + {
> + /*
> + * If the FSM knows nothing of the rel, try the last page before
> + * we give up and extend. This avoids one-tuple-per-page syndrome
> + * during bootstrapping or in a recently-started system.
> + */
> + target_block = nblocks - 1;
> + }
> ..
> }
>
> Moving this check inside GetPageWithFreeSpace has one disadvantage, we
> will always consider last block which can have some inadvertent
> effects. Consider when this function gets called from
> RelationGetBufferForTuple just before extension, it can consider to
> check for the last block even though that is already being done in the
> begining when GetPageWithFreeSpace was called. I am not completely
> sure how much this is a case to worry because it will help to check
> last block when the same is concurrently added and FSM is not updated
> for same. I am slightly worried because the unpatched code doesn't
> care for such case and we have no intention to change this behaviour.
> What do you think?

I see what you mean. If the other backend extended by 1 block, the
intention is to keep it out of the FSM at first, and by extension, not
visible in other ways. The comment implies that's debatable, but I
agree we shouldn't change that without a reason to. One simple idea is
add a 3rd boolean parameter to GetPageWithFreeSpace() to control
whether it gives up if the FSM fork doesn't indicate free space, like

if (target_block == InvalidBlockNumber &&
rel->rd_rel->relkind == RELKIND_RELATION &&
!check_fsm_only)
{
nblocks = RelationGetNumberOfBlocks(rel);

> 4. You have mentioned above that "system catalogs created during
> bootstrap still have a FSM if they have any data." and I can also see
> this behavior, have you investigated this point further?

Code reading didn't uncover the cause. I might have to step through
with a debugger or something similar. I should find time for that next
month.

> 5. Your logic to update FSM on standby seems okay, but can you show
> some tests which proves its sanity?

I believe to convince myself it was working, I used the individual
commands in the sql file in [3], then used the size function on the
secondary. I'll redo that to verify.

--
[1] https://www.postgresql.org/message-id/CAJVSVGWCRMyi8sSqguf6PfFcpM3hwNY5YhPZTt-8Q3ZGv0UGYw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAJVSVGWMXzsqYpPhO3Snz4n5y8Tq-QiviuSCKyB5czCTnq9rzA%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ%40mail.gmail.com

-John Naylor

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message REIX, Tony 2018-11-26 11:05:13 RE: Shared Memory: How to use SYSV rather than MMAP ?
Previous Message Michael Paquier 2018-11-26 10:04:20 Re: Continue work on changes to recovery.conf API