Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, lubennikovaav(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org, pavan(dot)deolasee(at)gmail(dot)com, ibrar(dot)ahmad(at)gmail(dot)com
Subject: Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Date: 2021-01-16 15:11:29
Message-ID: 58359211-c505-1ee6-c5fd-78db84afb46f@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12.01.2021 22:30, Tomas Vondra wrote:
> Thanks. These patches seem to resolve the TOAST table issue, freezing
> it as expected. I think the code duplication is not an issue, but I
> wonder why heap_insert uses this condition:
>
>     /*
>      * ...
>      *
>      * No need to update the visibilitymap if it had all_frozen bit set
>      * before this insertion.
>      */
>     if (all_frozen_set && ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0))
>
> while heap_multi_insert only does this:
>
>     if (all_frozen_set) { ... }
>
> I haven't looked at the details, but shouldn't both do the same thing?

I decided to add this check for heap_insert() to avoid unneeded calls of
visibilitymap_set(). If we insert tuples one by one, we can only call
this once per page.
In my understanding, heap_multi_insert() inserts tuples in batches, so
it doesn't need this optimization.

>
>
> However, I've also repeated the test counting all-frozen pages in both
> the main table and TOAST table, and I get this:
>
> patched
> =======
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class
> where relname = 't'));
>
>  count
> --------
>  100002
> (1 row)
>
>
> select count(*) from pg_visibility((select reltoastrelid from pg_class
> where relname = 't')) where not all_visible;
>
>  count
> --------
>       0
> (1 row)
>
> That is - all TOAST pages are frozen (as expected, which is good). But
> now there are 100002 pages, not just 100000 pages. That is, we're now
> creating 2 extra pages, for some reason. I recall Pavan reported
> similar issue with every 32768-th page not being properly filled, but
> I'm not sure if that's the same issue.
>
>
> regards
>

As Pavan correctly figured it out before the problem is that
RelationGetBufferForTuple() moves to the next page, losing free space in
the block:

> ... I see that a relcache invalidation arrives
> after 1st and then after every 32672th block is filled. That clears the
> rel->rd_smgr field and we lose the information about the saved target
> block. The code then moves to extend the relation again and thus
skips the
> previously less-than-half filled block, losing the free space in that
block.

The reason of this cache invalidation is vm_extend() call, which happens
every 32762 blocks.

RelationGetBufferForTuple() tries to use the last page, but for some
reason this code is under 'use_fsm' check. And COPY FROM doesn't use fsm
(see TABLE_INSERT_SKIP_FSM).

        /*
         * If the FSM knows nothing of the rel, try the last page before we
         * give up and extend.  This avoids one-tuple-per-page syndrome
during
         * bootstrapping or in a recently-started system.
         */
        if (targetBlock == InvalidBlockNumber)
        {
            BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
            if (nblocks > 0)
                targetBlock = nblocks - 1;
        }

I think we can use this code without regard to 'use_fsm'. With this
change, the number of toast rel pages is correct. The patch is attached.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Set-PD_ALL_VISIBLE-and-visibility-map-bits-in-COP-v7.patch text/x-patch 17.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andreas Karlsson 2021-01-16 15:32:10 Re: trailing junk in numeric literals
Previous Message Vik Fearing 2021-01-16 14:08:48 Re: WIP: System Versioned Temporal Table