Re: Write visibility map during CLUSTER/VACUUM FULL

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Write visibility map during CLUSTER/VACUUM FULL
Date: 2019-09-20 20:05:06
Message-ID: 20190920200506.45pcsi4e2avvw4ua@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-09-13 22:22:50 +0300, Alexander Korotkov wrote:
> On Thu, Sep 12, 2019 at 4:55 PM Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > On Wed, Sep 11, 2019 at 3:30 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > On Sun, Sep 1, 2019 at 1:37 PM Alexander Korotkov
> > > <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > > > I found it weird that CLUSTER/VACUUM FULL don't write visibility map.
> > > > Attached patch implements writing visibility map in
> > > > heapam_relation_copy_for_cluster().
> > > >
> > > > I've studied previous attempt to implement this [1]. The main problem
> > > > of that attempt was usage of existing heap_page_is_all_visible() and
> > > > visibilitymap_set() functions. These functions works through buffer
> > > > manager, while heap rewriting is made bypass buffer manager.
> > > >
> > > > In my patch visibility map pages are handled in the same way as heap
> > > > pages are.

Why do you want to do that? To me that doesn't immediately seems like a
good idea, and I've not seen justification for it in this thread. Did I
miss something?

> > > I haven't studied this patch in detail, but while glancing I observed
> > > that this doesn't try to sync the vm pages as we do for heap pages in
> > > the end (during end_heap_rewrite). Am I missing something?
> >
> > You're not missed anything. Yes, VM need sync. Will fix this. And I
> > just noticed I need a closer look to what is going on with TOAST.
>
> Attached patch syncs VM during end_heap_rewrite().
>
> However, VM for TOAST still isn't read. It appear to be much more
> difficult to write VM for TOAST, because it's written by insertion
> tuples one-by-one. Despite it seems to fill TOAST heap pages
> sequentially (assuming no FSM exists yet), it's quite hard to handle
> page-switching event with reasonable level of abstraction.
> Nevertheless, I find this patch useful in current shape. Even if we
> don't write VM for TOAST, it's still useful to do for regular heap.
> Additionally, one of key advantages of having VM is index-only scan,
> which don't work for TOAST anyway.

Have you looked at the discussion around
https://www.postgresql.org/message-id/20190408010427.4l63qr7h2fjcyp77%40alap3.anarazel.de
?

That's not quite the same thing as you're tackling here, but I wonder if
some of the logic could be the same. Especially for toast.

> +/* Write contents of VM page */
> +static void
> +rewrite_flush_vm_page(RewriteState state)
> +{
> + Assert(state->rs_vm_buffer_valid);
> +
> + if (state->rs_use_wal)
> + log_newpage(&state->rs_new_rel->rd_node,
> + VISIBILITYMAP_FORKNUM,
> + state->rs_vm_blockno,
> + state->rs_vm_buffer,
> + true);
> + RelationOpenSmgr(state->rs_new_rel);
> +
> + PageSetChecksumInplace(state->rs_vm_buffer, state->rs_vm_blockno);
> +
> + smgrextend(state->rs_new_rel->rd_smgr, VISIBILITYMAP_FORKNUM,
> + state->rs_vm_blockno, (char *) state->rs_vm_buffer, true);
> +
> + state->rs_vm_buffer_valid = false;
> +}
> +
> +/* Set VM flags to the VM page */
> +static void
> +rewrite_set_vm_flags(RewriteState state)
> +{
> + BlockNumber mapBlock = HEAPBLK_TO_MAPBLOCK(state->rs_blockno);
> + uint32 mapByte = HEAPBLK_TO_MAPBYTE(state->rs_blockno);
> + uint8 mapOffset = HEAPBLK_TO_OFFSET(state->rs_blockno);
> + char *map;
> + uint8 flags;
> +
> + if (mapBlock != state->rs_vm_blockno && state->rs_vm_buffer_valid)
> + rewrite_flush_vm_page(state);
> +
> + if (!state->rs_vm_buffer_valid)
> + {
> + PageInit(state->rs_vm_buffer, BLCKSZ, 0);
> + state->rs_vm_blockno = mapBlock;
> + state->rs_vm_buffer_valid = true;
> + }
> +
> + flags = (state->rs_all_visible ? VISIBILITYMAP_ALL_VISIBLE : 0) |
> + (state->rs_all_frozen ? VISIBILITYMAP_ALL_FROZEN : 0);
> +
> + map = PageGetContents(state->rs_vm_buffer);
> + map[mapByte] |= (flags << mapOffset);
> +}

I think it's a bad idea to copy this many VM implementation details into
rewriteheap.c.

> +/*
> + * Update rs_all_visible and rs_all_frozen flags according to the tuple. We
> + * use simplified check assuming that HeapTupleSatisfiesVacuum() should already
> + * set tuple hint bits.
> + */
> +static void
> +rewrite_update_vm_flags(RewriteState state, HeapTuple tuple)
> +{
> + TransactionId xmin;
> +
> + if (!state->rs_all_visible)
> + return;
> +
> + if (!HeapTupleHeaderXminCommitted(tuple->t_data))
> + {
> + state->rs_all_visible = false;
> + state->rs_all_frozen = false;
> + return;
> + }
> +
> + xmin = HeapTupleHeaderGetXmin(tuple->t_data);
> + if (!TransactionIdPrecedes(xmin, state->rs_oldest_xmin))
> + {
> + state->rs_all_visible = false;
> + state->rs_all_frozen = false;
> + return;
> + }
> +
> + if (!(tuple->t_data->t_infomask & HEAP_XMAX_INVALID) &&
> + !HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_data->t_infomask))
> + {
> + state->rs_all_visible = false;
> + state->rs_all_frozen = false;
> + return;
> + }
> +
> + if (!state->rs_all_frozen)
> + return;
> +
> + if (heap_tuple_needs_eventual_freeze(tuple->t_data))
> + state->rs_all_frozen = false;
> +}

I don't think we should have yet another copy of logic determining
visibility. It has repeatedly proven hard to get right in the past, and
it'll not get better by having yet another copy. Especially not because
we've basically already done a HTSV (via
heapam_relation_copy_for_cluster), and we're now redoing a poor man's
version of it.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-09-20 20:14:25 Re: Unwanted expression simplification in PG12b2
Previous Message Alvaro Herrera 2019-09-20 19:56:57 Re: log bind parameter values on error