Re: TOAST - why separate visibility map

From: Virender Singla <virender(dot)cse(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TOAST - why separate visibility map
Date: 2021-11-25 14:22:01
Message-ID: CAM6Zo8x+b3-uKjQJJC6O3Y2CqqpNSVbr0FC6JjZsqcFKVOsyJA@mail.gmail.com
Lists: pgsql-hackers

"Given the size of toasted data, the overhead is unlikely to be a
significant overhead. It's much more an issue for the main table, where
narrow rows are common."

Completely agree; row size should not be a big concern for toast tables.

However, write amplification will still happen with vacuum freeze, since
transaction IDs need to be frozen in the wider toast table tuples as well.
I have not explored whether TOAST tuples carry their own hint bits. If they
do, a normal vacuum (or a SELECT after a write) has to completely rewrite
the big toast table tuples along with the small main table tuples to set
the hint bits (commit/rollback).

I believe a B-tree index does not contain any separate visibility info, so
the only work VACUUM does on indexes is cleaning up dead tuples.

If only one copy of the visibility info were maintained, the above
operations could be much faster. However, vacuuming of the main table and
of its TOAST table would then be glued together. One possible optimization
would be two synchronized threads working together, one on the main table
and one on the TOAST table, to do the cleanup job. Agreed that HOT updates
are gone in TOAST if there is a common VM.
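
To make the trade-off concrete for myself, here is a toy model of the point
Andres makes in the quoted text below: a per-relation visibility map lets
vacuum skip pages it already knows are all-visible. This is only a sketch
and not PostgreSQL code; ToyRelation, toy_vacuum and the page layout are
made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRelation:
    """Hypothetical heap: each page is a list of 'live' / 'dead' tuples."""
    pages: list
    # One "all-visible" bit per page, standing in for the visibility map.
    all_visible: list = field(default_factory=list)

    def __post_init__(self):
        if not self.all_visible:
            self.all_visible = [False] * len(self.pages)

    def delete(self, page_no, slot):
        # Any modification clears the page's all-visible bit.
        self.pages[page_no][slot] = "dead"
        self.all_visible[page_no] = False


def toy_vacuum(rel, use_vm=True):
    """Remove dead tuples; return how many pages had to be read."""
    pages_read = 0
    for page_no, page in enumerate(rel.pages):
        if use_vm and rel.all_visible[page_no]:
            continue  # skip pages the map says are already clean
        pages_read += 1
        rel.pages[page_no] = [t for t in page if t != "dead"]
        rel.all_visible[page_no] = True
    return pages_read


if __name__ == "__main__":
    toast = ToyRelation(pages=[["live"] * 4 for _ in range(1000)])
    print(toy_vacuum(toast))                # first pass reads every page
    toast.delete(7, 0)
    print(toy_vacuum(toast))                # only the modified page is read
    print(toy_vacuum(toast, use_vm=False))  # without a map: every page again
```

In this model, dropping the map for a large and mostly static toast table
means every vacuum pass reads the whole relation again.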

Overall this looks complex.

On Sat, Nov 20, 2021 at 9:46 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On November 19, 2021 12:31:00 PM PST, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> It might be feasible to drop the visibility map for toast tables, though.
>
> > I think it would be a bad idea - the VM is used by vacuum to avoid rereading
> > already vacuumed ranges. Losing that for large toast tables would be bad.
>
> Ah, right. I was thinking vacuuming depended on the other map fork,
> but of course it needs this one.
>
> In short, there are indeed good reasons why it works like this.
>
> regards, tom lane
>
