Re: display offset along with block number in vacuum errors

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Mahendra Singh Thalor <mahi6run(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: display offset along with block number in vacuum errors
Date: 2020-08-17 06:07:39
Message-ID: CA+fd4k7t3+rEPQE5qu0Vie90rkNYN4_ujk1kfQADuVq3UbZm3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 15 Aug 2020 at 12:19, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Aug 14, 2020 at 4:06 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Aug 10, 2020 at 10:24 AM Masahiko Sawada
> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > >
> > > It's true that heap_page_is_all_visible() is called from only
> > > lazy_vacuum_page() but I'm concerned it would lead misleading since
> > > it's not actually removing tuples but just checking after vacuum. I
> > > guess that the errcontext should show what the process is actually
> > > doing and therefore help the investigation, so I thought VACUUM_HEAP
> > > might not be appropriate for this case. But I see also your point.
> > > Other vacuum error context phases match with vacuum progress
> > > information phrases. So in heap_page_is_all_visible (), I agree it's
> > > better to update the offset number and keep the phase VACUUM_HEAP
> > > rather than do nothing.
> > >
> >
> > Okay, I have changed accordingly and this means that the offset will
> > be displayed for the vacuum phase as well. Apart from this, I have
> > fixed all the comments raised by me in the attached patch. One thing
> > we need to think is do we want to set offset during heap_page_prune
> > when called from lazy_scan_heap? I think the code path for
> > heap_prune_chain is quite deep, so an error can occur in that path.
> > What do you think?
> >
>
> The reason why I have not included heap_page_prune related change in
> the patch is that I don't want to sprinkle this in every possible
> function (code path) called via vacuum especially if the probability
> of an error in that code path is low. But, I am fine if you and or
> others think that it is a good idea to update offset in
> heap_page_prune as well.

I agree to not try sprinkling it many places than necessity.

Regarding heap_page_prune(), I'm concerned a bit that
heap_page_prune() is typically the first function to check the tuple
visibility within the vacuum code. I've sometimes observed an error
with the message like "DETAIL: could not open file “pg_xact/00AB”: No
such file or directory" due to a tuple header corruption. I suspect
this message was emitted while checking tuple visibility in
heap_page_prune(). So I guess the likelihood of an error in that code
is not so low.

On the other hand, if we want to update the offset number in
heap_page_prune() we will need to expose some vacuum structs defined
as a static including LVRelStats. But I don't think it's a good idea.
The second idea I came up with is that we set another errcontext for
heap_page_prune(). Since heap_page_prune() could be called also by a
regular page scanning it would work fine for both cases, although
there will be extra overheads for both. What do you think?

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message yuzuko 2020-08-17 06:11:28 Re: Autovacuum on partitioned table (autoanalyze)
Previous Message Ashutosh Sharma 2020-08-17 06:05:06 Re: recovering from "found xmin ... from before relfrozenxid ..."