Re: recovering from "found xmin ... from before relfrozenxid ..."

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, MBeena Emerson <mbeena(dot)emerson(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: recovering from "found xmin ... from before relfrozenxid ..."
Date: 2020-08-06 06:11:38
Message-ID: CAE9k0Pm7ZUDQacac-=EoJ70qpRZigjNr+amjyfnWfw+aiYoW3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 6, 2020 at 1:29 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Aug 5, 2020 at 9:42 AM Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
> > Yeah, it's being tested on the main table, not on a toast table. I've
> > removed this test-case and also restricted direct access to the toast
> > table using heap_force_kill/freeze functions. I think we shouldn't be
> > using these functions to do any changes in the toast table. We will
> > only use these functions with the main table and let VACUUM remove the
> > corresponding data chunks (pointed by the tuple that got removed from
> > the main table).
>
> I agree with removing the test case, but I disagree with restricting
> this from being used on the TOAST table. These are tools for experts,
> who may use them as they see fit. It's unlikely that it would be
> useful to use this on a TOAST table, I think, but not impossible.
>

Okay, If you want I can remove the restriction on a toast table, but,
then that means a user can directly remove the data chunks from the
toast table without changing anything in the main table. This means we
won't be able to query the main table as it will fail with an error
like "ERROR: unexpected chunk number ...". So, we will have to find
some way to identify the pointer to the chunks that got deleted from
the toast table and remove that pointer from the main table. We also
need to make sure that before we remove a tuple (pointer) from the
main table, we identify all the remaining data chunks pointed by this
tuple and remove them completely only then that table would be
considered to be in a good state. Now, I am not sure if we can
currently do all these things.

> > Another option would be to identify all the data chunks corresponding
> > to the tuple (ctid) being killed from the main table and remove them
> > one by one. We will only do this if the tuple from the main table that
> > has been marked as killed has an external storage. We will have to add
> > a bunch of code for this otherwise we can let VACUUM do this for us.
> > Let me know your thoughts on this.
>
> I don't think VACUUM will do anything for us automatically -- it isn't
> going to know that we force-killed the tuple in the main table.
> Normally, a tuple delete would have to set xmax on the TOAST tuples
> and then VACUUM would do its thing, but in this case that won't
> happen. So if you force-kill a tuple in the main table you would end
> up with a space leak in the TOAST table.
>
> The problem here is that one reason you might force-killing a tuple in
> the main table is because it's full of garbage. If so, trying to
> decode the tuple so that you can find the TOAST pointers might crash
> or error out, or maybe that part will work but then you'll error out
> trying to look up the corresponding TOAST tuples, either because the
> values are not valid or because the TOAST table itself is generally
> hosed in some way. So I think it is probably best if we keep this tool
> as simple as possible, with as few dependencies as we can, and
> document the possible negative outcomes of using it.

I completely agree with you.

It's not
> impossible to recover from a space-leak like this; you can always move
> the data into a new table with CTAS and then drop the old one. Not
> sure whether CLUSTER or VACUUM FULL would also be sufficient.
>

Yeah, I think, we can either use CTAS or VACUUM FULL, both look fine.

> Separately, we might want to add a TOAST-checker to amcheck, or
> enhance the heap-checker Mark is working on, and one of the things it
> could do is check for TOAST entries to which nothing points. Then if
> you force-kill tuples in the main table you could also use that tool
> to look for things in the TOAST table that ought to also be
> force-killed.
>

Okay, good to know that. Thanks for sharing this info.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-08-06 06:27:59 Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration
Previous Message Ashutosh Sharma 2020-08-06 06:03:04 Re: recovering from "found xmin ... from before relfrozenxid ..."