Re: POC: Cleaning up orphaned files using undo logs

From: Antonin Houska <ah(at)cybertec(dot)at>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2021-01-18 13:15:46
Message-ID: 46514.1610975746@antos
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:

> Thanks for the updated patch. As I've mentioned off the list I'm slowly
> looking through it with the intent to concentrate on undo progress
> tracking. But before I will post anything I want to mention couple of
> strange issues I see, otherwise I will forget for sure. Maybe it's
> already known, but running several times 'make installcheck' against a
> freshly build postgres with the patch applied from time to time I
> observe various errors.
>
> This one happens on a crash recovery, seems like
> UndoRecordSetXLogBufData has usr_type = USRT_INVALID and is involved in
> the replay process:
>
> TRAP: FailedAssertion("page_offset + this_page_bytes <= uph->ud_insertion_point", File: "undopage.c", Line: 300)
> postgres: startup recovering 000000010000000000000012(ExceptionalCondition+0xa1)[0x558b38b8a350]
> postgres: startup recovering 000000010000000000000012(UndoPageSkipOverwrite+0x0)[0x558b38761b7e]
> postgres: startup recovering 000000010000000000000012(UndoReplay+0xa1d)[0x558b38766f32]
> postgres: startup recovering 000000010000000000000012(XactUndoReplay+0x77)[0x558b38769281]
> postgres: startup recovering 000000010000000000000012(smgr_redo+0x1af)[0x558b387aa7bd]
>
> This one is somewhat similar:
>
> TRAP: FailedAssertion("page_offset >= SizeOfUndoPageHeaderData", File: "undopage.c", Line: 287)
> postgres: undo worker for database 36893 (ExceptionalCondition+0xa1)[0x5559c90f1350]
> postgres: undo worker for database 36893 (UndoPageOverwrite+0xa6)[0x5559c8cc8ae3]
> postgres: undo worker for database 36893 (UpdateLastAppliedRecord+0xbe)[0x5559c8ccd008]
> postgres: undo worker for database 36893 (smgr_undo+0xa6)[0x5559c8d11989]

Well, on repeated run of the test I could also hit the first one. I could fix
it and will post a new version of the patch (along with some other small
changes) this week.

> There are also here and there messages about not found undo files:
>
> ERROR: cannot open undo segment file 'base/undo/000008.0000020000': No such file or directory
> WARNING: failed to undo transaction

I don't see this one in the log so far, will try again.

Thanks for the report!

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-01-18 13:23:46 Re: list of extended statistics on psql
Previous Message Bharath Rupireddy 2021-01-18 13:03:42 Re: [PATCH] postgres_fdw connection caching - cause remote sessions linger till the local session exit