Re: POC: Cleaning up orphaned files using undo logs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-06-18 21:10:10
Message-ID: CA+Tgmoa+Q5ZDgLOFOrqa=9BtGcq9ZgP+yB2VRM-=Cny68fzpQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 18, 2019 at 2:07 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 18, 2019 at 7:31 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > [ new patches ]
>
> I tried writing some code that throws an error from an undo log
> handler and the results were not good.

I discovered another bothersome thing here: if you have a single
transaction that generates a bunch of undo records, the first one has
uur_dbid set correctly and the remaining records have uur_dbid set to
0. That means if you try to write a sanity check like if
(record->uur_dbid != MyDatabaseId) elog(ERROR, "the undo system messed
up") it fails.

The original idea of UnpackedUndoRecord was this: you would put a
bunch of data into an UnpackedUndoRecord that you wanted written to
undo, and the undo system would find ways to compress stuff out of the
on-disk representation by e.g. omitting the fork number if it's
MAIN_FORKNUM. Then, when you read an undo record, it would decompress
so that you ended up with the same UnpackedUndoRecord that you had at
the beginning. However, the inclusion of transaction headers has made
this a bit confusing: that stuff isn't being added by the user but by
the undo system itself. It's not very clear from the comments what the
contract is around these things: do you need to set uur_dbid to
MyDatabaseId when preparing to insert an undo record? Or can you just
leave it unset and then it'll possibly be set at decoding time? The
comments for the UnpackedUndoRecord structure don't explain this.

I'd really like to see this draw a cleaner distinction between the
stuff that the user is expected to set and the other stuff we deal
with internally to the undo subsystem. For example, suppose that
UnpackedUndoRecord didn't include any of the fields that are only
present in the transaction header. Maybe there's another structure,
like UndoTransactionHeader, that includes those fields. The client of
the undo subsystem creates a bunch of UnpackedUndoRecords and inserts
them. At undo time, the callback gets back an identical set of
UnpackedUndoRecords. And maybe it also gets a pointer to the
UndoTransactionHeader which contains all of the system-generated
stuff. Under this scheme, uur_xid, uur_xidepoch (which still need to
be combined into uur_fxid), uur_progress, uur_dbid, uur_next,
uur_prevlogstart, and uur_prevurp would all move out of the
UnpackedUndoRecord and into the UndoTransactionHeader. The user would
supply none of those things when inserting undo records, but the
rm_undo callback could examine those values if it wished.

A weaker approach would be to at least clean up the structure
definition so that the transaction-header fields set by the system are
clearly segregated from the per-record fields set by the
undo-inserter, with comments explaining that those fields don't need
to be set but will (or may?) be set at undo time. That would be better
than what we have right now - because it would hopefully make it much
more clear which fields need to be set on insert and which fields can
be expected to be set when decoding - but I think it's probably not
going far enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2019-06-18 21:15:25 Re: New EXPLAIN option: ALL
Previous Message Peter Eisentraut 2019-06-18 21:08:31 Re: New EXPLAIN option: ALL