Re: POC: Cleaning up orphaned files using undo logs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-07-11 04:31:35
Message-ID: CAA4eK1LEKyPZD5Dy4j1u2smUUyMzxgC2YLj8E+aJpsvG7sVJYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 10, 2019 at 10:06 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Jul 10, 2019 at 2:32 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > As of now, after we finish executing the rollback actions, the entry
> > from the hash table is removed. Now, at a later time (when queues are
> > full and we want to insert a new entry) when we access the queue entry
> > (to check whether we can remove it) corresponding to the removed hash
> > table entry, will it be safe to access it? The hash table entry might
> > have been freed or would have been reused as some other entry by the
> > time we try to access it.
>
> Hmm, yeah, that's a problem. I think we could possibly fix this by
> having the binary heaps just store a FullTransactionId rather than a
> pointer to the RollBackHashEntry. Then, if you get a
> FullTransactionId from the binary heap, you just do a hash table
> lookup to find the RollBackHashEntry instead of accessing it directly.
> If it doesn't exist, then you can just discard the entry: it's for
> some old transaction that's no longer relevant.
>
> However, there are a few problems with that idea. One is that I see
> that you've made the hash table keyed by full_xid + start_urec_ptr
> rather than just full_xid, so if the queues just point to an XID, it's
> not enough to find the hash table entry. The comment claims that this
> is necessary because "in the same transaction, there could be rollback
> requests for both logged and unlogged relations," but I don't
> understand why that means we need start_urec_ptr in the hash table
> key. It would seem more natural to me to have a single entry that
> covers both the logged and the unlogged undo for that transaction.
>

The data for logged and unlogged undo are in separate logs. So, the
discard worker can encounter them at different times. It is quite
possible that by the time it encounters the second request, some undo
worker is already half-way processing the first request. It might be
feasible to combine them during foreground work, but after startup or
some other times when discard worker has to register the request, it
won't be feasible to have one entry or at least we need more smarts to
ensure that we can always edit the hash table entry at later time to
append the request. I have thought about keep full_xid +
persistence_level/undo_category as a key, but as we anyway need
start_ptr for the request, so it seems appealing to use the same.
Also, even if we try to support one entry for logged and unlogged
undo, it won't be always possible to have one request for it as is the
case explained for discard worker.

> (Incidentally, I don't think it's correct that RollbackHashEntry
> starts with FullTransactionId full_xid + UndoRecPtr start_uprec_ptr
> declared separately; I think it should start with RollbackHashKey -
> although if we change the key back to just a FullTransactionId then we
> don't need to worry separately about fixing this issue.)
>

Agreed.

It seems before we analyze or discuss in detail the other solutions
related to dangling entries, it is better to investigate the rbtree
idea you and Andres came up with as on a quick look it seems that
might avoid creating the dangling entries at the first place.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-07-11 04:53:04 Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Previous Message Yugo Nagata 2019-07-11 04:28:04 Re: Implementing Incremental View Maintenance