Re: POC: Cleaning up orphaned files using undo logs

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-05-01 04:38:14
Message-ID: CAA4eK1L_7TcMUx4SY144J_kEwzn+6-5BR_6t4-jUKeM+b++PLQ@mail.gmail.com
Lists: pgsql-hackers

Thomas told me offlist that this email of mine didn't hit
pgsql-hackers, so trying it again by resending.

On Mon, Apr 29, 2019 at 3:51 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Apr 19, 2019 at 3:46 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Mar 12, 2019 at 6:51 PM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> > >
> >
> > Currently, the undo branch[1] contains an older version of the (undo
> > interface + some fixup). Now, I have merged the latest changes from
> > the zheap branch[2] to the undo branch[1]
> > which can be applied on top of the undo storage commit[3]. For
> > merging those changes, I need to add some changes to the undo log
> > storage patch as well for handling the multi log transaction. So I
> > have attached two patches, 1) improvement to undo log storage 2)
> > complete undo interface patch which includes 0006+0007 from undo
> > branch[1] + new changes on the zheap branch.
> >
> > [1] https://github.com/EnterpriseDB/zheap/tree/undo
> > [2] https://github.com/EnterpriseDB/zheap
> > [3] https://github.com/EnterpriseDB/zheap/tree/undo
> > (b397d96176879ed5b09cf7322b8d6f2edd8043a5)
> >
> > []
>
> Dilip has posted the patch for "undo record interface", next in series
> is a patch that handles transaction rollbacks (the machinery to
> perform undo actions) and background workers to manage undo.
>
> Transaction Rollbacks
> ----------------------------------
> We always perform rollback actions after cleaning up the current
> (sub)transaction. This will ensure that we perform the actions
> immediately after an error (and release the locks) rather than when
> the user issues a Rollback command at some later point in time. We
> release the locks only after the undo actions are applied. The reason
> to delay lock release is that if we released locks before applying
> undo actions, a concurrent session could acquire the lock before us,
> which can lead to deadlock.
>
> We promote the error to FATAL if it occurred while applying undo
> for a subtransaction. The reason we can't proceed without applying
> the subtransaction's undo is that the modifications made in that case
> must not be visible even if the main transaction commits. Normally,
> the backend that receives the request to perform Rollback (To
> Savepoint) applies the undo actions, but there are cases where it is
> preferable to push the requests to background workers. The main
> reasons to push the requests to background workers are: (a) The
> rollback is very large; pushing it allows us to return control to
> users quickly. There is a guc rollback_overflow_size which indicates
> that rollbacks greater than the configured size are performed lazily
> by background workers. (b) An error occurred while applying the undo
> actions; we push such a request to background workers.
>
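To make the rules above concrete, here is a minimal sketch of the
decision in C. It is not code from the patch: the enum, the function
name choose_undo_action, and the byte unit and default value used for
rollback_overflow_size are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the rollback_overflow_size GUC (unit assumed: bytes). */
uint64_t rollback_overflow_size = 64 * 1024 * 1024;

/* Hypothetical outcome of the "who applies the undo" decision. */
typedef enum
{
    UNDO_APPLY_IN_BACKEND,   /* apply undo actions in the current backend */
    UNDO_PUSH_TO_WORKER,     /* hand the request to a background worker */
    UNDO_PROMOTE_TO_FATAL    /* subtransaction undo failed: cannot continue */
} UndoAction;

UndoAction
choose_undo_action(uint64_t undo_size, bool apply_failed, bool is_subxact)
{
    if (apply_failed)
    {
        /* A failed subtransaction undo must not leave its changes visible. */
        return is_subxact ? UNDO_PROMOTE_TO_FATAL : UNDO_PUSH_TO_WORKER;
    }

    /* Rollbacks above the threshold are performed lazily by workers. */
    if (undo_size > rollback_overflow_size)
        return UNDO_PUSH_TO_WORKER;

    return UNDO_APPLY_IN_BACKEND;
}
```

In this sketch a small rollback that has not failed stays in the
backend, and only the error-in-subtransaction case escalates.
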
> Undo Requests and Undo workers
> --------------------------------------------------
> To improve the efficiency of the rollbacks, we create three queues and
> a hash table for the rollback requests. An XID-based priority queue
> allows us to process the requests of older transactions first and
> helps us to move oldestXidHavingUndo (this is an xid-horizon below
> which all the transactions are visible) forward. A size-based queue
> helps us to perform the rollbacks of larger aborts in a timely
> fashion so that we don't get stuck while processing them during
> discard of the logs. An error queue holds the requests for
> transactions that failed to apply their undo. The rollback hash table
> is used to avoid duplicate undo requests by backends and the discard
> worker.
>
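The two orderings described above can be sketched with plain
comparators. This is only an illustration: the UndoRequest struct, its
field names, and the use of qsort() over an array in place of a real
priority queue are assumptions, and the error queue and the
deduplicating hash table are not shown.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical rollback request; field names are illustrative only. */
typedef struct UndoRequest
{
    uint64_t full_xid;   /* transaction that needs its undo applied */
    uint64_t size;       /* amount of undo to process */
    uint32_t dbid;       /* database the request belongs to */
} UndoRequest;

/* XID-based ordering: oldest transaction first. */
int
cmp_oldest_xid_first(const void *a, const void *b)
{
    const UndoRequest *ra = a;
    const UndoRequest *rb = b;

    return (ra->full_xid > rb->full_xid) - (ra->full_xid < rb->full_xid);
}

/* Size-based ordering: largest rollback first. */
int
cmp_largest_first(const void *a, const void *b)
{
    const UndoRequest *ra = a;
    const UndoRequest *rb = b;

    return (ra->size < rb->size) - (ra->size > rb->size);
}
```

Sorting the same set of requests with each comparator shows why both
queues are kept: the oldest transaction is not necessarily the largest
one, so each ordering surfaces a different request first.
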
> The undo launcher is responsible for launching workers iff there is
> some work available in one of the work queues and there are more
> workers available. A worker is launched to handle requests for a
> particular database. Each undo worker then starts reading the requests
> for that particular database from one of the queues. A worker would
> peek into each queue for the requests from a particular database if it
> needs to switch a database in less than undo_worker_quantum ms (10s as
> default) after starting. Also, if there is no work, it lingers for
> UNDO_WORKER_LINGER_MS (10s as default). This avoids restarting the
> workers too frequently.
>
> The discard worker is responsible for discarding the undo log of
> transactions that are committed and all-visible or are rolled back.
> It also registers the request for aborted transactions in the work
> queues. It iterates through all the active logs one-by-one and tries
> to discard the transactions that are old enough to matter.
>
> The details of how all of this works are described in
> src/backend/access/undo/README.UndoProcessing. The main idea of
> keeping a readme is to allow reviewers to understand this patch; later
> we can decide which parts of it to move to comments in code and which
> to the main README of undo.
>
> Question: Currently, TwoPhaseFileHeader stores just TransactionId, so
> for the systems (like zheap) that support FullTransactionId, the
> two-phase transaction will be tricky to support as we need
> FullTransactionId during rollbacks. Is it a good idea to store
> FullTransactionId in TwoPhaseFileHeader?
>
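To illustrate why a bare TransactionId is not enough here, the sketch
below models a full transaction id as a 64-bit value with the epoch in
the high 32 bits and the xid in the low 32 bits. The helper names are
made up for this example and are not the PostgreSQL macros.

```c
#include <stdint.h>

/* Hypothetical helpers: epoch in the high 32 bits, xid in the low 32 bits. */
uint64_t
make_full_xid(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/* Truncate to the 32-bit xid, which is all that a plain TransactionId holds. */
uint32_t
xid_from_full(uint64_t full_xid)
{
    return (uint32_t) full_xid;
}

/* Recover the epoch that the truncation above throws away. */
uint32_t
epoch_from_full(uint64_t full_xid)
{
    return (uint32_t) (full_xid >> 32);
}
```

Two transactions from different epochs can truncate to the same 32-bit
xid, so a header that stores only the xid cannot tell them apart
during rollback.
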
> Credits:
> --------------
> Designed by: Andres Freund, Amit Kapila, Robert Haas, and Thomas Munro
> Author: Amit Kapila, Dilip Kumar, Kuntal Ghosh, and Thomas Munro
>
> This patch is based on Dilip's latest patch for the undo record
> interface. The branch can be accessed at
> https://github.com/EnterpriseDB/zheap/tree/undoprocessing
>
> Inputs on design/code are welcome.
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
undoprocessing_1.patch application/octet-stream 194.0 KB
