Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-08-22 16:14:10
Message-ID: CAFiTN-tjXYS2pWf5yR5UMLQf86rS8hmrSUoPFc53R5WFidQPrQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 22, 2019 at 9:21 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Aug 22, 2019 at 7:34 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 22, 2019 at 1:34 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > Yeah, we can handle the bulk fetch as you suggested and it will make
> > > it a lot easier. But, currently while registering the undo request
> > > (especially during the first pass) we need to compute the from_urecptr
> > > and the to_urecptr. And, for computing the from_urecptr, we have the
> > > end location of the transaction because we have the uur_next in the
> > > transaction header and that will tell us the end of our transaction
> > > but we still don't know the undo record pointer of the last record of
> > > the transaction. As of now, we read the previous 2 bytes from the end of
> > > the transaction to know the length of the last record and from there
> > > we can compute the undo record pointer of the last record and that is
> > > our from_urecptr.
> >
> > I don't understand this. If we're registering an undo request at "do"
> > time, we don't need to compute the starting location; we can just
> > remember the UndoRecPtr of the first record we inserted. If we're
> > reregistering an undo request after a restart, we can (and, I think,
> > should) work forward from the discard location rather than backward
> > from the insert location.
>
> Right, we work forward from the discard location. So after the
> discard location, while traversing the undo log, when we encounter an
> aborted transaction we need to register its rollback request. And,
> for doing that we need 1) the start location of the first undo record, and
> 2) the start location of the last undo record (last undo record pointer).
>
> We already have 1). But we have to compute 2). For doing that if we
> unpack the first undo record we will know the start of the next
> transaction. From there if we read the last two bytes then that will
> have the length of the last undo record of our transaction. So we can
> compute 2) with the formula below:
>
> start of the last undo record = start of the next transaction - length
> of our transaction's last record.
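
The formula above can be sketched in C as follows. This is only a minimal illustration, not the patch's code: it assumes the undo log is a flat byte buffer and that every record ends with its own total length stored as a 2-byte little-endian integer. The function names and that layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption for this sketch): each undo record
 * ends with its own total length as a 2-byte little-endian integer.
 */
uint16_t
read_trailing_length(const uint8_t *undo, size_t end)
{
    assert(end >= 2);           /* need two bytes before 'end' */
    return (uint16_t) (undo[end - 2] | (undo[end - 1] << 8));
}

/*
 * start of the last undo record =
 *     start of the next transaction - length of our last record
 */
size_t
last_record_start(const uint8_t *undo, size_t next_xact_start)
{
    uint16_t    len = read_trailing_length(undo, next_xact_start);

    assert(len >= 2 && len <= next_xact_start);
    return next_xact_start - len;
}
```

For example, with a 5-byte record at offset 0 followed by a 7-byte record at offset 5, the next transaction starts at offset 12 and the function returns 5.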

Maybe I am saying that only because I am thinking about how the requests
are registered in the current code. But those requests will
ultimately be used by the bulk fetch to collect the records. So if
we are planning to change the bulk fetch to read forward, then maybe we
don't need a valid last undo record pointer, because we will get it
anyway while processing forward. Just knowing the end of the
transaction is then sufficient to tell us where to stop. I am not sure
whether this solution has any problems. Probably I should think about it
again in the morning when my mind is well rested.
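
To make the forward-reading idea concrete, here is a rough sketch in C. It is not the actual bulk fetch code: it assumes, purely for illustration, that each record begins with its own total length as a 2-byte little-endian integer, so the scan can step from one record to the next. The function name and layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk one transaction's undo records forward from xact_start and stop at
 * xact_end (the start of the next transaction).  The start of the last
 * record is discovered as a side effect of the walk, so the caller does not
 * have to supply it.  Returns the number of records visited; if the range
 * is empty, *last_rec_start is left untouched.
 *
 * Assumed layout (hypothetical): each record starts with its total length
 * as a 2-byte little-endian integer.
 */
size_t
scan_transaction_forward(const uint8_t *undo, size_t xact_start,
                         size_t xact_end, size_t *last_rec_start)
{
    size_t      pos = xact_start;
    size_t      nrecords = 0;

    while (pos < xact_end)
    {
        uint16_t    len;

        assert(xact_end - pos >= 2);    /* room for the length header */
        len = (uint16_t) (undo[pos] | (undo[pos + 1] << 8));
        assert(len >= 2 && len <= xact_end - pos);

        *last_rec_start = pos;  /* remember the most recent record */
        pos += len;             /* step to the next record */
        nrecords++;
    }
    return nrecords;
}
```

With a 5-byte record at offset 0 and a 7-byte record at offset 5, scanning from 0 to 12 visits two records and reports 5 as the start of the last one, using only the transaction's start and end.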

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
