Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-08-22 05:34:37
Message-ID: CAFiTN-sodgCSoPBZP6UG=UKBRDismYVRzx2hin0h2fysV3Wc0Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 22, 2019 at 10:24 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2019-08-22 10:19:04 +0530, Dilip Kumar wrote:
> > On Thu, Aug 22, 2019 at 9:58 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > Hi,
> > >
> > > On 2019-08-22 09:51:22 +0530, Dilip Kumar wrote:
> > > > We cannot know the complete size of the record even by reading the
> > > > header, because the payload is a variable-length part and the payload
> > > > lengths are stored in the payload header, which again can be at a
> > > > random offset.
> > >
> > > Wait, but that's just purely self inflicted damage, no? The initial
> > > length just needs to include the payload. And all this is not an issue
> > > anymore?
> > >
> > Actually, we store the undo length only at the end of the record and
> > that is for traversing the transaction's undo record chain during bulk
> > fetch. As such, in the beginning of the record we don't have the undo
> > length. We do have uur_info, but that just tells us which optional
> > headers are included in the record.
>
> But why? It makes a *lot* more sense to have it in the beginning. I
> don't think bulk-fetch really requires it to be in the end - we can
> still process records forward on a page-by-page basis.

Yeah, we can handle the bulk fetch as you suggested, and it will make
things a lot easier. But currently, while registering the undo request
(especially during the first pass), we need to compute the from_urecptr
and the to_urecptr. For the from_urecptr, we know the end location of
the transaction, because uur_next in the transaction header tells us
where our transaction ends, but we still don't know the undo record
pointer of the last record of the transaction. As of now, we read the
2 bytes just before the end of the transaction to get the length of the
last record, and from that we compute the undo record pointer of the
last record; that is our from_urecptr.
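
To illustrate that last step, here is a minimal sketch in C of locating
the last record from a trailing 2-byte length. It is not the patch code:
append_record(), last_record_start() and the flat in-memory buffer are
hypothetical names made up for this example, offsets stand in for undo
record pointers, and page boundaries and page headers are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: append one record at insert_at. The record is the
 * payload followed by a 2-byte total length (payload + the length field).
 * Returns the offset just past the record.
 */
static size_t
append_record(uint8_t *log, size_t insert_at,
              const void *payload, uint16_t payload_len)
{
    uint16_t    total_len = (uint16_t) (payload_len + sizeof(uint16_t));

    memcpy(log + insert_at, payload, payload_len);
    memcpy(log + insert_at + payload_len, &total_len, sizeof(total_len));
    return insert_at + total_len;
}

/*
 * Hypothetical helper: given the end of the transaction's undo (what
 * uur_next gives us in the description above), read the 2 bytes just
 * before it and step back by that length to reach the start of the last
 * record, i.e. the analogue of from_urecptr.
 */
static size_t
last_record_start(const uint8_t *log, size_t txn_start, size_t txn_end)
{
    uint16_t    total_len;

    assert(txn_end >= txn_start + sizeof(total_len));
    memcpy(&total_len, log + txn_end - sizeof(total_len), sizeof(total_len));

    /* The length must cover its own field and stay inside the transaction. */
    assert(total_len >= sizeof(total_len));
    assert(total_len <= txn_end - txn_start);

    return txn_end - total_len;
}
```

If the length were stored only at the beginning of each record, this
backward step would need some other source for the last record's start.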

Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
