Re: POC: Cleaning up orphaned files using undo logs

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Cleaning up orphaned files using undo logs
Date: 2019-08-22 17:15:55
Message-ID: CAFiTN-v+fPFDgzoxFtZ5Eg3z9YXZjsXj86po9E3XB9Ra68zoVw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 22, 2019 at 9:55 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi
>
> On August 22, 2019 9:14:10 AM PDT, Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > But, those requests will
> >ultimately be used for collecting the record by the bulk fetch. So if
> >we are planning to change the bulk fetch to read forward then maybe we
> >don't need the valid last undo record pointer because that we will
> >anyway get while processing forward. So just knowing the end of the
> >transaction is sufficient for us to know where to stop. I am not sure
> >if this solution has any problem. Probably I should think again in
> >the morning when my mind is well-rested.
>
> I don't think we can easily do so for bulk apply without incurring significant overhead. It's pretty cheap to read in forward order and then process backwards on a page level - but for an entire transactions undo the story is different. We can't necessarily keep all of it in memory, so we'd have to read the undo twice to find the end. Right?
>
I was not talking about the entire transaction; I was also talking
about the page level, as you suggested. I was just saying that we may
not need the start position of the last undo record of the transaction
for registering the rollback request (which we currently compute).
However, we do need to know the end of the transaction to know the last
page, from which we need to start reading forward.

Let me explain with an example:

Transaction1
first, undo start at 10
first, undo end at 100
second, undo start at 101
second, undo end at 200
......
last, undo start at 1000
last, undo end at 1100

Transaction2
first, undo start at 1101
first, undo end at 1200
second, undo start at 1201
second, undo end at 1300

Suppose we want to register the request for Transaction1. Currently
we need to know the start undo record pointer (10 in the above
example) and the last undo record pointer (1000). But we only know
the start undo record pointer (10) and the start of the next
transaction (1101). So to calculate the start of the last record, we
use 1101 - 101, where 101 is the length of the last record, stored in
the 2 bytes before 1101.
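
As a rough sketch of the arithmetic above, treating undo record
pointers as plain integers (the function name is made up for
illustration and is not taken from the patch):

```python
def last_record_start(next_xact_start, last_record_len):
    """Start of a transaction's last undo record.

    Assumes, as in the example, that the length of the last record can
    be read from the bytes just before the next transaction's first
    record, so it is simply passed in here.
    """
    return next_xact_start - last_record_len


# Transaction2 starts at 1101 and Transaction1's last record is 101 long.
print(last_record_start(1101, 101))  # 1000
```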

So now I am saying that maybe we don't need to compute the start of
the last undo record (1000), because it's enough to know the end of
the last undo record (1100). On whichever page the last undo record
ends, we can start from that page and read forward on that page.

* All numbers I used in the above example can be considered as undo
record pointers.
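
To make the page-level idea concrete, here is a minimal sketch. It
assumes pointers are plain integers, that a transaction's undo ends one
position before the next transaction starts (as in the numbers above),
and a hypothetical page size of 512; none of these names or values come
from the actual undo code.

```python
PAGE_SIZE = 512  # hypothetical value, chosen only to keep the numbers small


def forward_read_start(next_xact_start, page_size=PAGE_SIZE):
    """Return (page number, first position on that page) for the page on
    which the transaction's last undo record ends."""
    end_of_last_record = next_xact_start - 1
    page_no = end_of_last_record // page_size
    return page_no, page_no * page_size


# Transaction1 ends at 1100, which falls on page 2 (positions 1024..1535).
print(forward_read_start(1101))  # (2, 1024)
```

Note that only the start of the next transaction is needed here; the
length of the last record is never consulted.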

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
