Re: Reviewing freeze map code

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reviewing freeze map code
Date: 2016-06-10 08:46:44
Message-ID: CAD21AoCFy56srzXnbjGPXQXOAji97W0rX-eEuju+PjGjmJK2jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 10, 2016 at 1:11 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jun 9, 2016 at 5:48 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Attached patch implements the above 2 functions. I have addressed the
>> comments by Sawada San and you in latest patch and updated the documentation
>> as well.
>
> I made a number of changes to this patch. Here is the new version.
>
> 1. The algorithm you were using for growing the array size is unsafe
> and can easily overrun the array. Suppose that each of the first two
> pages have some corrupt tuples, more than 50% of MaxHeapTuplesPerPage
> but less than the full value of MaxTuplesPerPage. Your code will
> conclude that the array does need to be enlarged after processing the
> first page. I switched this to what I consider the normal coding
> pattern for such problems.
>
> 2. The all-visible checks seemed to me to be incorrect and incomplete.
> I made the check match the logic in lazy_scan_heap.
>
> 3. Your 1.0 -> 1.1 upgrade script was missing copies of the REVOKE
> statements you added to the 1.1 script. I added them.
>
> 4. The tests as written were not safe under concurrency; they could
> return spurious results if the page changed between the time you
> checked the visibility map and the time you actually examined the
> tuples. I think people will try running these functions on live
> systems, so I changed the code to recheck the VM bits after locking
> the page. Unfortunately, there's either still a concurrency-related
> problem here or there's a bug in the all-frozen code itself because I
> once managed to get pg_check_frozen('pgbench_accounts') to return a
> TID while pgbench was running concurrently. That's a bit alarming,
> but since I can't reproduce it I don't really have a clue how to track
> down the problem.
>
> 5. I made various cosmetic improvements.
>
> If there are not objections, I will go ahead and commit this tomorrow,
> because even if there is a bug (see point #4 above) I think it's
> better to have this in the tree than not. However, code review and/or
> testing with these new functions seems like it would be an extremely
> good idea.
>

Thank you for working on this.
Here are some minor comments.

---
+/*
+ * Return the TIDs of not-all-visible tuples in pages marked all-visible

If there is even one non-visible tuple in pages marked all-visible,
the database might be corrupted.
Is it better "not-visible" or "non-visible" instead of "not-all-visible"?
---
Do we need to check page header flag?
I think that database also might be corrupt in case where there is
non-visible tuple in page set PD_ALL_VISIBLE.
We could emit the WARNING log in such case.

Also, using attached tool which allows us to set spurious visibility
map status without actual modifying the tuple , I manually made the
some situations where database is corrupted and tested it, but ISTM
that this tool works fine.
It doesn't mean proposing as a new feature of course, but please use
it as appropriate.

Regards,

--
Masahiko Sawada

Attachment Content-Type Size
corrupted_test.sql application/octet-stream 1.7 KB
set_spurious_vm_status.patch binary/octet-stream 3.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-06-10 08:52:58 Re: Online DW
Previous Message Michael Paquier 2016-06-10 08:39:59 Re: [BUG] pg_basebackup from disconnected standby fails