Re: Reviewing freeze map code

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reviewing freeze map code
Date: 2016-06-10 06:28:26
Message-ID: CAA4eK1+ALN59onLv7xfwig4HHLtMoXdfCANm3gAGptODzxLMKQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 10, 2016 at 8:27 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:

>
>
> On June 9, 2016 7:46:06 PM PDT, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >On Fri, Jun 10, 2016 at 8:08 AM, Andres Freund <andres(at)anarazel(dot)de>
> >wrote:
> >
> >> On 2016-06-09 19:33:52 -0700, Andres Freund wrote:
> >> > I played with it for a while, and besides
> >> > finding intentionally caused corruption, it didn't flag anything
> >> > (besides crashing on a standby, as in 2)).
> >>
> >> Ugh. Just seconds after I sent that email:
> >>
> >>        oid        |    t_ctid
> >> ------------------+--------------
> >>  pgbench_accounts | (889641,33)
> >>  pgbench_accounts | (893854,56)
> >>  pgbench_accounts | (924226,13)
> >>  pgbench_accounts | (1073457,51)
> >>  pgbench_accounts | (1084904,16)
> >>  pgbench_accounts | (1111996,26)
> >> (6 rows)
> >>
> >>  oid | t_ctid
> >> -----+--------
> >> (0 rows)
> >>
> >>        oid        |    t_ctid
> >> ------------------+--------------
> >>  pgbench_accounts | (739198,13)
> >>  pgbench_accounts | (887254,11)
> >>  pgbench_accounts | (1050391,6)
> >>  pgbench_accounts | (1158640,46)
> >>  pgbench_accounts | (1238067,18)
> >>  pgbench_accounts | (1273282,22)
> >>  pgbench_accounts | (1355816,54)
> >>  pgbench_accounts | (1361880,33)
> >> (8 rows)
> >>
> >>
> >Is this output of pg_check_visible() or pg_check_frozen()?
>
> Unfortunately I don't know. I was running a union of both, I didn't really
> expect to hit an issue... I guess I'll put a PANIC in the relevant places
> and check whether I can reproduce.
>
>

I tried several ways of running pgbench with read-write tests, but could
not reproduce this behaviour. I even tried crashing and restarting the
server and then running pgbench again. Do you see these records on the
master or the slave?

While looking at the code in this area, I observed that during replay of
records (heap_xlog_delete), we first clear the vm bit and then update the
page. So we don't hold the buffer lock while updating the vm, whereas the
patch (collect_corrupt_items()) relies on the fact that clearing a vm bit
requires acquiring the buffer lock. Can that cause a problem?
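
One way to reason about that question is to enumerate the possible
interleavings in a toy model. The sketch below is not PostgreSQL code: every
name in it is hypothetical, replay is reduced to the two steps described
above, and the checker is reduced to one read of the vm bit followed by one
read of the page. It only illustrates the ordering argument and settles
nothing about the real code.

```c
#include <stdbool.h>

/*
 * Toy model only; none of these names exist in PostgreSQL.
 * Replay is modeled as two steps applied in the order described above:
 *   step 1: clear the vm bit (no heap buffer lock held)
 *   step 2: lock the heap page and mark the tuple deleted
 * "steps_done" is how many of those steps have completed (0, 1 or 2).
 */
static bool vm_bit_set(int steps_done)    { return steps_done < 1; }
static bool tuple_deleted(int steps_done) { return steps_done >= 2; }

/*
 * The checker reads the vm bit after vm_at replay steps and the page after
 * page_at steps (vm_at <= page_at). It flags the tuple when the vm bit it
 * saw was set but the tuple it saw was deleted.
 */
static bool checker_flags(int vm_at, int page_at)
{
    return vm_bit_set(vm_at) && tuple_deleted(page_at);
}

/*
 * Count the interleavings in which the checker flags the tuple.
 * If lock_spans_reads is true, the checker holds the page lock across both
 * reads, so step 2 (which needs that lock) cannot run between them.
 */
static int count_flagged(bool lock_spans_reads)
{
    int flagged = 0;

    for (int vm_at = 0; vm_at <= 2; vm_at++)
    {
        for (int page_at = vm_at; page_at <= 2; page_at++)
        {
            /* step 2 ran between the two reads: impossible under the lock */
            bool step2_between = (vm_at < 2 && page_at == 2);

            if (lock_spans_reads && step2_between)
                continue;
            if (checker_flags(vm_at, page_at))
                flagged++;
        }
    }
    return flagged;
}
```

In this toy model, count_flagged(true) is 0 and count_flagged(false) is 1:
the only flagged interleaving is the one where the vm bit is read before
step 1 and the page is read after step 2, which requires the page update to
slip in between the checker's two reads. Whether the real checker and the
real replay path match these assumptions is exactly the open question.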

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
