Re: Broken hint bits (freeze)

From: Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
To: Dmitriy Sarafannikov <dsarafannikov(at)yandex(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org, Borodin Vladimir <root(at)simply(dot)name>
Subject: Re: Broken hint bits (freeze)
Date: 2017-05-24 11:27:52
Message-ID: 72B8C80D-6F47-40F7-AECD-34008F98E55E@yandex.ru
Lists: pgsql-hackers

We found that this problem also appears on shards with checksums enabled.
This shard is still on timeline 1, which means there has been no switchover since the upgrade to 9.6.

xdb11f(master)=# select pg_current_xlog_location(), pg_xlogfile_name(pg_current_xlog_location());
 pg_current_xlog_location |     pg_xlogfile_name
--------------------------+--------------------------
 30BA/5966AD38            | 00000001000030BA00000059
(1 row)

xdb11f(master)=# select * from page_header(get_raw_page('mytable', 1787));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1F43/8C432C60 |    -3337 |     5 |   256 |   304 |    8192 |     8192 |       4 |         0
(1 row)

xdb11h(replica)=# select * from page_header(get_raw_page('mytable', 1787));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1B28/45819C28 |   -17617 |     5 |   256 |   304 |    8192 |     8192 |       4 |         0
(1 row)

xdb11e(replica)=# select * from page_header(get_raw_page('mytable', 1787));
      lsn      | checksum | flags | lower | upper | special | pagesize | version | prune_xid
---------------+----------+-------+-------+-------+---------+----------+---------+-----------
 1B28/45819C28 |   -17617 |     5 |   256 |   304 |    8192 |     8192 |       4 |         0
(1 row)

The master has a newer copy of the page (later page LSN), and the freeze bits are set there but not on the replicas.
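For reference, the page LSNs shown above can be compared numerically. The following is only a minimal Python sketch, assuming the usual "hi/lo" hexadecimal text form of an LSN; the helper name parse_lsn is hypothetical and not part of PostgreSQL.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Values copied from the page_header() and pg_current_xlog_location() output above.
master_page = parse_lsn("1F43/8C432C60")
replica_page = parse_lsn("1B28/45819C28")
master_current = parse_lsn("30BA/5966AD38")

# The master's copy of the page was last modified later than the replicas' copy.
print(master_page > replica_page)    # True

# Both page LSNs are behind the master's current WAL position.
print(master_current > master_page)  # True
```

On the server side the same comparison can be done by casting the values to pg_lsn.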

xdb11f(master)=# select t_xmin, t_infomask::bit(32) & X'0300'::int::bit(32) from heap_page_items(get_raw_page('mytable', 1787)) where lp = 42;
  t_xmin   |             ?column?
-----------+----------------------------------
 516651778 | 00000000000000000000001100000000
(1 row)

xdb11h(replica)=# select t_xmin, t_infomask::bit(32) & X'0300'::int::bit(32) from heap_page_items(get_raw_page('mytable', 1787)) where lp = 42;
  t_xmin   |             ?column?
-----------+----------------------------------
 516651778 | 00000000000000000000000000000000
(1 row)

xdb11e(replica)=# select t_xmin, t_infomask::bit(32) & X'0300'::int::bit(32) from heap_page_items(get_raw_page('mytable', 1787)) where lp = 42;
  t_xmin   |             ?column?
-----------+----------------------------------
 516651778 | 00000000000000000000000000000000
(1 row)
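
To make the masked values above easier to read: 0x0300 is the combination of HEAP_XMIN_COMMITTED (0x0100) and HEAP_XMIN_INVALID (0x0200), which together are treated as HEAP_XMIN_FROZEN. A small Python sketch of that decoding follows; the function name xmin_status is hypothetical and the input is the bit string as printed by the queries above.

```python
# Flag values as defined in PostgreSQL's htup_details.h.
HEAP_XMIN_COMMITTED = 0x0100
HEAP_XMIN_INVALID = 0x0200
HEAP_XMIN_FROZEN = HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID


def xmin_status(masked_bits: str) -> str:
    """Interpret the bit(32) string produced by t_infomask & X'0300'."""
    bits = int(masked_bits, 2) & HEAP_XMIN_FROZEN
    if bits == HEAP_XMIN_FROZEN:
        return "frozen"
    if bits == HEAP_XMIN_COMMITTED:
        return "committed"
    if bits == HEAP_XMIN_INVALID:
        return "invalid"
    return "no xmin hint bits set"


print(xmin_status("00000000000000000000001100000000"))  # master:   frozen
print(xmin_status("00000000000000000000000000000000"))  # replicas: no xmin hint bits set
```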

It seems the replicas did not replay the corresponding WAL records.
Any thoughts?

Regards,
Dmitriy Sarafannikov
