Skip site navigation (1) Skip section navigation (2)

Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)
Date: 2011-08-01 09:38:13
Message-ID: 4E367405.3000106@enterprisedb.com (view raw or flat)
Thread:
Lists: pgsql-hackers
On 27.07.2011 17:43, Alexander Korotkov wrote:
> OK, thanks. I also found behaviour of GiST(without patch) with streaming
> replication that seems strange for me. On master there are only few
> rightlinks are InvalidBlockNumber while on slave there are a lot of them. I
> hack gevel for getting index structure on slave (accessing tree without
> AccessExclusiveLock).
>
> On master:
> # create table test as (select point(random(),random()) from
> generate_series(1,100000));
> # create index test_idx on test using gist(point);
> # \copy (select gist_tree('test_idx')) to 'tree1r.txt';
>
> On slave:
> # \copy (select gist_tree('test_idx')) to 'tree2r.txt';
>
> In bash:
> # cat tree1r.txt | sed 's/\\n/\n/g'>  tree1.txt
> # cat tree2r.txt | sed 's/\\n/\n/g'>  tree2.txt
> # diff tree1.txt tree2.txt
>
> 2,89c2,89
> <      1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:637 (OK)
> <          1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%) rightlink:319
> (OK)
> <          2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%) rightlink:153
> (OK)
> <          3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%) rightlink:551
> (OK)
> <          4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%) rightlink:555
> (OK)
> <          5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%) rightlink:74
> (OK)
> <          6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%) rightlink:294
> (OK)
> <          7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%) rightlink:567
> (OK)
> .....
> ---
>>      1(l:1) blk: 324 numTuple: 129 free: 2472b(69.71%) rightlink:4294967295
> (InvalidBlockNumber)
>>          1(l:2) blk: 242 numTuple: 164 free: 932b(88.58%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          2(l:2) blk: 525 numTuple: 121 free: 2824b(65.39%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          3(l:2) blk: 70 numTuple: 104 free: 3572b(56.23%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          4(l:2) blk: 384 numTuple: 106 free: 3484b(57.30%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          5(l:2) blk: 555 numTuple: 121 free: 2824b(65.39%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          6(l:2) blk: 564 numTuple: 109 free: 3352b(58.92%)
> rightlink:4294967295 (InvalidBlockNumber)
>>          7(l:2) blk: 165 numTuple: 108 free: 3396b(58.38%)
> rightlink:4294967295 (InvalidBlockNumber)
> .....
>
> Isn't it a bug?

Yeah, it sure looks like a bug. I was certain that I had broken this in 
the recent changes to GiST handling of page splits, but in fact it has 
been like that forever.

The rightlinks are not needed after crash recovery, because all the 
downlinks should be there. A scan will find all pages through the 
downlinks, and doesn't need to follow any rightlinks. I'm not sure why 
we explicitly clear them, it's not like the rightlinks would do any harm 
either, but for crash recovery that's harmless.

But a scan during hot standby can see those intermediate states, just 
like concurrent scans can on the master. The locking on replay of page 
split needs to be fixed, too. At the moment, it locks and writes out 
each page separately, so a concurrent scan could "overtake" the WAL 
replay while following rightlinks, and miss tuples on the right half.

Attached is a patch for that for 9.1/master. The 9.0 GiST replay code 
was quite different, it will require a separate patch.

-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Attachment: fix-gist-hot-standby-1.patch
Description: text/x-diff (2.3 KB)

In response to

Responses

pgsql-hackers by date

Next:From: Simon RiggsDate: 2011-08-01 10:13:54
Subject: Re: Hot standby and GiST page splits (was Re: WIP: Fast GiST index build)
Previous:From: Simon RiggsDate: 2011-08-01 09:24:43
Subject: Re: One-Shot Plans

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group