Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)
Date: 2012-02-09 11:02:04
Message-ID: CAHGQGwEv2qpCoX0jtV8SFDXVDp9pQtLuD2WZCJ3di5difGQjxw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 9, 2012 at 7:25 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Thu, Feb 9, 2012 at 3:32 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>> On Wed, Feb 1, 2012 at 11:46 PM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> On 31.01.2012 17:35, Fujii Masao wrote:
>>>>
>>>> On Fri, Jan 20, 2012 at 11:11 PM, Heikki Linnakangas
>>>> <heikki(dot)linnakangas(at)enterprisedb(dot)com>  wrote:
>>>>>
>>>>> On 20.01.2012 15:32, Robert Haas wrote:
>>>>>>
>>>>>>
>>>>>> On Sat, Jan 14, 2012 at 9:32 AM, Heikki Linnakangas
>>>>>> <heikki(dot)linnakangas(at)enterprisedb(dot)com>    wrote:
>>>>>>>
>>>>>>>
>>>>>>> Here's another version of the patch to make XLogInsert less of a
>>>>>>> bottleneck
>>>>>>> on multi-CPU systems. The basic idea is the same as before, but several
>>>>>>> bugs
>>>>>>> have been fixed, and lots of misc. clean up has been done.
>>>>>>
>>>>>>
>>>>>>
>>>>>> This seems to need a rebase.
>>>>>
>>>>>
>>>>>
>>>>> Here you go.
>>>>
>>>>
>>>> The patch seems to need a rebase again.
>>>
>>>
>>> Here you go again. It conflicted with the group commit patch, and the patch
>>> to WAL-log and track changes to full_page_writes setting.
>>
>>
>> After applying this patch and then forcing crashes, upon recovery the
>> database is not correct.
>>
>> If I make a table with 10,000 rows and then after that intensively
>> update it using a unique key:
>>
>> update foo set count=count+1 where foobar=?
>>
>> Then after the crash there are fewer than 10,000 visible rows:
>>
>> select count(*) from foo
>>
>> This is not a subtle thing, it happens every time.  I get counts of
>> between 1973 and 8827.  Without this patch I always get exactly
>> 10,000.
>>
>> I don't really know where to start on tracking this down.
>
> A similar problem happened in my test. When I executed CREATE TABLE and
> shut down the server in immediate mode, I could not see the created table
> after recovery. Here is the server log of recovery with wal_debug = on:
>
> LOG:  database system was interrupted; last known up at 2012-02-09 19:18:50 JST
> LOG:  database system was not properly shut down; automatic recovery in progress
> LOG:  redo starts at 0/179CC90
> LOG:  REDO @ 0/179CC90; LSN 0/179CCB8: prev 0/179CC30; xid 0; len 4:
> XLOG - nextOid: 24576
> LOG:  REDO @ 0/179CCB8; LSN 0/179CCE8: prev 0/179CC90; xid 0; len 16:
> Storage - file create: base/12277/16384
> LOG:  REDO @ 0/179CCE8; LSN 0/179DDE0: prev 0/179CCB8; xid 998; len
> 21; bkpb1: Heap - insert: rel 1663/12277/12014; tid 7/22
> LOG:  there is no contrecord flag in log file 0, segment 1, offset 7987200
> LOG:  redo done at 0/179CCE8
>
> Judging from the log message "there is no contrecord flag", ISTM the patch
> handles the contrecord of a backup block incorrectly, which causes the problem.

Yep, as far as I can tell from reading the patch, it seems to have forgotten
to set the XLP_FIRST_IS_CONTRECORD flag.
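
To illustrate what that flag is for, here is a minimal sketch, not code from
the patch. When a WAL record spills over from one page onto the next, the
header of the next page is expected to carry XLP_FIRST_IS_CONTRECORD in
xlp_info, and recovery gives up with "there is no contrecord flag" when it
does not find it. The flag value below matches xlog_internal.h; the
SketchPageHeader struct and the sketch_* functions are hypothetical,
trimmed-down stand-ins for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bit as defined in PostgreSQL's xlog_internal.h. */
#define XLP_FIRST_IS_CONTRECORD 0x0001

/*
 * Hypothetical, trimmed-down page header. The real XLogPageHeaderData
 * has more fields (magic number, timeline, page address, ...).
 */
typedef struct SketchPageHeader
{
    uint16_t    xlp_info;       /* flag bits for this WAL page */
} SketchPageHeader;

/*
 * Writer side (hypothetical helper): call this when initializing a page
 * whose first bytes continue a record started on the previous page.
 */
static void
sketch_mark_continuation(SketchPageHeader *hdr)
{
    assert(hdr != NULL);
    hdr->xlp_info |= XLP_FIRST_IS_CONTRECORD;
}

/*
 * Reader side (hypothetical helper): after consuming the first part of a
 * record at the end of a page, recovery checks the next page's header.
 * A false result here corresponds to the "there is no contrecord flag"
 * message, after which redo stops at the last complete record.
 */
static bool
sketch_page_continues_record(const SketchPageHeader *hdr)
{
    assert(hdr != NULL);
    return (hdr->xlp_info & XLP_FIRST_IS_CONTRECORD) != 0;
}
```

If the writer skips the sketch_mark_continuation() step, the reader-side
check fails on the first record that crosses a page boundary, and everything
logged after that point is not replayed. That would be consistent with both
the missing table and the missing rows reported above.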

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
