From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)
Date: 2012-02-15 16:01:37
Message-ID: 4F3BD6E1.40904@enterprisedb.com
Lists: pgsql-hackers

On 13.02.2012 19:13, Fujii Masao wrote:
> On Mon, Feb 13, 2012 at 8:37 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 13.02.2012 01:04, Jeff Janes wrote:
>>>
>>> Attached is my quick and dirty attempt to set XLP_FIRST_IS_CONTRECORD.
>>> I have no idea if I did it correctly, in particular if calling
>>> GetXLogBuffer(CurrPos) twice is OK or if GetXLogBuffer has side
>>> effects that make that a bad thing to do. I'm not proposing it as the
>>> real fix, I just wanted to get around this problem in order to do more
>>> testing.
>>
>>
>> Thanks. That's basically the right approach. Attached patch contains a
>> cleaned up version of that.
>>
>>
>>> It does get rid of the "there is no contrecord flag" errors, but
>>> recover still does not work.
>>>
>>> Now the count of tuples in the table is always correct (I never
>>> provoke a crash during the initial table load), but sometimes updates
>>> to those tuples that were reported to have been committed are lost.
>>>
>>> This is more subtle, it does not happen on every crash.
>>>
>>> It seems that when recovery ends on "record with zero length at...",
>>> that recovery is correct.
>>>
>>> But when it ends on "invalid magic number 0000 in log file.." then the
>>> recovery is screwed up.
>>
>>
>> Can you write a self-contained test case for that? I've been trying to
>> reproduce that by running the regression tests and pgbench with a streaming
>> replication standby, which should be pretty much the same as crash recovery.
>> No luck this far.
>
> Probably I could reproduce the same problem as Jeff got. Here is the test case:
>
> $ initdb -D data
> $ pg_ctl -D data start
> $ psql -c "create table t (i int); insert into t
> values(generate_series(1,10000)); delete from t"
> $ pg_ctl -D data stop -m i
> $ pg_ctl -D data start
>
> The crash recovery emitted the following server logs:
>
> LOG: database system was interrupted; last known up at 2012-02-14 02:07:01 JST
> LOG: database system was not properly shut down; automatic recovery in progress
> LOG: redo starts at 0/179CC90
> LOG: invalid magic number 0000 in log file 0, segment 1, offset 8060928
> LOG: redo done at 0/17AD858
> LOG: database system is ready to accept connections
> LOG: autovacuum launcher started
>
> After recovery, I could not see the table "t" which I created before:
>
> $ psql -c "select count(*) from t"
> ERROR: relation "t" does not exist

Are you still seeing this failure with the latest patch I posted
(http://archives.postgresql.org/message-id/4F38F5E5.8050203@enterprisedb.com)?
That includes Jeff's fix for the original crash you and Jeff saw. With
that version, I can't get a crash anymore. I also can't reproduce the
inconsistency that Jeff still saw with his fix
(http://archives.postgresql.org/message-id/CAMkU=1zGWp2QnTjiyFe0VMu4gc+MoEexXYaVC2u=+ORfiYj6ow@mail.gmail.com).
Jeff, can you clarify whether you're still seeing an issue with the
latest version of the patch? If so, can you provide a self-contained
test case for it?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
