Re: Speed up Clog Access by increasing CLOG buffers

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speed up Clog Access by increasing CLOG buffers
Date: 2017-03-10 02:35:50
Message-ID: CA+TgmobrMF8ALx_7pGM+4G=i-o3NBf+FrB4bh6XHqUF7NuVgDA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 9, 2017 at 9:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I think eight is enough. Committed with some cosmetic changes.
>
> Buildfarm thinks eight wasn't enough.
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=clam&dt=2017-03-10%2002%3A00%3A01

At first I was confused about how you knew that this was the fault of
this patch, but this seems like a pretty good indicator:

TRAP: FailedAssertion("!(curval == 0 || (curval == 0x03 && status !=
0x00) || curval == status)", File: "clog.c", Line: 574)

I'm not sure whether it's related to this problem or not, but now that
I look at it, this (preexisting) comment looks like entirely wishful
thinking:

* If we update more than one xid on this page while it is being written
* out, we might find that some of the bits go to disk and others don't.
* If we are updating commits on the page with the top-level xid that
* could break atomicity, so we subcommit the subxids first before we mark
* the top-level commit.

The problem with that is the word "before". There are no memory
barriers here, so there's zero guarantee that other processes see the
writes in the order they're performed here. But it might be a stretch
to suppose that that would cause this symptom.
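
To make the ordering concern concrete, here is a minimal sketch in
standalone C11, not taken from clog.c. The two status slots, the XS_*
names, and both functions are hypothetical, invented only for
illustration. The release store paired with the acquire load is what
gives the word "before" a meaning across processes; without such a
pairing (PostgreSQL's own primitives for this include pg_write_barrier()
and pg_read_barrier()), a reader could in principle see the top-level
commit without the earlier subcommit.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical status codes for this sketch only. */
enum { XS_IN_PROGRESS = 0x00, XS_COMMITTED = 0x01, XS_SUB_COMMITTED = 0x03 };

/* Two hypothetical status slots standing in for a subxid and its parent. */
_Atomic unsigned char sub_status = XS_IN_PROGRESS;
_Atomic unsigned char top_status = XS_IN_PROGRESS;

/* Writer: mark the subxid first, then publish the top-level commit. */
void
publish_commit(void)
{
    atomic_store_explicit(&sub_status, XS_SUB_COMMITTED,
                          memory_order_relaxed);
    /* Release: the sub_status store above cannot move after this store. */
    atomic_store_explicit(&top_status, XS_COMMITTED,
                          memory_order_release);
}

/* Reader: true if what it observes is consistent with the intended order. */
bool
reader_sees_consistent_order(void)
{
    /* Acquire pairs with the release store in publish_commit(). */
    unsigned char top = atomic_load_explicit(&top_status,
                                             memory_order_acquire);
    unsigned char sub = atomic_load_explicit(&sub_status,
                                             memory_order_relaxed);

    /* If the top-level commit is visible, the subcommit must be too. */
    return top != XS_COMMITTED || sub == XS_SUB_COMMITTED;
}
```

With both stores and both loads relaxed, the compiler or CPU would be
free to reorder them, and reader_sees_consistent_order() could return
false on weakly ordered hardware.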

Maybe we should replace that Assert() with an elog() and dump out the
actual values.
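
A rough standalone sketch of that idea follows. It evaluates the same
condition as the failed assertion, but formats the offending values
instead of trapping. The function names and the message text are
hypothetical; in the backend the report would presumably go through
elog() at whatever level is judged appropriate, which this sketch does
not attempt to model.

```c
#include <stdbool.h>
#include <stdio.h>

/* Same condition as the assertion that failed at clog.c line 574. */
bool
status_change_allowed(int curval, int status)
{
    return curval == 0 ||
           (curval == 0x03 && status != 0x00) ||
           curval == status;
}

/*
 * Hypothetical replacement for the Assert(): returns true when the
 * transition is allowed; otherwise writes the actual values into buf
 * and returns false, so the caller can log them.
 */
bool
check_status_change(int curval, int status, char *buf, size_t buflen)
{
    if (status_change_allowed(curval, status))
        return true;

    snprintf(buf, buflen,
             "unexpected clog status change: curval = 0x%02x, status = 0x%02x",
             (unsigned int) curval, (unsigned int) status);
    return false;
}
```

The point of the change is only that a failure would show which pair of
values violated the condition, which the bare assertion text does not.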

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
