Re: Breakage with VACUUM ANALYSE + partitions

From: Thom Brown <thom(at)linux(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, coelho(at)cri(dot)ensmp(dot)fr
Subject: Re: Breakage with VACUUM ANALYSE + partitions
Date: 2016-03-25 13:25:41
Message-ID: CAA-aLv71MvgTBOPDyxCR3hoLmop-Hu373c9dwfymEL5BPnpv_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 25 March 2016 at 12:41, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> On Thu, Mar 24, 2016 at 9:40 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Mar 24, 2016 at 12:59 AM, Haribabu Kommi
>> <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>>> So further operations on the table uses the already constructed smgr relation
>>> and treats that there are RELSEG_SIZE number of blocks in the page and try
>>> to do the scan. But there are 0 pages in the table thus it produces the error.
>>>
>>> The issue doesn't occur from another session. Because of this reason only
>>> if we do only vacuum operation, the error not occurred.
>>
>> Yeah, I had a suspicion that this might have to do with invalidation
>> messages based on Thom's description, but I think we still need to
>> track down which commit is at fault.
>
> I could reproduce the failure on Linux, not on OSX, and bisecting the
> failure, the first bad commit is this one:
> commit: 428b1d6b29ca599c5700d4bc4f4ce4c5880369bf
> author: Andres Freund <andres(at)anarazel(dot)de>
> date: Thu, 10 Mar 2016 17:04:34 -0800
> Allow to trigger kernel writeback after a configurable number of writes.
>
> The failure is a little bit sporadic, based on my tests 1/2 runs out
> of 10 could pass, so one good commit was recognized as such after
> passing the SQL sequence sent by Thom 5 times in a row. I also did
> some manual tests and those are pointing to this commit as well.

I've had to adapt a test with waits in to get it to work more
reliably. I've just scattered it with sleeps:

\pset pager off
select pg_sleep(1);
CREATE TABLE pgbench_accounts_1 (CHECK (bid % 2 = 0)) INHERITS
(pgbench_accounts);
select pg_sleep(1);
CREATE TABLE pgbench_accounts_2 (CHECK (bid % 2 = 1)) INHERITS
(pgbench_accounts);
select pg_sleep(1);
WITH del AS (DELETE FROM pgbench_accounts WHERE bid % 2 = 0 RETURNING
*) INSERT INTO pgbench_accounts_1 SELECT * FROM del;
select pg_sleep(1);
WITH del AS (DELETE FROM pgbench_accounts WHERE bid % 2 = 1 RETURNING
*) INSERT INTO pgbench_accounts_2 SELECT * FROM del;
select pg_sleep(1);
VACUUM ANALYSE;
select pg_sleep(3);
EXPLAIN ANALYSE SELECT count(*) FROM pgbench_accounts;

But this brings me to the same commit you found.

Thom

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message John McKown 2016-03-25 13:32:19 Re: BUG #14046: Bad mathematical rules for 0 cast
Previous Message Michael Paquier 2016-03-25 12:41:51 Re: Breakage with VACUUM ANALYSE + partitions

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2016-03-25 13:36:00 Re: NOT EXIST for PREPARE
Previous Message Michael Paquier 2016-03-25 13:15:27 Re: [PATCH v8] GSSAPI encryption support