Re: [BUGS] Breakage with VACUUM ANALYSE + partitions

From: Andres Freund <andres(at)anarazel(dot)de>
To: Abhijit Menon-Sen <ams(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Breakage with VACUUM ANALYSE + partitions
Date: 2016-04-22 17:07:36
Message-ID: 20160422170736.bjrsusb4ubdhdtl4@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On 2016-04-14 11:09:29 +0530, Abhijit Menon-Sen wrote:
> At 2016-04-12 09:00:57 -0400, robertmhaas(at)gmail(dot)com wrote:
> >
> > On Mon, Apr 11, 2016 at 1:17 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > >
> > > 3) Actually handle the case of the last open segment not being
> > > RELSEG_SIZE properly in _mdfd_getseg() - mdnblocks() does so.
> >
> > #3 seems like it's probably about 15 years overdue, so let's do that
> > anyway.
>
> I don't have anything useful to contribute, I'm just trying to
> understand this fix.
>
> _mdfd_getseg() looks like this:
>
> targetseg = blkno / ((BlockNumber) RELSEG_SIZE);
> for (nextsegno = 1; nextsegno <= targetseg; nextsegno++)
> {
> Assert(nextsegno == v->mdfd_segno + 1);
>
> if (v->mdfd_chain == NULL)
> {
> /*
> * Normally we will create new segments only if authorized by the
> * caller (i.e., we are doing mdextend()). […]
> */
> if (behavior == EXTENSION_CREATE || InRecovery)
> {
> if (_mdnblocks(reln, forknum, v) < RELSEG_SIZE)
> /* mdextend */
> v->mdfd_chain = _mdfd_openseg(reln, forknum, +nextsegno, O_CREAT);
> }
> else
> {
> /* We won't create segment if not existent */
> v->mdfd_chain = _mdfd_openseg(reln, forknum, nextsegno, 0);
> }
> if (v->mdfd_chain == NULL)
> /* return NULL or ERROR */
> }
> v = v->mdfd_chain;
> }
>
> Do I understand correctly that the case of the "last open segment"
> (i.e., the one for which mdfd_chain == NULL) not being RELSEG_SIZE
> (i.e., _mdnblocks(reln, forknum, v) < RELSEG_SIZE) should not call
> _mfdf_openseg on nextsegno if behaviour is not EXTENSION_CREATE or
> InRecovery?
>
> And that "We won't create segment if not existent" should happen, but
> doesn't only because the next segment file wasn't removed earlier, so
> we have to add an extra check for that case?
>
> In other words, is something like the following what's needed here, or
> is there more to it?

It's a bit more complicated than that, but that's the principle. Thanks
for the patch, and sorry for my late response. See my attached version
for a more fleshed out version of this.

Looking at the code around this I found:
* _mdfd_getseg(), in contrast to mdnblocks() doesn't check segment size,
neither whether to continue with a too small, nor to error out with a
too big segment
* For, imo pretty bad, reasons in mdsync() we currently rely on
EXTENSION_RETURN_NULL to extend the file in recovery, and to set errno
to ENOENT. Brrr.
* we currently FATAL if a segment is too big - I've copied that
behaviour, but why not just ERROR out?

The attached patch basically adds the segment size checks to
_mdfd_getseg(), and doesn't perform extension, even in recovery, if
EXTENSION_REALLY_RETURN_NULL is passed.

This fixes the issue for me, both in the originally reported variant,
and in some more rudely setup version (i.e. rm'ing files).

I'm seing some weird behaviour (even with writeback flushing disabled)
with smgr invals and recovery/standbys, but I don't want to dive into it
before http://www.postgresql.org/message-id/CAD21AoDpZ6Xjg=gFrGPnSn4oTRRcwK1EBrWCq9OqOHuAcMMC=w@mail.gmail.com
is addressed, it's too likely to be related.

Greetings,

Andres Freund

Attachment Content-Type Size
0001-Don-t-open-formally-non-existant-segments-in-_mdfd_g.patch text/x-patch 5.6 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Mathias Kunter 2016-04-22 20:59:49 Re: BUG #14107: Major query planner bug regarding subqueries and indices
Previous Message Victor Yegorov 2016-04-22 16:57:02 Re: BUG #14107: Major query planner bug regarding subqueries and indices

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2016-04-22 18:20:52 Re: GIN data corruption bug(s) in 9.6devel
Previous Message Andres Freund 2016-04-22 16:36:14 Re: kqueue