Re: WIP: Avoid creation of the free space map for small tables

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Avoid creation of the free space map for small tables
Date: 2019-02-02 02:00:18
Message-ID: CAA4eK1+MoiSq_ZPjHchhF7=GX36yoo+65ojGVFahCSnFPxFsyQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jan 28, 2019 at 4:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Jan 28, 2019 at 10:03 AM John Naylor
> <john(dot)naylor(at)2ndquadrant(dot)com> wrote:
> >
> > On Mon, Jan 28, 2019 at 4:53 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > There are a few buildfarm failures due to this commit, see my email on
> > > pgsql-committers. If you have time, you can also once look into
> > > those.
> >
> > I didn't see anything in common with the configs of the failed
> > members. None have a non-default BLCKSZ that I can see.
> >
>
> I have done an analysis of the different failures on buildfarm.
>

Over the past few days, we have analyzed each problem further and
tried to reproduce it. For 3 of the 4 problems we were able to build
some form of reproducer that fails in the same way as on the
buildfarm. For the fourth symptom, we tried hard (Andrew Dunstan even
helped us by running the regression tests with the faulty commit on
Jacana for many hours, but the failure did not reproduce), but we were
not able to regenerate the failure in a similar way. However, I have a
theory, described below, for why that particular test could fail, and
the patch includes a fix for it. I am planning to push the latest
version of the patch [1], which has fixes for all the symptoms. Does
anybody have any opinion here?

>
> 4. Failure on jacana:
> --- c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/test/regress/expected/box.out 2018-09-26 17:53:33 -0400
> +++ c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/src/test/regress/results/box.out 2019-01-27 23:14:35 -0500
> @@ -252,332 +252,7 @@
> ('(0,100)(0,infinity)'),
> ('(-infinity,0)(0,infinity)'),
> ('(-infinity,-infinity)(infinity,infinity)');
> -SET enable_seqscan = false;
> -SELECT * FROM box_temp WHERE f1 << '(10,20),(30,40)';
> ..
> ..
> TRAP: FailedAssertion("!(!(fsm_local_map.nblocks > 0))", File: "c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c", Line: 1118)
> ..
> 2019-01-27 23:14:35.495 EST [5c4e81a0.2e28:4] LOG: server process (PID 14388) exited with exit code 3
> 2019-01-27 23:14:35.495 EST [5c4e81a0.2e28:5] DETAIL: Failed process was running: INSERT INTO box_temp
> VALUES (NULL),
>
> I think the reason for this failure is the same as for the previous one
> (as mentioned in point 3), but it can happen in a different way. Say we
> have searched the local map and then try to extend a relation 'X', and
> in the meantime another backend has extended it such that an FSM gets
> created. Now we will reuse that page and won't clear the local map. Then,
> say we try to insert into relation 'Y', which doesn't have an FSM. It
> will try to set the local map, find that one already exists, and fail.
> Now, the question is how this can happen in the box.sql test. I guess
> it is happening for some system table that is being populated by the
> Create Index statement executed just before the failing Insert.
>
> I think both 3 and 4 are timing issues, so we didn't hit them in our
> local regression runs.
>
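
To make the theory for point 4 concrete, here is a toy model of the sequence, not the code from freespace.c. All names (LocalMap, local_map_set, local_map_clear, scenario) and the threshold of 4 blocks are made up for illustration; the only thing taken from the report is that the failing assertion checks that no local map exists (nblocks > 0) when a new one is about to be set. The clear_on_reuse flag models one plausible remedy, which is an assumption here, since the fix itself is only in the patch [1].

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LOCAL_MAP_BLOCKS 4      /* hypothetical "small table" threshold */

/* Toy stand-in for the backend-local map: one per backend, shared by
 * every relation that backend touches. */
typedef struct
{
    int  nblocks;                       /* > 0 means a map is currently set */
    bool try_block[LOCAL_MAP_BLOCKS];   /* blocks still worth trying */
} LocalMap;

static LocalMap local_map;

static bool local_map_exists(void)
{
    return local_map.nblocks > 0;
}

static void local_map_clear(void)
{
    memset(&local_map, 0, sizeof(local_map));
}

/* Returns false in the situation where the reported assertion would
 * fire: a map is already set when we try to set a new one. */
static bool local_map_set(int rel_nblocks)
{
    if (local_map_exists())
        return false;

    local_map.nblocks = rel_nblocks;
    for (int i = 0; i < rel_nblocks && i < LOCAL_MAP_BLOCKS; i++)
        local_map.try_block[i] = true;
    return true;
}

/* Insert into small relation X: set and search the local map, decide to
 * extend, then discover that another backend already extended X (so an
 * FSM now exists) and reuse its page. */
static void insert_with_concurrent_extension(int rel_nblocks, bool clear_on_reuse)
{
    (void) local_map_set(rel_nblocks);

    /* ... no listed block had room; the concurrent extension is noticed
     * here and the page added by the other backend is reused ... */

    if (clear_on_reuse)
        local_map_clear();      /* assumed remedy: drop the stale map */
}

/* Returns true if the following insert into relation Y (no FSM) can set
 * up its own local map. */
static bool scenario(bool clear_on_reuse)
{
    local_map_clear();
    insert_with_concurrent_extension(3, clear_on_reuse);
    return local_map_set(2);
}
```

Calling scenario(false) leaves the map from relation X in place, so setting the map for relation Y is refused, which corresponds to the TRAP above. Calling scenario(true) clears the map on the reuse path and the second insert proceeds.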

[1] - https://www.postgresql.org/message-id/CAA4eK1%2B3ajhRPC0jvUi6p_aMrTUpB568OBH10LrbHtvOLNTgqQ%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
