Re: shm_mq_set_sender() crash

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: shm_mq_set_sender() crash
Date: 2016-08-27 07:43:45
Message-ID: CAA4eK1L6ZEd+R9Wo5jLtX12ohy+FjHMMe7Y_Rr27E9mEgOht+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 26, 2016 at 6:20 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Latest from lorikeet:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lorikeet&dt=2016-08-26%2008%3A37%3A27
>
> TRAP: FailedAssertion("!(vmq->mq_sender == ((void *)0))", File: "/home/andrew/bf64/root/REL9_6_STABLE/pgsql.build/../pgsql/src/backend/storage/ipc/shm_mq.c", Line: 220)
>

Do you think, it is due to some recent change or we are just seeing
now as it could be timing specific issue?

So here what seems to be happening is that during worker startup, we
are trying to set the sender for a shared memory queue and the same is
already set. Now, one theoretical possibility of the same could be
that the two workers get the same ParallelWorkerNumber which is then
used to access the shm queue (refer
ParallelWorkerMain/ExecParallelGetReceiver). We are setting the
ParallelWorkerNumber in below code which seems to be doing what it is
suppose to do:

LaunchParallelWorkers()
{
..
for (i = 0; i < pcxt->nworkers; ++i)
{
memcpy(worker.bgw_extra, &i, sizeof(int));
if (!any_registrations_failed &&
RegisterDynamicBackgroundWorker(&worker,
&pcxt->worker[i].bgwhandle))
..
}

Can some reordering impact the above code?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-08-27 08:33:14 Re: Renaming of pg_xlog and pg_clog
Previous Message Amit Kapila 2016-08-27 05:36:39 Re: WAL consistency check facility