Re: SerializeParamList vs machines with strict alignment

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>
Subject: Re: SerializeParamList vs machines with strict alignment
Date: 2018-09-10 06:52:34
Message-ID: CAA4eK1JfEc=cqiUiRvUcHYCf=PVEwM_bZ_QiOMHpqGpUMdY8gA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I wondered why buildfarm member chipmunk has been failing hard
> for the last little while. Fortunately, it's supplying us with
> a handy backtrace:
>
> Program terminated with signal 7, Bus error.
> #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> 329 aresult->dataoffset = dataoffset;
> #0 EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> #1 EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293
> #2 0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) at expandeddatum.c:84
> #3 0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>, start_address=0xbea3bd54) at datum.c:341
> #4 0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195
> #5 0x002342cc in ExecInitParallelPlan (planstate=0xffffffff, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1) at execParallel.c:700
> #6 0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151
> #7 0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445
> #8 0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237
> #9 ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0) at execMain.c:1721
> #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
> #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867
> #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164
>
> This is remarkably hard to replicate on other machines, but I eventually
> managed to duplicate it on gaur's host, after which it became really
> obvious that the parallel-query data transfer logic has never been
> stressed very hard on machines with strict data alignment rules.
>
> In particular, SerializeParamList does this:
>
> /* Write flags. */
> memcpy(*start_address, &prm->pflags, sizeof(uint16));
> *start_address += sizeof(uint16);
>
> immediately followed by this:
>
> datumSerialize(prm->value, prm->isnull, typByVal, typLen,
> start_address);
>
> and datumSerialize might do this:
>
> EOH_flatten_into(eoh, (void *) *start_address, header);
>
> Now, I will plead mea culpa that the expanded-object API doesn't
> say in large red letters that the target address for EOH_flatten_into
> is supposed to be maxaligned. It only says
>
> * The flattened representation must be a valid in-line, non-compressed,
> * 4-byte-header varlena object.
>
> Still, one might reasonably suspect from that that *at least* 4-byte
> alignment is expected.
>

datumSerialize does this:

memcpy(*start_address, &header, sizeof(int));
*start_address += sizeof(int);

before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned.

> This code path isn't providing such alignment,
> and machines that require it will crash.

Yeah, I agree with your suggestion that start_address should be maxaligned.
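
To make the offsets concrete, here is a minimal sketch, not the actual PostgreSQL code. It assumes the buffer is maxaligned at the point where the flags are written, that nothing else is written between the two memcpy calls quoted above, that int is 4 bytes, and that maximum alignment is 8. The names sketch_maxalign, sketch_unpadded_offset, sketch_self_check, and SKETCH_MAXIMUM_ALIGNOF are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the platform's maximum alignment (assumed 8). */
#define SKETCH_MAXIMUM_ALIGNOF 8

/* Round an offset up to the next multiple of SKETCH_MAXIMUM_ALIGNOF. */
static size_t
sketch_maxalign(size_t offset)
{
    return (offset + (SKETCH_MAXIMUM_ALIGNOF - 1)) &
        ~((size_t) (SKETCH_MAXIMUM_ALIGNOF - 1));
}

/*
 * Offset at which the flattened object would begin if a uint16 flags
 * field and a 4-byte header are written back to back with no padding,
 * starting from an aligned offset of 0 (an assumption of this sketch).
 */
static size_t
sketch_unpadded_offset(void)
{
    size_t offset = 0;

    offset += sizeof(uint16_t);     /* flags, as in the quoted memcpy */
    offset += sizeof(int32_t);      /* header, assuming 4-byte int */
    return offset;
}

/* Check that padding up to max alignment gives a usable target offset. */
static void
sketch_self_check(void)
{
    size_t raw = sketch_unpadded_offset();

    assert(raw % 4 != 0);           /* 2 + 4 = 6 under these assumptions */
    assert(sketch_maxalign(raw) % SKETCH_MAXIMUM_ALIGNOF == 0);
    assert(sketch_maxalign(raw) >= raw);
}
```

Under these assumptions the unpadded target offset is 6, and rounding it up gives 8. Whether the real code path reaches the same offsets depends on what else is written into the buffer before this point.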

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
