Re: SerializeParamList vs machines with strict alignment

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>
Subject: Re: SerializeParamList vs machines with strict alignment
Date: 2018-10-01 20:21:52
Message-ID: 27748.1538425312@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Sun, Sep 30, 2018 at 10:44 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Attached is a patch along these lines, let me know if you have
>> something else in mind? I have manually verified this with
>> force_parallel_mode=regress configuration in my development
>> environment. I don't have access to alignment-sensitive hardware, so
>> can't test in such an environment. I will see if I can write a test
>> as well.

> Attached patch contains a test case as well to cover this code.

This test case doesn't actually trigger the problem. There are two
reasons: (1) it never invokes datumSerialize with an odd starting
address, and (2) the expanded array produced by the plpgsql function
contains a "flat" array, allowing EA_flatten_into to go through its
"easy" case, which just does a memcpy and so is not sensitive to
misalignment. To fix (1) we need a previous parameter of odd length,
and to fix (2) we need the plpgsql function to build the array from
parts. The cast to a domain is unnecessary and might indeed be a
third reason why it doesn't work --- I'd not be very surprised if
that results in array flattening.

The attached revised patch contains a test case that demonstrably triggers
the problem on gaur's host. Oddly, I do not get a crash either on a PPC
Mac or a Raspberry Pi 3 running Raspbian. I'm not very sure why; I traced
through things with gdb and it's definitely calling EA_flatten_into with
an odd address and a non-flattened input. I guess both of those platforms
have kernel handlers for misaligned accesses? But the Raspbian box ought
to be nearly the same as chipmunk, which is where we saw the problem to
begin with, so I'm a bit confused.

> Kindly suggest what
> is the best path forward, is it a good idea to commit this on HEAD
> first and see if the test passes on all machines in buildfarm and then
> back-patch it?

No, I think you should just commit and back-patch at the same time.
I don't have any doubt about the code patch being good, it's just
a question of whether the test case reliably triggers the problem.
And in the end that's not that important as long as it does so on
at least one buildfarm member.

regards, tom lane

Attachment Content-Type Size
fix_datum_serialization_v3.patch text/x-diff 4.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-10-01 20:27:19 Re: has_column_privilege behavior (was Re: Assert failed in snprintf.c)
Previous Message Peter Eisentraut 2018-10-01 20:21:31 settings to control SSL/TLS protocol version