Re: SerializeParamList vs machines with strict alignment

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, hlinnaka <hlinnaka(at)iki(dot)fi>
Subject: Re: SerializeParamList vs machines with strict alignment
Date: 2018-09-10 13:35:00
Message-ID: 8352.1536586500@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Mon, Sep 10, 2018 at 8:58 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> In particular, SerializeParamList does this:
>>
>> /* Write flags. */
>> memcpy(*start_address, &prm->pflags, sizeof(uint16));
>> *start_address += sizeof(uint16);
>>
>> immediately followed by this:
>>
>> datumSerialize(prm->value, prm->isnull, typByVal, typLen,
>> start_address);

> datumSerialize does this:

> memcpy(*start_address, &header, sizeof(int));
> *start_address += sizeof(int);

> before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned.

But that doesn't undo the fact that you're now on an odd halfword
boundary. In the case I observed, EA_flatten_into was trying to
store an int32 through a pointer whose hex value ended in E, which
is explained by the "+= sizeof(uint16)".

> Yeah, I think as suggested by you, start_address should be maxaligned.

A localized fix would be for datumSerialize to temporarily palloc the
space for EOH_flatten_into to write into, and then it could memcpy
that to whatever random address it'd been passed. Seems kind of
inefficient though, especially since you'd likely have to do the same
thing on the receiving side.

A larger issue is that even on machines where misalignment is permitted,
memcpy'ing to/from misaligned addresses is rather inefficient.

I think it'd be better to redesign the whole thing so that every field
is maxaligned, and you can e.g. just store and retrieve integer fields
as integers without a memcpy. The "wasted" padding space doesn't seem
very significant given the short-lived nature of the serialized data.

However, that's not exactly a trivial change. Probably the best way
forward is to adopt the extra-palloc solution as a back-patchable
fix, and then work on a more performant solution in HEAD only.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-10 13:55:04 Re: Latest HEAD fails to build
Previous Message Adrien NAYRAT 2018-09-10 13:17:51 Re: Standby reads fail when autovacuum take AEL during truncation