Re: Making type Datum be 8 bytes everywhere

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Joe Conway <mail(at)joeconway(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Making type Datum be 8 bytes everywhere
Date: 2025-09-11 12:43:35
Message-ID: CAEudQAqD2-Bva+ATua0q+Y_BBa64taiMLcOAKutrd6EsVPNxKQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 10, 2025 at 17:35, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Tomas Vondra <tomas(at)vondra(dot)me> writes:
> > While testing a different patch, I tried running with address sanitizer
> > on rpi5, running the 32-bit OS (which AFAIK is 64-bit kernel and 32-bit
> > user space). With that, stats_ext regression tests fail like this:
>
> > extended_stats.c:1082:27: runtime error: store to misaligned address
> > 0x036671dc for type 'Datum', which requires 8 byte alignment
> > 0x036671dc: note: pointer points here
> > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 7e
> > 7f 08 00 00 00 7f 7f 7f 7f
> > ^
>
> > This happens because build_sorted_items() does palloc(), and then
> > accesses the pointer as array of structs, with a Datum field. And it
> > apparently expects the pointer to be a multiple of 8 bytes. Isn't that a
> > bit strange, with 32-bit user space? The pointer is indeed a multiple of
> > 4B, so maybe the expected alignment is wrong?
>
> I think build_sorted_items is plainly at fault here, where it does
>
> /* Compute the total amount of memory we need (both items and values). */
> len = data->numrows * sizeof(SortItem) + nvalues * (sizeof(Datum) + sizeof(bool));
>
> /* Allocate the memory and split it into the pieces. */
> ptr = palloc0(len);
>
> /* items to sort */
> items = (SortItem *) ptr;
> ptr += data->numrows * sizeof(SortItem);
>
> /* values and null flags */
> values = (Datum *) ptr;
> ptr += nvalues * sizeof(Datum);
>
> This is silently assuming that sizeof(SortItem) is a multiple of
> alignof(Datum), which on a 32-bit-pointer platform is not true
> any longer. We ought to MAXALIGN the two occurrences of
> data->numrows * sizeof(SortItem).
>
Might there be two more instances of this pattern?

1. Function ndistinct_for_combination (src/backend/statistics/mvdistinct.c)
- items = (SortItem *) palloc(numrows * sizeof(SortItem));
+ items = (SortItem *) palloc(MAXALIGN(numrows * sizeof(SortItem)));

2. Function build_distinct_groups (src/backend/statistics/mcv.c)
- SortItem *groups = (SortItem *) palloc(ngroups * sizeof(SortItem));
+ SortItem *groups = (SortItem *) palloc(MAXALIGN(ngroups * sizeof(SortItem)));
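
To make the arithmetic in Tom's explanation concrete, here is a minimal standalone sketch. It is not PostgreSQL code. FakeSortItem is a hypothetical 12-byte stand-in for SortItem as it would be laid out with 4-byte pointers, and Datum, MAXIMUM_ALIGNOF and MAXALIGN are simplified stand-ins assumed for illustration. The helper names (values_offset_unpadded, values_offset_padded, total_len_padded, carve_and_check) are made up as well. It only covers the case Tom describes, where one allocation is split into an item array followed by a Datum array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed stand-ins, not the real PostgreSQL definitions. */
typedef uint64_t Datum;                 /* 8 bytes everywhere */
#define MAXIMUM_ALIGNOF 8
#define MAXALIGN(LEN) \
    (((uintptr_t) (LEN) + (MAXIMUM_ALIGNOF - 1)) & \
     ~((uintptr_t) (MAXIMUM_ALIGNOF - 1)))

/* Hypothetical item: two 4-byte "pointers" plus an int, 12 bytes total. */
typedef struct FakeSortItem
{
    uint32_t    values;
    uint32_t    isnull;
    int32_t     count;
} FakeSortItem;

/* Offset of the Datum array when it directly follows the items. */
static inline size_t
values_offset_unpadded(size_t numrows)
{
    return numrows * sizeof(FakeSortItem);
}

/* Same offset, rounded up so the Datum array starts 8-byte aligned. */
static inline size_t
values_offset_padded(size_t numrows)
{
    return (size_t) MAXALIGN(numrows * sizeof(FakeSortItem));
}

/* Total request size; the padding must be counted here as well. */
static inline size_t
total_len_padded(size_t numrows, size_t nvalues)
{
    return values_offset_padded(numrows) +
        nvalues * (sizeof(Datum) + sizeof(bool));
}

/* Allocate, carve out the Datum array, and report whether it is aligned. */
static inline bool
carve_and_check(size_t numrows, size_t nvalues)
{
    char       *buf;
    Datum      *values;
    bool        aligned;

    assert(sizeof(FakeSortItem) == 12); /* sanity check on the stand-in */

    buf = calloc(1, total_len_padded(numrows, nvalues));
    if (buf == NULL)
        return false;

    values = (Datum *) (buf + values_offset_padded(numrows));
    aligned = ((uintptr_t) values % MAXIMUM_ALIGNOF) == 0;
    if (aligned && nvalues > 0)
        values[0] = 42;                 /* store is safe once aligned */

    free(buf);
    return aligned;
}
```

With an odd number of rows the unpadded offset (12, 36, ...) is not a multiple of 8, which matches the sanitizer report. The padded offset always is, provided the allocator returns a buffer that is itself at least 8-byte aligned.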

best regards,
Ranier Vilela
