Re: SQL-level pg_datum_image_equal

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: SQL-level pg_datum_image_equal
Date: 2026-03-27 14:51:35
Message-ID: CAEze2WjF9M8TFG3q1z80bvN0_CijHSGF_NfEhvjqjbMnOOTJGg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 26 Mar 2026 at 18:17, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> writes:
> > On Wed, 25 Mar 2026 at 22:51, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> >> You lost me at this part. How does marking the function as STABLE
> >> prevent users from persisting things on disk based on the return value
> >> of the function? I expected the primary use case for this would be in
> >> trigger functions that make decisions about data that goes into
> >> tables.
>
> > Indexes and stored generated columns' expression may only contain
> > IMMUTABLE functions, so that they don't change output when the inputs
> > values are unchanged. As the current datum_image_equal depends on the
> > volatile contents/definition of sign-extended bytes (which we clearly
> > don't have a defined/expected value for) that makes the output of this
> > function not immutable for the "same" input values.
>
> This seems to me to be a rather creative misinterpretation of what
> STABLE and IMMUTABLE mean. If you want to claim that IMMUTABLE means
> that, then the function isn't STABLE either, since it could give
> different results for the "same" input values within one query.
> Moreover, switching from IMMUTABLE to STABLE wouldn't fix the
> problem of users assuming more than they should.

Yeah, that's fair.

> The actual problem here is that datum_image_eq is assuming more
> than it should about the contents of a pass-by-value Datum.
> That was okay for its original use-cases because a false not-equal
> report would just end in not applying some optimization. But
> Memoize thinks that the answers are exact, and users would too
> if we expose the function at SQL level.
>
> I think what David proposed at
> <CAApHDvreF-UiqBaHtRTQWQ6z1X9snstJW+dfb2DU5GOb-uPEBg(at)mail(dot)gmail(dot)com>
> is not a hack, but in fact correcting datum_image_eq/datum_image_hash
> to not assume that unspecified bits are reliably the same.

Agreed.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Sami Imseih 2026-03-27 14:53:36 Re: Clean up NamedLWLockTranche stuff
Previous Message Peter Eisentraut 2026-03-27 14:29:00 Re: Thread-safe getopt()