Quick Links

Re: SQL-level pg_datum_image_equal

From:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To:	jian he <jian(dot)universality(at)gmail(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: SQL-level pg_datum_image_equal
Date:	2025-12-22 15:25:43
Message-ID:	CAEze2Wg4Fj+9LQOv=J5cqwapD8vz3oDh1B0iyX7RwesvseH9gg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, 20 Dec 2025 at 14:15, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Thu, Dec 11, 2025 at 1:46 AM Matthias van de Meent
> <boekewurm+postgres(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
>
> >
>
> > So, attached is a simple and to-the-point patch that adds the function
> > mentioned in $subject, which will tell the user whether two values of
> > the same type have an exactly equal binary representation, using
> > datum_image_eq.
> >
>
> hi.
>
> maybe Table 9.76
> (https://www.postgresql.org/docs/current/functions-info.html#FUNCTIONS-INFO-CATALOG)
> is the right place for this function.

I think table 9.3
(https://www.postgresql.org/docs/18/functions-comparison.html#FUNCTIONS-COMPARISON-FUNC-TABLE)
makes more sense, as this is more a compare function than one that
exposes catalog information about the input.

> corner case confused me, I think this is related to null handling,
> maybe not related to this.
> create type t1 as (a int, b text);
> select pg_datum_image_equal('(,)'::t1, $$(,)$$::t1);
> select pg_datum_image_equal('(,)'::t1, NULL::t1);
> select '(,)'::t1 is null, NULL::t1 is null;

Yes, that's row-type NULL handling for you. '(,)' is a composite value
with only NULL values in the attributes, and SQL defines that rows
with only NULL columns must return True when `IS NULL` evaluates their
NULL-ness. On disk, however, it is still stored as a "composite type;
attributes 'a' and 'b' are NULL"; so that a user that casts the value
to text will get a different result between (NULL::t1::text) and
('(,)'::t1::text), allowing safe round-trip conversions. Also note
that `('(,)'::t1 IS DISTINCT FROM NULL::t1) = TRUE, another curious
consequence of this SQL rule.

So, that output is expected; some methods already expose these
differences between the values, so pg_datum_image_equal() *must* also
indicate they are different. And now we also have one more reason to
have a function that can notice distinctions that go deeper than
surface-level SQL.

Aside: This new function doesn't actually fully cover the spectrum of
possible inequalities detectable through SQL, as there are some very
low level datum introspection tools like pg_column_size() whose output
depends on the type of toasting applied. My function cover that,
because that data should be completely irrelevant to normal data
usage, and the user can combine this manually if they really need it.

> enforce_generic_type_consistency already resolved generic type.

While you are correct to point out that the type system would prevent
this from getting called from SQL without a proper type, I'd like to
keep the check to make sure that callers from outside the type system
don't accidentally fail to provide the function with a correct type.

> so
> + if (!OidIsValid(typ))
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("could not determine type")));
> + }
> this part should be elog(ERROR.....) ?

Is there a policy on what should _not_ use ereport? I know we don't
require ereport for internal errors, but is considered forbidden?

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

In response to

Re: SQL-level pg_datum_image_equal at 2025-12-20 13:14:46 from jian he

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Yugo Nagata	2025-12-22 15:37:05	Re: psql: tab-completion support for COPY ... TO/FROM STDIN, STDOUT, and PROGRAM
Previous Message	Nikita Malakhov	2025-12-22 15:11:20	Re: SQL/JSON json_table plan clause