Re: jsonb crash

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb crash
Date: 2021-11-11 05:08:29
Message-ID: CAApHDvqBKYdzXuYqi4d8+DqhrJppEa97O9SrHtOp7v29gtYkvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 30 Sept 2021 at 10:54, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Actually, the more I look at this the more unhappy I get, because
> it's becoming clear that you have made unfounded semantic
> assumptions. The hash functions generally only promise that they
> will distinguish values that are distinguishable by the associated
> equality operator. We have plenty of data types in which that does
> not map to bitwise equality ... you need not look further than
> float8 for an example.

I think this part might be best solved by allowing Memoize to work in
a binary mode. We already have datum_image_eq() for performing a
binary comparison on a Datum. We'll also need to supplement that with
a function that generates a hash value based on the binary value too.

If we do that and put Memoize in binary mode when join operators are
not hashable or when we're doing LATERAL joins, I think it should fix
this.

It might be possible to work a bit harder and allow the logical mode
for some LATERAL joins. e.g. something like: SELECT * FROM a, LATERAL
(SELECT * FROM b WHERE a.a = b.b LIMIT 1) b; could use the logical
mode (assuming the join operator is hashable), however, we really only
know the lateral_vars. We don't really collect their context
currently, or the full Expr that they're contained in. That case gets
more complex if the join condition had contained a mix of lateral and
non-lateral vars on one side of the qual, e.g WHERE a.a = b.b + a.z.

Certainly if the lateral part of the query was a function call, then
we'd be forced into binary mode as we'd have no idea what the function
is doing with the lateral vars being passed to it.

I've my proposed patch.

An easier way out of this would be to disable Memoize for lateral
joins completely and only allow it for normal joins when the join
operators are hashable. I don't want to do this as people are already
seeing good wins in PG14 with Memoize and lateral joins [1]. I think
quite a few people would be upset if we removed that ability.

David

[1] https://twitter.com/RPorsager/status/1455660236375826436

Attachment Content-Type Size
memoize_binary_cache.patch application/octet-stream 31.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2021-11-11 05:09:42 Re: Parallel vacuum workers prevent the oldest xmin from advancing
Previous Message Dian M Fay 2021-11-11 04:58:44 Re: [PATCH] postgres_fdw: suppress explicit casts in text:text comparisons (was: column option to override foreign types)