Re: check for null value before looking up the hash function

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: check for null value before looking up the hash function
Date: 2022-05-21 17:03:47
Message-ID: CAEudQAru+FgryZ05Zes0YpLAC-4Yt4W8C-vGqf2C=Tjmu8G_oA@mail.gmail.com
Lists: pgsql-hackers

On Sat, May 21, 2022 at 13:13, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> writes:
> > On Sat, May 21, 2022 at 12:05, Tomas Vondra <
> > tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >> That's a quite bold claim, and yet you haven't supported it by any
> >> argument whatsoever. Trade-offs between complexity and efficiency are a
> >> crucial development task, so complicating the logic clearly does matter.
>
> > What I meant is that complicating the logic in search of efficiency is
> > worth it, and that's what everyone is looking for in this thread.
> > Likewise, not complicating the logic, losing a little bit of efficiency,
> > applied to all the code, leads to a big loss of efficiency.
> > In other words, I never miss an opportunity to gain efficiency.
>
> [ shrug... ] You quietly ignored Tomas' main point, which is that no
> evidence has been provided that there's actually any efficiency gain.
>
IMHO, the point here is that for non-committers everything needs benchmarks.
But I have seen many commits without benchmarks or any other evidence of gains.
Of course, having benchmarks is better, but for micro-optimizations
it doesn't seem to be needed that much.

> (1) Sure, in the case where only null values are encountered during a
> query, we can save a cache lookup, but will that be even micro-measurable
> compared to general query overhead? Seems unlikely, especially if this is
> changed in only one place. That ties into my complaint about how this is
> just one instance of a fairly widely used coding pattern.
>
Of course, changing only one place the gain is tiny, but, step by step,
the coding pattern changes too, and the new pattern becomes the "fairly
widely used" one.
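
Just to make the pattern concrete, here is a minimal standalone sketch (not
the actual code from the patch; ColumnState, lookup_hash_proc,
hash_one_current and hash_one_proposed are made-up names standing in for the
typcache lookup and per-value hashing) of the two orderings being discussed:

#include <stdbool.h>
#include <stdint.h>

typedef uint32_t (*hash_fn) (uintptr_t value);

typedef struct ColumnState
{
    hash_fn     hash_proc;      /* resolved lazily; NULL until looked up */
} ColumnState;

static uint32_t
dummy_hash(uintptr_t value)
{
    return (uint32_t) (value * 2654435761u);    /* stand-in hash function */
}

static hash_fn
lookup_hash_proc(void)          /* stand-in for the cached catalog lookup */
{
    return dummy_hash;
}

/* Existing ordering: resolve the hash function before testing for null. */
uint32_t
hash_one_current(ColumnState *cs, uintptr_t value, bool isnull)
{
    if (cs->hash_proc == NULL)
        cs->hash_proc = lookup_hash_proc();     /* runs even for nulls */
    if (isnull)
        return 0;               /* nulls contribute nothing to the hash */
    return cs->hash_proc(value);
}

/* Proposed ordering: skip the lookup entirely while only nulls are seen. */
uint32_t
hash_one_proposed(ColumnState *cs, uintptr_t value, bool isnull)
{
    if (isnull)
        return 0;
    if (cs->hash_proc == NULL)
        cs->hash_proc = lookup_hash_proc();     /* deferred to first non-null */
    return cs->hash_proc(value);
}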

>
> (2) What are the effects when we *do* eventually encounter a non-null
> value? The existing coding will perform all the necessary lookups
> at first call, but with the proposed change those may be spread across
> query execution. It's not implausible that that leads to a net loss
> of efficiency, due to locality-of-access effects.
>
Well, the current code tests the branch for nulls first.
Most of the time that test is false,
so the current code has poor branch prediction, at least.
What I proposed improves the branch prediction, at least.
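
To show where that plays out per row, a hypothetical driving loop (again,
none of these names come from the real code) could look roughly like this;
the comments mark where the cache lookup fires under each ordering and where
the null test sits:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*hash_fn) (uintptr_t);

static uint32_t
stub_hash(uintptr_t v)
{
    return (uint32_t) v;        /* placeholder hash */
}

static hash_fn
stub_lookup(void)               /* stand-in for the cache lookup */
{
    return stub_hash;
}

uint32_t
hash_rows(const uintptr_t *vals, const bool *nulls, size_t nrows)
{
    hash_fn     hash_proc = NULL;   /* resolved lazily, as proposed */
    uint32_t    h = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        /*
         * With the existing ordering, the lookup would sit here, before
         * the null test, and therefore always fire on the very first row.
         */
        if (nulls[i])
            continue;           /* rarely taken when data is mostly non-null */

        /*
         * With the proposed ordering, the lookup fires at the first
         * non-null row, which can be arbitrarily late in the scan (the
         * locality concern above), and never fires for all-null input.
         */
        if (hash_proc == NULL)
            hash_proc = stub_lookup();

        h ^= hash_proc(vals[i]);
    }
    return h;
}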

> I'm also concerned that this increases the size of the "state space"
> of this function, in that there are now more possible states of its
> cached information. While that probably doesn't add any new bugs,
> it does add complication and make things harder to reason about.
> So the bottom line remains that I don't think it's worth changing.
>
Of course, your arguments are all valid.
That would all be clarified with benchmarks; maybe the OP is interested in
doing them.

regards,
Ranier Vilela
