Re: Hybrid Hash/Nested Loop joins and caching results from subplans

From: Andres Freund <andres(at)anarazel(dot)de>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans
Date: 2020-08-05 17:44:31
Message-ID: 20200805174431.sdfyaw4cuoyvngem@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-08-04 10:05:25 +1200, David Rowley wrote:
> I'd like to push the 0002 patch quite soon as I think it's an
> improvement to simplehash.h regardless of if we get Result Cache. It
> reuses the SH_LOOKUP function for deletes. Also, if we ever get around
> to giving up performing a lookup if we get too far away from the
> optimal bucket, then that would only need to appear in one location
> rather than in two.

> Andres, or anyone, any objections to me pushing 0002?

I think it'd be good to add a warning that, unless one is very careful,
no other hashtable modifications are allowed between lookup and
modification. E.g. something like
a = foobar_lookup();foobar_insert();foobar_delete();
will occasionally go wrong...

> - /* TODO: return false; if distance too big */
> +/*
> + * Perform hash table lookup on 'key', delete the entry belonging to it and
> + * return true. Returns false if no item could be found relating to 'key'.
> + */
> +SH_SCOPE bool
> +SH_DELETE(SH_TYPE * tb, SH_KEY_TYPE key)
> +{
> + SH_ELEMENT_TYPE *entry = SH_LOOKUP(tb, key);
>
> - curelem = SH_NEXT(tb, curelem, startelem);
> + if (likely(entry != NULL))
> + {
> + /*
> + * Perform deletion and also the relocation of subsequent items which
> + * are not in their optimal position but can now be moved up.
> + */
> + SH_DELETE_ITEM(tb, entry);
> + return true;
> }
> +
> + return false; /* Can't find 'key' */
> }

You meantioned on IM that there's a slowdowns with gcc. I wonder if this
could partially be responsible. Does SH_DELETE inline LOOKUP and
DELETE_ITEM? And does the generated code end up reloading entry-> or
tb-> members?

When the SH_SCOPE isn't static *, then IIRC gcc on unixes can't rely on
the called function actually being the function defined in the same
translation unit (unless -fno-semantic-interposition is specified).

Hm, but you said that this happens in tidbitmap.c, and there all
referenced functions are local statics. So that's not quite the
explanation I was thinking it was...

Hm. Also wonder whether we currently (i.e. the existing code) we
unnecessarily end up reloading tb->data a bunch of times, because we do
the access to ->data as
SH_ELEMENT_TYPE *entry = &tb->data[curelem];

Think we should instead store tb->data in a local variable.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-08-05 17:53:03 Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)
Previous Message Tom Lane 2020-08-05 17:30:30 Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)