Re: [dynahash] do not refill the hashkey after hash_search

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [dynahash] do not refill the hashkey after hash_search
Date: 2023-09-13 08:46:31
Message-ID: CAEG8a3KEO_Kdt2Y5hFNWMEX3DpCXi9jtZOJY-GFUEE9QLgF+bw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 13, 2023 at 4:22 PM John Naylor
<john(dot)naylor(at)enterprisedb(dot)com> wrote:
>
>
> On Wed, Sep 13, 2023 at 1:47 PM Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:
> >
> > Hi hackers,
> >
> > It's not necessary to fill the key field for most cases, since
> > hash_search has already done that for you. For developer that
> > using memset to zero the entry structure after enter it, fill the
> > key field is a must, but IMHO that is not good coding style, we
> > really should not touch the key field after insert it into the
> > dynahash.
>
> - memset(part_entry, 0, sizeof(LogicalRepPartMapEntry));
> - part_entry->partoid = partOid;
> + Assert(part_entry->partoid == partOid);
> + memset(entry, 0, sizeof(LogicalRepRelMapEntry));
>
> This is making an assumption that the non-key part of LogicalRepPartMapEntry will never get new members. Without knowing much about this code, it seems like a risk in the abstract.

What do you mean by 'the non-key part of LogicalRepPartMapEntry will
never get new members'?

typedef struct LogicalRepPartMapEntry
{
Oid partoid; /* LogicalRepPartMap's key */
LogicalRepRelMapEntry relmapentry;
} LogicalRepPartMapEntry;

partoid has already been filled by hash_search with HASH_ENTER action,
so I think the
above code should have the same effects.

>
> > This patch fixed some most abnormal ones, instead of refilling the
> > key field of primitive types, adding some assert might be a better
> > choice.
>
> Taking a quick look, I didn't happen to see any existing asserts of this sort, so the patch doesn't seem to be making things more "normal". I did see a few instances of /* hash_search already filled in the key */, so if we do anything at all here, we might prefer that.

There are some code using assert for this sort, for example in
*ReorderBufferToastAppendChunk*:

```
ent = (ReorderBufferToastEnt *)
hash_search(txn->toast_hash, &chunk_id, HASH_ENTER, &found);

if (!found)
{
Assert(ent->chunk_id == chunk_id); <------- this
line, by Robert Haas
ent->num_chunks = 0;
ent->last_chunk_seq = 0;
```

and in *rebuild_database_list*, tom commented that the key has already
been filled, which I think
he was trying to tell people no need to assign the key again.

```
/* we assume it isn't found because the hash was just created */
db = hash_search(dbhash, &newdb, HASH_ENTER, NULL);

/* hash_search already filled in the key */ <------- this
line, by Tom Lane
db->adl_score = score++;
/* next_worker is filled in later */
```

>
> - hash_search(uncommitted_enums, serialized++, HASH_ENTER, NULL);
> + (void) hash_search(uncommitted_enums, serialized++, HASH_ENTER, NULL);
>
> I prefer explicit (void) for new code, but others may disagree. I don't think we have a preferred style for this, so changing current usage will just cause unnecessary code churn.
>

What I am concerned about is that if we change the key after
hash_search with HASH_ENTER action, there
are chances that if we assign a wrong value, it will be impossible to
match that entry again.

> --
> John Naylor
> EDB: http://www.enterprisedb.com

--
Regards
Junwang Zhao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2023-09-13 08:55:23 Re: Cleaning up array_in()
Previous Message John Naylor 2023-09-13 08:22:29 Re: [dynahash] do not refill the hashkey after hash_search