AW: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode

From: Hans Buschmann <buschmann(at)nidsa(dot)net>
To: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: AW: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode
Date: 2021-08-16 16:19:39
Message-ID: 1629130786217.38835@nidsa.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


________________________________
Von: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
Gesendet: Montag, 16. August 2021 17:04
An: Hans Buschmann
Cc: pgsql-hackers(at)postgresql(dot)org
Betreff: Re: PG14: Avoid checking output-buffer-length for every encoded byte during pg_hex_encode

Hello Ranier,

Thank you for your quick response.

>Is always good to remove immutables from loops, if safe and secure.

I think it's worse because a loop changed variable is involved in the test, so it seems to be not immutable.

>Decode has regression too.

good catch, I overlooked it.

>I think that we can take one more step here.
>pg_hex_decode can be improved too.

+1

By looking a bit more precisely I detected the same checks in common/base64.c also involving loop-changed variables. Could you please make the same changes to base64.c?

>We can remove limitations from all functions that use hex functions.
>byteaout from (varlena.c) has an artificial limitation to handle only MaxAllocSize length, but
>nothing prevents her from being prepared for that limitation to be removed one day.

+1, but I don't know all implications of the type change to size_t

>Please, take a look at the new version attached.

Two remarks for decoding (by eyeball inspection of patch file only because of still struggling with git etc.):

1. The odd-number-of-digits-check for decoding is still part of the loop. It should be before the loop and called only once (by testing for an even number of srclen)
So we can eliminate the block if (s == srcend)? {} inside the loop!

2. I noticed that the error messages for hex_decode should be changed as

s/encoding/decoding/

(as in pg_log_fatal("overflow of destination buffer in hex encoding");)

>If possible can you review the tests if there is an overflow at
>pg_hex_encode and pg_hex_decode functions?

I would prefer to wait for another patch version from you (my development troubles), perhaps also dealing with base64.c

I don't know how and where any call to the encoding/decoding functions creates an overflow situation in normal operation.

I will nonceless try the tests but I don't have any experience with it.

BTW the root cause for my work is an attempt to vastly accelerate these functions (hex and base64 encode/decode), but this is left for another day (not finished yet) and certainly only on master/new development.

Lateron support on this topic would be highly appreciated...

Best regards,
Hans Buschmann

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Álvaro Herrera 2021-08-16 17:00:00 Re: Autovacuum on partitioned table (autoanalyze)
Previous Message Tom Lane 2021-08-16 16:17:58 Re: Autovacuum on partitioned table (autoanalyze)