Re: Inconvenience of pg_read_binary_file()

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconvenience of pg_read_binary_file()
Date: 2022-07-29 07:21:45
Message-ID: 20220729.162145.1850303588308173872.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for taking a look!

At Thu, 28 Jul 2022 16:22:17 -0400, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote in
> Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> > - Simplified the implementation (by complexifying argument handling..).
> > - REVOKEd EXECUTE from the new functions.
> > - Edited the signature of the two functions.
>
> >> - pg_read_file ( filename text [, offset bigint, length bigint [, missing_ok boolean ]] ) → text
> >> + pg_read_file ( filename text [, offset bigint, length bigint ] [, missing_ok boolean ] ) → text
>
> I'm okay with allowing this variant of the functions. Since there's
> no implicit cast between bigint and bool, plus the fact that you
> can't give just offset without length, there shouldn't be much risk
> of confusion as to which variant to invoke.

Grad to hear that.

> I don't really like the implementation style though. That mess of
> PG_NARGS tests is illegible code already and this makes it worse.

Ah..., I have to admit that I faintly felt that feeling while on it...

> I think it'd be way cleaner to have all the PG_GETARG calls in the
> bottom SQL-callable functions (which are already one-per-signature)
> and then pass them on to a common function that has an ordinary C
> call signature, along the lines of
>
> static Datum
> pg_read_file_common(text *filename_t,
> int64 seek_offset, int64 bytes_to_read,
> bool read_to_eof, bool missing_ok)
> {
> if (read_to_eof)
> bytes_to_read = -1; // or just Assert that it's -1 ?

I prefer assertion since that parameter cannot be passed by users.

> else if (bytes_to_read < 0)
> ereport(...);
> ...
> }

This function cannot return NULL directly. Without the ability to
return NULL, it is pointless for the function to return Datum. In the
attached the function returns text*.

> Datum
> pg_read_file_off_len(PG_FUNCTION_ARGS)
> {
> text *filename_t = PG_GETARG_TEXT_PP(0);
> int64 seek_offset = PG_GETARG_INT64(1);
> int64 bytes_to_read = PG_GETARG_INT64(2);
>
> return pg_read_file_common(filename_t, seek_offset, bytes_to_read,
> false, false);

As the result this function need to return NULL or TEXT_P according to
the returned value from pg_read_file_common.

+ if (!ret)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(ret);
> }

# I'm tempted to call read_text_file() directly from each SQL functions..

Please find the attached. I added some regression tests for both
pg_read_file() and pg_read_binary_file().

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
v4-0001-Add-an-argument-variation-to-pg_read_file.patch text/x-patch 17.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2022-07-29 08:11:31 Re: POC: Lock updated tuples in tuple_update() and tuple_delete()
Previous Message Amit Kapila 2022-07-29 06:45:01 Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns