Re: pg_read_file() with virtual files returns empty string

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: pg_read_file() with virtual files returns empty string
Date: 2020-07-02 21:30:49
Message-ID: cc993759-5360-c61f-a345-02fc99b9fb78@joeconway.com
Lists: pgsql-hackers

On 7/2/20 4:27 PM, Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>> When I saw originally MaxAllocSize - 5 fail I skipped to something smaller by
>> 4096 and it worked. But here I see that the actual max size is MaxAllocSize - 6.
>
> Huh, I wonder why it's not max - 5. Probably not worth worrying about,
> though.

Well this part:

+    rbytes = fread(sbuf.data + sbuf.len, 1,
+                   (size_t) (sbuf.maxlen - sbuf.len - 1), file);

could actually be:

+    rbytes = fread(sbuf.data + sbuf.len, 1,
+                   (size_t) (sbuf.maxlen - sbuf.len), file);

because there is no need to reserve a byte for the trailing null: we are no
longer using appendBinaryStringInfo(), which is where the trailing null gets
written.

With that change (and some elog(NOTICE,...) calls) we have:

select length(pg_read_binary_file('/tmp/rbftest2.bin'));
NOTICE: loop start - buf max len: 1024; buf len 4
NOTICE: loop end - buf max len: 8192; buf len 8192
NOTICE: loop start - buf max len: 8192; buf len 8192
NOTICE: loop end - buf max len: 16384; buf len 16384
NOTICE: loop start - buf max len: 16384; buf len 16384
[...]
NOTICE: loop end - buf max len: 536870912; buf len 536870912
NOTICE: loop start - buf max len: 536870912; buf len 536870912
NOTICE: loop end - buf max len: 1073741823; buf len 1073741822
length
------------
1073741818
(1 row)

Or max - 5, so we got our byte back :-)

In fact, in principle there is no reason we can't get to max - 4 with this code,
except that when the file size is exactly 1073741819, we need to try to read one
more byte to find the EOF, the way I did in my patch. I.e.:

-- use 1073741819 byte file
select length(pg_read_binary_file('/tmp/rbftest1.bin'));
NOTICE: loop start - buf max len: 1024; buf len 4
NOTICE: loop end - buf max len: 8192; buf len 8192
NOTICE: loop start - buf max len: 8192; buf len 8192
NOTICE: loop end - buf max len: 16384; buf len 16384
NOTICE: loop start - buf max len: 16384; buf len 16384
[...]
NOTICE: loop end - buf max len: 536870912; buf len 536870912
NOTICE: loop start - buf max len: 536870912; buf len 536870912
NOTICE: loop end - buf max len: 1073741823; buf len 1073741823
NOTICE: loop start - buf max len: 1073741823; buf len 1073741823
ERROR: requested length too large

Because we read the last byte, but not beyond, EOF is not reached, so on the
next loop iteration we continue and fail on max size rather than exit the loop.

But I am guessing that test in particular was what you thought too complicated
for what it accomplishes?

Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
