Re: pg_read_file() with virtual files returns empty string

From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: pg_read_file() with virtual files returns empty string
Date: 2020-07-01 21:43:37
Message-ID: 79fec3f2-7635-d936-596a-12518d5fecfe@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/1/20 5:17 PM, Joe Conway wrote:
> On 7/1/20 4:12 PM, Tom Lane wrote:
>> Joe Conway <mail(at)joeconway(dot)com> writes:
>>> I did some performance testing of the worst case/largest possible file and found
>>> that skipping the stat and bulk read does cause a significant regression.
>>
>> Yeah, I was wondering a little bit if that'd be an issue.
>>
>>> In the attached patch I was able to get most of the performance degradation back
>>> -- ~600ms. Hopefully you don't think what I did was "too cute by half" :-). Do
>>> you think this is good enough or should we go back to using the stat file size
>>> when it is > 0?
>>
>> I don't think it's unreasonable to "get in bed" with the innards of the
>> StringInfo; plenty of other places do already, such as pqformat.h or
>> pgp_armor_decode, just to name the first couple that I came across in a
>> quick grep.
>>
>> However, if we're going to get in bed with it, let's get all the way in
>> and just read directly into the StringInfo's buffer, as per attached.
>> This saves all the extra memcpy'ing and reduces the number of fread calls
>> to at most log(N).
>
> Works for me. I'll retest to see how well it does performance-wise and report back.

A quick test shows that this gets performance back on par with HEAD.

The only downside is that the max filesize is reduced to (MaxAllocSize -
MIN_READ_SIZE - 1) compared to MaxAllocSize with the old method.

But anyone pushing that size limit is going to run into other issues anyway. I.e
(on pg11):

8<---------------
select length(pg_read_binary_file('/tmp/rbftest4.bin'));
length

------------
1073737726
(1 row)

select pg_read_binary_file('/tmp/rbftest4.bin');
ERROR: invalid memory alloc request size 2147475455
8<---------------

So probably not worth worrying about.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2020-07-01 21:44:46 Re: Remove Deprecated Exclusive Backup Mode
Previous Message Sanaba, Bilva 2020-07-01 21:41:12 Adding Support for Copy callback functionality on COPY TO api