Re: pg_read_file() with virtual files returns empty string

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: pg_read_file() with virtual files returns empty string
Date: 2020-07-01 20:12:20
Message-ID: 749696.1593634340@sss.pgh.pa.us
Lists: pgsql-hackers

Joe Conway <mail(at)joeconway(dot)com> writes:
> I did some performance testing of the worst case/largest possible file and found
> that skipping the stat and bulk read does cause a significant regression.

Yeah, I was wondering a little bit if that'd be an issue.

> In the attached patch I was able to get most of the performance degradation back
> -- ~600ms. Hopefully you don't think what I did was "too cute by half" :-). Do
> you think this is good enough or should we go back to using the stat file size
> when it is > 0?

I don't think it's unreasonable to "get in bed" with the innards of the
StringInfo; plenty of other places do already, such as pqformat.h or
pgp_armor_decode, just to name the first couple that I came across in a
quick grep.

However, if we're going to get in bed with it, let's get all the way in
and just read directly into the StringInfo's buffer, as per attached.
This saves all the extra memcpy'ing and reduces the number of fread calls
to at most O(log N), since the buffer doubles whenever it is enlarged.
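
For readers without the attachment at hand, here is a minimal standalone sketch of the read-directly-into-the-buffer idea. It is not the attached patch: GrowBuf, growbuf_init, growbuf_enlarge, and growbuf_read_all are hypothetical stand-ins for StringInfoData and enlargeStringInfo, and malloc/realloc stand in for palloc/repalloc. The nreads counter exists only to make the number of fread calls observable.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a StringInfo-like buffer (not PostgreSQL code). */
typedef struct GrowBuf
{
    char   *data;    /* kept NUL-terminated */
    size_t  len;     /* bytes used, excluding the terminator */
    size_t  maxlen;  /* bytes allocated */
    size_t  nreads;  /* number of fread calls made, for illustration */
} GrowBuf;

static void
growbuf_init(GrowBuf *buf)
{
    buf->maxlen = 1024;
    buf->data = malloc(buf->maxlen);
    if (buf->data == NULL)
        abort();
    buf->data[0] = '\0';
    buf->len = 0;
    buf->nreads = 0;
}

/* Make room for at least `needed` more bytes plus the terminator, by doubling. */
static void
growbuf_enlarge(GrowBuf *buf, size_t needed)
{
    size_t  newlen = buf->maxlen;
    char   *p;

    while (newlen - buf->len - 1 < needed)
        newlen *= 2;
    if (newlen == buf->maxlen)
        return;
    p = realloc(buf->data, newlen);
    if (p == NULL)
        abort();
    buf->data = p;          /* the block may have moved */
    buf->maxlen = newlen;
}

/*
 * Read the rest of fp straight into the buffer's free space. There is no
 * intermediate chunk buffer and no memcpy; each fread fills whatever room
 * the last doubling produced, so the call count grows like log2(file size).
 * A short read ends the loop; the caller can check ferror(fp) afterwards.
 */
static void
growbuf_read_all(GrowBuf *buf, FILE *fp)
{
    for (;;)
    {
        size_t  room;
        size_t  got;

        growbuf_enlarge(buf, 1);                 /* doubles only when full */
        room = buf->maxlen - buf->len - 1;
        got = fread(buf->data + buf->len, 1, room, fp);
        buf->nreads++;
        buf->len += got;
        if (got < room)
            break;                               /* EOF or read error */
    }
    buf->data[buf->len] = '\0';
}
```

Because no size is needed up front, a loop of this shape also works for files whose stat size is reported as zero, which is the case this thread is about.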

(This also fixes a bug in your version, which is that it captured
the buf.data pointer before any repalloc that might happen.)
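
The stale-pointer hazard can be illustrated in isolation. The sketch below is an assumption-laden stand-in, not code from either patch version: GrowBuf, growbuf_reserve, and growbuf_append are hypothetical names, and realloc plays the role of repalloc. The point is only the ordering: enlarge first, then derive the write pointer from the current data field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, standing in for a StringInfo-like struct. */
typedef struct GrowBuf
{
    char   *data;
    size_t  len;     /* bytes used, excluding the terminator */
    size_t  maxlen;  /* bytes allocated */
} GrowBuf;

/* Ensure room for `needed` more bytes plus a terminator; may move data. */
static void
growbuf_reserve(GrowBuf *buf, size_t needed)
{
    size_t  newlen = buf->maxlen ? buf->maxlen : 16;
    char   *p;

    while (newlen - buf->len < needed + 1)
        newlen *= 2;
    if (newlen == buf->maxlen)
        return;
    p = realloc(buf->data, newlen);   /* like repalloc: block may move */
    if (p == NULL)
        abort();
    buf->data = p;
    buf->maxlen = newlen;
}

/*
 * BROKEN ordering (shown only as a comment):
 *     char *dst = buf->data + buf->len;   -- pointer captured too early
 *     growbuf_reserve(buf, n);            -- may move buf->data
 *     memcpy(dst, src, n);                -- dst may now dangle
 *
 * Safe ordering: reserve first, then compute the destination.
 */
static void
growbuf_append(GrowBuf *buf, const char *src, size_t n)
{
    growbuf_reserve(buf, n);
    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}
```

Keeping an offset (len) rather than a raw pointer across any call that can reallocate avoids this class of bug entirely.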

regards, tom lane

Attachment: read-virtual-files.04.patch (text/x-diff, 3.9 KB)
