From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Perform COPY FROM encoding conversions in larger chunks |
Date: | 2021-04-01 08:09:02 |
Message-ID: | 998bc295-874a-97a1-3291-7747e73434bd@iki.fi |
Lists: | pgsql-hackers |
On 18/03/2021 20:05, John Naylor wrote:
> I wrote:
>
> > I went ahead and rebased these.
Thanks!
> I also wanted to see if this patch set had any performance effect, with
> and without changing how UTF-8 is validated, using the blackhole am from
> https://github.com/michaelpq/pg_plugins/tree/master/blackhole_am.
>
> create extension blackhole_am;
> create table blackhole_tab (a text) using blackhole_am ;
> time ./inst/bin/psql -c "copy blackhole_tab from '/path/to/test-copy.txt'"
>
> ...where test-copy.txt is made by
>
> for i in {1..100}; do cat UTF-8-Sampler.htm >> test-copy.txt ; done;
>
> On Linux x86-64, gcc 8.4, I get these numbers (minimum of five runs):
>
> master:
> 109ms
>
> v6 do encoding in larger chunks:
> 109ms
>
> v7 utf8 SIMD:
> 98ms
That's disappointing. Perhaps the file is just too small to see the
effect? When I run a test like that multiple times on my laptop, the
results vary between 40 ms and 75 ms. I used "WHERE false" instead of
the blackhole AM, but I don't think that makes much difference. Only a
few runs are shown here for brevity:
for i in {1..100}; do cat /tmp/utf8.html >> /tmp/test-copy.txt ; done;
postgres=# create table blackhole_tab (a text) ;
CREATE TABLE
postgres=# \timing
Timing is on.
postgres=# copy blackhole_tab from '/tmp/test-copy.txt' where false;
COPY 0
Time: 53.166 ms
postgres=# copy blackhole_tab from '/tmp/test-copy.txt' where false;
COPY 0
Time: 43.981 ms
postgres=# copy blackhole_tab from '/tmp/test-copy.txt' where false;
COPY 0
Time: 71.850 ms
postgres=# copy blackhole_tab from '/tmp/test-copy.txt' where false;
COPY 0
...
I tested that with a larger file:
for i in {1..10000}; do cat /tmp/utf8.html >> /tmp/test-copy.txt ; done;
postgres=# copy blackhole_tab from '/tmp/test-copy.txt' where false;
v6 do encoding in larger chunks (best of five):
Time: 3955.514 ms (00:03.956)
master (best of five):
Time: 4133.767 ms (00:04.134)
So with that, I'm seeing a measurable difference: v6 is about 4% faster
than master (3956 ms vs. 4134 ms).
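For anyone repeating the comparison, the best-of-five reduction and the relative difference can be computed with a few lines of shell. This is only a minimal sketch, not part of the patch set: the function names extract_ms, best_of and speedup_pct are made up for illustration, and it assumes the timings come from psql with \timing on, in the "Time: 53.166 ms" format shown above.

```shell
#!/bin/sh
# Hypothetical helpers for summarizing psql "\timing" output.
# The names are illustrative only; nothing here comes from the patch.

# extract_ms: read psql output on stdin and print the millisecond value
# from every line of the form "Time: 53.166 ms".
extract_ms() {
    LC_ALL=C awk '$1 == "Time:" { print $2 }'
}

# best_of: print the smallest of the timings given as arguments.
best_of() {
    printf '%s\n' "$@" | LC_ALL=C sort -n | head -n 1
}

# speedup_pct BASE NEW: percentage by which NEW is faster than BASE.
speedup_pct() {
    LC_ALL=C awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

# Example with the best-of-five numbers reported above (master vs. v6):
speedup_pct 4133.767 3955.514
```

With the numbers from the larger file, the last line prints 4.3. The captured output of a psql session can be piped through extract_ms, and the resulting values passed to best_of, to get the per-branch minimum.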
- Heikki