Re: Perform COPY FROM encoding conversions in larger chunks

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Perform COPY FROM encoding conversions in larger chunks
Date: 2021-03-18 18:05:32
Message-ID: CAFBsxsEybzagsrmuoLsKYx417Sce9cgnM91nf8f9HKGLadixPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:

> I went ahead and rebased these.

It looks like FreeBSD doesn't like this for some reason.

I also wanted to see if this patch set had any performance effect, with and
without changing how UTF-8 is validated, using the blackhole am from
https://github.com/michaelpq/pg_plugins/tree/master/blackhole_am.

create extension blackhole_am;
create table blackhole_tab (a text) using blackhole_am ;
time ./inst/bin/psql -c "copy blackhole_tab from '/path/to/test-copy.txt'"

....where copy-test.txt is made by

for i in {1..100}; do cat UTF-8-Sampler.htm >> test-copy.txt ; done;

On Linux x86-64, gcc 8.4, I get these numbers (minimum of five runs):

master:
109ms

v6 do encoding in larger chunks:
109ms

v7 utf8 SIMD:
98ms

--
John Naylor
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-03-18 18:05:52 Re: DISTINCT term in aggregate function
Previous Message Pantelis Theodosiou 2021-03-18 18:03:03 DISTINCT term in aggregate function