| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | vignesh C <vignesh21(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Parallel copy |
| Date: | 2020-10-30 16:52:37 |
| Message-ID: | 9e7e11da-3de0-54d2-5646-5d56e956801f@iki.fi |
| Lists: | pgsql-hackers |
On 30/10/2020 18:36, Heikki Linnakangas wrote:
> Whether the leader process finds the EOLs or the worker processes, it's
> pretty clear that it needs to be done ASAP, for a chunk at a time,
> because that cannot be done in parallel. I think some refactoring in
> CopyReadLine() and friends would be in order. It probably would be
> faster, or at least not slower, to find all the EOLs in a block in one
> tight loop, even when parallel copy is not used.
Something like the attached. It passes the regression tests, but it's
quite incomplete. It's missing handling of "\." as end-of-file marker,
and I haven't tested encoding conversions at all, for starters. Quick
testing suggests that this is a little bit faster than the current code,
but the difference is small; I had to use a "WHERE false" option to
really see the difference.
The crucial thing here is that there's a new function, ParseLinesText(),
to find all end-of-line characters in a buffer in one go. In this patch,
it's used against 'raw_buf', but with parallel copy, you could point it
at a block in shared memory instead.
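To illustrate the idea of scanning a whole buffer for line endings in one tight loop, here is a minimal sketch. It is not the ParseLinesText() from the attached patch: the function name find_eols_sketch, its signature, and the offsets array are hypothetical. It assumes text-format input with '\n' line endings only, treats a backslash as quoting the following byte, and ignores "\.", '\r', encoding conversion, and a backslash that falls on the last byte of the buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Records the offset of every unescaped '\n' in buf[0..len) into
 * eol_offsets (at most max_eols entries) and returns how many were stored.
 * A backslash quotes the byte after it, so a backslash followed by a
 * newline is data, not a line ending.
 */
static size_t
find_eols_sketch(const char *buf, size_t len,
                 size_t *eol_offsets, size_t max_eols)
{
    size_t      n = 0;
    size_t      i = 0;

    while (i < len && n < max_eols)
    {
        char        c = buf[i];

        if (c == '\\')
        {
            /* skip the backslash and the byte it quotes */
            i += 2;
            continue;
        }
        if (c == '\n')
            eol_offsets[n++] = i;
        i++;
    }
    return n;
}
```

The caller only passes a pointer and a length, so the same loop could presumably be run over 'raw_buf' or over a block in shared memory. A real version would also need to carry state across block boundaries.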
- Heikki
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-WIP-Find-all-line-endings-in-COPY-in-chunks.patch | text/x-patch | 38.8 KB |