Re: [PATCH] Initial progress reporting for COPY command

From: Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: [PATCH] Initial progress reporting for COPY command
Date: 2020-06-21 11:40:34
Message-ID: CAFp7QwoXHPNC16fzh1OEvSH3rKJe-AptAE3oSjbjMUpF_ugX9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for all comments. I have updated code to support more options
(including STDIN/STDOUT) and added some documentation.

Patch is attached and can be found also at
https://github.com/simi/postgres/pull/5.

Diff version: https://github.com/simi/postgres/pull/5.diff
Patch version: https://github.com/simi/postgres/pull/5.patch

I'm also attaching screenshot of HTML documentation and html documentation
file.

I'll do my best to get this to commitfest now.

ne 14. 6. 2020 v 14:32 odesílatel Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>
napsal:

> Hello, as proposed by Pavel Stěhule and discussed on local czech
> PostgreSQL maillist (
> https://groups.google.com/d/msgid/postgresql-cz/CAFj8pRCZ42CBCa1bPHr7htffSV%2BNAcgcHHG0dVqOog4bsu2LFw%40mail.gmail.com?utm_medium=email&utm_source=footer),
> I have prepared an initial patch for COPY command progress reporting.
>
> Few examples first:
>
> "COPY (SELECT * FROM test) TO '/tmp/ids';"
>
> yr=# SELECT * from pg_stat_progress_copy;
> pid | datid | datname | relid | direction | file | program |
> lines_processed | file_bytes_processed
>
> ---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
> 3347126 | 16384 | yr | 0 | TO | t | f |
> 3529943 | 24906226
> (1 row)
>
> "COPY test FROM '/tmp/ids';
>
> yr=# SELECT * from pg_stat_progress_copy;
> pid | datid | datname | relid | direction | file | program |
> lines_processed | file_bytes_processed
>
> ---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
> 3347126 | 16384 | yr | 16385 | FROM | t | f |
> 121591999 | 957218816
> (1 row)
>
> Columns are inspired by CREATE INDEX progress report system view.
>
> pid - integer - PID of backend
> datid - oid - OID of related database
> datname - name - name of related database (this seems redundant, since oid
> should be enough, but it is the same in CREATE INDEX)
> relid - oid - oid of table related to COPY command, when not known (for
> example when copying to file, it is 0)
> direction - text - one of "FROM" or "TO" depends on command used
> file - bool - is file is used?
> program - bool - is program used?
> lines_processed - bigint - amount of processed lines, works for both
> directions (FROM/TO)
> file_bytes_processed - amount of bytes processed when file is used
> (otherwise 0), works for both direction (
> FROM/TO) when file is used (file = t)
>
> Patch is attached and can be found also at
> https://github.com/simi/postgres/pull/5.
>
> Diff version: https://github.com/simi/postgres/pull/5.diff
> Patch version: https://github.com/simi/postgres/pull/5.patch
>
> I havefew initial notes and questions.
>
> I'm using ftell to get current position in file to populate
> file_bytes_processed without error handling (ftell can return -1L and also
> populate errno on problems).
>
> 1. Is that a good way to get progress of file processing?
> 2. Is it safe in given context to not care about errors? If not, what to
> do on error?
>
> Some columns are not populated on certain COPY commands. For example when
> a file is not used, file_bytes_processed is set to 0. Would it be better to
> use NULL instead when the column is not related to the current command?
> Same problem is for relid column.
>
> I have not found any tests for progress reporting. Are there any? It would
> need two backends running (one running COPY, one checking output of report
> view). Is there any similar test I can inspire at? In theory, it should be
> possible to use dblink_send_query to run async COPY command in the
> background.
>
> My initial (attached) patch also doesn't introduce documentation for this
> system view. I can add that later once this patch is finalized (if that
> happens).
>

Attachment Content-Type Size
progress-reporting.html text/html 45.1 KB
copy-progress-v2.patch text/x-patch 20.7 KB
copy-progress-v2.diff text/x-patch 13.1 KB
copy-progress-docs.jpg image/jpeg 149.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-06-21 13:54:09 Re: Possible NULL pointer deferenced (src/interfaces/libpq/fe-exec.c (line 563)
Previous Message Josef Šimánek 2020-06-21 11:34:29 Re: [PATCH] Initial progress reporting for COPY command