Re: Improvements and additions to COPY progress reporting

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Josef Šimánek <josef(dot)simanek(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Improvements and additions to COPY progress reporting
Date: 2021-02-20 06:09:22
Message-ID: CALj2ACUHvsTm3VFa3_baMK2y2jVYbA6Xff302Qesa1njRznUdg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 19, 2021 at 2:34 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> > On Mon, 15 Feb 2021 at 17:07, Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >> - The blocks in copyfrom.c/copyto.c should be reworked - I don't think
> >> we do this in our codebase.
> >
> > I saw this being used in (re)index progress reporting, that's where I
> > took inspiration from. It has been fixed in the attached version.
> >
>
> Hmmm, good point. I haven't looked at the other places reporting
> progress and I only ever saw this pattern in old code. I kinda dislike
> these blocks, but admittedly that's a rather subjective view. So if other
> similar places do this when reporting progress, this probably should
> too. What's your opinion on this?

Actually, the code base mixes two styles for declaring the variables
passed to pgstat_progress_update_multi_param. For instance, in
lazy_scan_heap and ReindexRelationConcurrently, the variables are
declared at the start of the function, whereas in _bt_spools_heapscan,
index_build, validate_index and perform_base_backup, they are
declared within a separate block.

IMO, we can declare the arrays at the start of the functions,
i.e. the way it's done in v8-0001, because we can then extend them to
report other parameters (maybe in the future).
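
To make the two declaration styles concrete, here is a minimal standalone
sketch. It is not taken from the patch: demo_progress_update_multi_param is
a hypothetical stand-in that only mirrors the general (nparam, index, val)
shape of pgstat_progress_update_multi_param, and the DEMO_* slot names and
the demo_progress array are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter slots; the real progress constants live in PostgreSQL headers. */
enum { DEMO_PARAM_COMMAND = 0, DEMO_PARAM_IO_TYPE = 1, DEMO_NUM_PARAMS = 2 };

/* Stand-in for the backend's progress counters (assumption for this sketch). */
static int64_t demo_progress[DEMO_NUM_PARAMS];

/* Stand-in that copies each value into the slot named by the matching index. */
static void
demo_progress_update_multi_param(int nparam, const int *index, const int64_t *val)
{
    assert(nparam > 0);
    for (int i = 0; i < nparam; i++)
        demo_progress[index[i]] = val[i];
}

/* Style 1: arrays declared at the start of the function. */
static void
report_at_function_start(int64_t command, int64_t io_type)
{
    const int   index[] = {DEMO_PARAM_COMMAND, DEMO_PARAM_IO_TYPE};
    int64_t     val[2];

    /* ... other setup work would normally happen here ... */

    val[0] = command;
    val[1] = io_type;
    demo_progress_update_multi_param(2, index, val);
}

/* Style 2: arrays declared in a separate block next to the call. */
static void
report_in_nested_block(int64_t command, int64_t io_type)
{
    /* ... other setup work would normally happen here ... */

    {
        const int     index[] = {DEMO_PARAM_COMMAND, DEMO_PARAM_IO_TYPE};
        const int64_t val[] = {command, io_type};

        demo_progress_update_multi_param(2, index, val);
    }
}
```

Both functions leave the same values in demo_progress; the only difference
is where the arrays live. With the first style, adding one more reported
parameter later means extending the two arrays at the top of the function.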

> >> - I find the "io_target" name misleading, because in some cases it's
> >> actually the *source*.
> >
> > Yes, I was also not quite happy with this, but couldn't find a better
> > one at the point of writing the initial patchset. Would
> > "io_operations", "io_port", "operates_through" or "through" maybe be
> > better?
> >
>
> No idea. Let's see if someone has a better proposal ...

For COPY FROM a column named "source_type" makes sense, and for COPY TO
a column named "destination_type" does. To have a combined column name
for both, how about naming that column "io_type"?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
