Re: Progress indication prototype

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Progress indication prototype
Date: 2010-09-16 06:47:51
Message-ID: AANLkTi=XrXiaT5ecUsQ-tu09D_RLOShQKRPaDfoipRzJ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 17, 2010 at 2:19 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Here is a small prototype for a query progress indicator.

I read and tested the patch. Here are comments to the code itself.

- Oid of pg_stat_get_backend_progress() must be changed because we are using
the id for another function.
- One complier warning:
copy.c:1702: warning: ‘file_size’ may be used uninitialized in this function
- We can move the division "work_done/work_total" to outside of
st_changecount++ block.

> Past discussions seemed to indicate that the best place to report this
> would be in pg_stat_activity.

Agreed. BTW, "query_progress" column shows NaN if progress
counter is unavailable, but NULL would be better.

> VACUUM (lazy) (also autovacuum), table-rewriting ALTER TABLE
We could also support VACUUM FULL, CLUSTER, CREATE INDEX and REINDEX.

> COPY out from table, COPY in from file,
COPY FROM STDIN shows Infinity, but NULL might be better, too.

> a very simple query.
SELECT * FROM tbl;
can report reasonable progress, but
SELECT count(*) FROM tbl;
cannot, because planned_tuple_count of the aggregation is 1.
I hope better solutions for the grouping case because they are used
in complex queries, where the progress counter is eagerly wanted.

> - Are the interfaces OK?

I like the new column in pg_stat_activity to "pull" the progress.
In addition, as previously discussed, we could also have "push"
notifications; Ex. GUC parameter "notice_per_progress" (0.0-1.0),
or periodical NOTIFY messages.

> - Is this going to be too slow to be useful?
> - Should there be a separate switch to turn it on (currently
> track_activities)?

I think we can always track the counters because shared memory
based counters are lightweight enough.

> - How to handle commands that process multiple tables?  For example,
> lazy VACUUM on a single table is pretty easy to track (count the block
> numbers), but what about a database-wide lazy VACUUM?

Not only a database-wide lazy VACUUM but also some of maintenance
commands have non-linear progress -- Progress of index scans in VACUUM
is not linear. ALTER TABLE could have REINDEX after table rewrites.

We might need to have arbitrary knowledges for the non-uniform commands;
For example, "CREATE INDEX assigns 75% of the progress for table scan,
and 25% for the final merging of tapes".

--
Itagaki Takahiro

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Itagaki Takahiro 2010-09-16 06:58:36 Re: patch: SQL/MED(FDW) DDL
Previous Message Heikki Linnakangas 2010-09-16 06:47:43 Re: bad variable subst after "AS"