Re: ANALYZE command progress checker

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: vinayak <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Steele <david(at)pgmasters(dot)net>, David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE command progress checker
Date: 2017-03-24 19:30:12
Message-ID: CA+TgmoZ6DFfSe=C+ZmHu7SGLht+oncnHRgAz8Y5fnbj8o7WcJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On Fri, Mar 24, 2017 at 3:41 AM, vinayak
<Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I have updated the patch.

You can't change the definition of AcquireSampleRowsFunc without
updating the documentation in fdwhandler.sgml, but I think I don't
immediately understand why that's a thing we want to do at all if
neither file_fdw nor postgres_fdw are going to use the additional
argument. It seems to be entirely pointless because the new value
being passed down is always zero and even if the callee were to update
it, do_analyze_rel() would just ignore the change on return. Am I
missing something here? Also, the whitespace-only changes to fdwapi.h
related to AcquireSampleRowsFunc going to get undone by pgindent, so
even if there's some reason to change that you should leave the lines
that don't need changing untouched.

+ /* Reset rows processed to zero for the next column */
+ pgstat_progress_update_param(PROGRESS_ANALYZE_NUM_ROWS_PROCESSED,
0);

This seems like a bad design. If you see a value other than zero in
this field, you still won't know how much progress analyze has made,
because you don't know which column you're processing. I also feel
like you're not paying enough attention to a key point here that I
made in my last review, which is that you need to look at where
ANALYZE actually spends most of the time. If the process of computing
the per-row statistics consumes significant CPU time, then maybe
there's some point in trying to instrument it, but does it? Unless
you know the answer to that question, you don't know whether there's
even a point to trying to instrument this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2017-03-24 19:40:24 Re: pg_stat_wal_write statistics view
Previous Message Alvaro Herrera 2017-03-24 19:21:18 Re: Re: [COMMITTERS] pgsql: Implement multivariate n-distinct coefficients