Re: explain analyze rows=%.0f

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: explain analyze rows=%.0f
Date: 2022-07-07 10:14:37
Message-ID: CALDaNm3inETBcHYyBeevfKCjaxB24N4BVhDiZVdBooEdtxi1_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jun 23, 2022 at 2:25 AM Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com> wrote:
>
>
>
> On Thu, Jun 23, 2022 at 1:04 AM David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>>
>> On Wed, Jun 22, 2022 at 12:11 PM Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com> wrote:
>>>
>>> On Thu, Jun 23, 2022 at 12:01 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>
>>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> > On Jun 2, 2009, at 9:41 AM, Simon Riggs <simon(at)2ndQuadrant(dot)com> wrote:
>>>> >> You're right that the number of significant digits already exceeds the
>>>> >> true accuracy of the computation. I think what Robert wants to see is
>>>> >> the exact value used in the calc, so the estimates can be checked more
>>>> >> thoroughly than is currently possible.
>>>>
>>>> > Bingo.
>>>>
>>>> Uh, the planner's estimate *is* an integer. What was under discussion
>>>> (I thought) was showing some fractional digits in the case where EXPLAIN
>>>> ANALYZE is outputting a measured row count that is an average over
>>>> multiple loops, and therefore isn't necessarily an integer. In that
>>>> case the measured value can be considered arbitrarily precise --- though
>>>> I think in practice one or two fractional digits would be plenty.
>>>>
>>>> regards, tom lane
>>>>
>>>>
>>> Hi,
>>> I was looking at the TODO list and found that the issue requires
>>> a quick fix. Attached is a patch which shows output like this.
>>
>>
>> Quick code review:
>>
>> + "actual rows=%.0f loops=%.0f": " rows=%.2f loops=%.0f",
>>
>> The leading space before the else block "rows" does not belong.
>>
>> There should be a space after the colon.
>>
> Thanks, David for your quick response. I have updated the patch.
>
>>
>> The word "actual" that you are dropping in the else block seems like it should belong - it is a header for the entire section not just a modifier for the word "rows". This is evidenced by the timing block verbiage where rows is standalone and the word actual comes before time. In short, only the format specifier should change under the current scheme. Both sections.
>>
>> - WRITE_FLOAT_FIELD(rows, "%.0f");
>> + WRITE_FLOAT_FIELD(rows, "%.2f");
>>
>> This one looks suspicious, though I haven't dug into the code to see exactly what all is being touched. That it doesn't have an nloops condition like everything else stands out.
>>
> I was also thinking about that, but I don't see any harm when we ultimately truncating that decimal
> at a latter stage of code in case of loop = 1.

Thanks for the patch.

1) There are some existing regression tests that are failing, you
should update the expect files accordingly for the same:
--- /home/vignesh/postgres/src/test/regress/expected/select_parallel.out
2022-05-18 20:51:46.874818044 +0530
+++ /home/vignesh/postgres/src/test/regress/results/select_parallel.out
2022-07-07 15:27:34.450440922 +0530
@@ -545,17 +545,17 @@
explain (analyze, timing off, summary off, costs off)
select count(*) from tenk1, tenk2 where tenk1.hundred > 1
and tenk2.thousand=0;
- QUERY PLAN
---------------------------------------------------------------------------
+ QUERY PLAN
+-----------------------------------------------------------------------------
Aggregate (actual rows=1 loops=1)
-> Nested Loop (actual rows=98000 loops=1)
-> Seq Scan on tenk2 (actual rows=10 loops=1)
Filter: (thousand = 0)
Rows Removed by Filter: 9990
- -> Gather (actual rows=9800 loops=10)
+ -> Gather (actual rows=9800.00 loops=10)
Workers Planned: 4
Workers Launched: 4
- -> Parallel Seq Scan on tenk1 (actual rows=1960 loops=50)
+ -> Parallel Seq Scan on tenk1 (actual rows=1960.00 loops=50)
Filter: (hundred > 1)

test select_parallel ... FAILED 744 ms
partition_prune ... FAILED 861 ms
explain ... FAILED 134 ms
memoize ... FAILED 250 ms

2) This change is not required as part of this patch:
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -122,7 +122,7 @@ bool bsysscan = false;
* lookups as fast as possible.
*/
static FullTransactionId XactTopFullTransactionId = {InvalidTransactionId};
-static int nParallelCurrentXids = 0;
+static int nParallelCurrentXids = 0;
static TransactionId *ParallelCurrentXids;

Regards,
Vignesh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message wangw.fnst@fujitsu.com 2022-07-07 10:20:45 RE: Perform streaming logical transactions by background workers and parallel apply
Previous Message Peter Eisentraut 2022-07-07 10:09:32 Re: [RFC] building postgres with meson -v9