Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row

From: Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row
Date: 2022-06-30 09:40:07
Message-ID: CANugjhtk-9WaLFG+oYL+FKwEqGej4zAEXX2Ac2C9vCidFgzNKQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 30 Jun 2022 at 14:27, Bharath Rupireddy <
bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:

> On Thu, Jun 30, 2022 at 1:54 PM Hamid Akhtar <hamid(dot)akhtar(at)gmail(dot)com>
> wrote:
> >
> >> Do we have any
> >> difference in the execution times for the above query vs the new
> >> function introduced in the v1 patch? If there's not much difference, I
> >> would suggest adding an SQL function around the generate_series
> >> approach in the pageinspect extension for better and easier usability.
> >
> >
> > Based on some basic SQL execution time comparison of the two approaches,
> I see that the API change, on average, is around 40% faster than the SQL.
> >
> > CREATE TABLE test2 AS (SELECT generate_series(1, 5000000) AS col1);
> > CREATE INDEX test2_col1_idx ON test2(col1);
> >
> > EXPLAIN ANALYZE
> > SELECT * FROM bt_page_stats('test2_col1_idx', 1, 5000);
> >
> > EXPLAIN ANALYZE
> > SELECT * FROM GENERATE_SERIES(1, 5000) blkno,
> bt_page_stats('test2_col1_idx',blkno::int);
> >
> > For me, the API change returns back the data in around 74ms whereas the
> SQL returns it in 102ms. So considering this and as you mentioned, the
> alternative may not be that obvious to everyone, it is a fair improvement.
>
> I'm wondering what happens with a bit of huge data and different test
> cases each test case executed, say, 2 or 3 times.
>
> If the difference in execution times is always present, then the API
> approach or changing the core function would make more sense.
>

Technically, AFAIK, the performance difference will always be there.
Firstly, in the API change, there is no additional overhead of the
generate_series function. Additionally, with API change, looping over the
pages has a smaller overhead when compared with the overhead of the SQL
approach.

>
> Regards,
> Bharath Rupireddy.
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-06-30 09:43:21 Re: [PATCH] Log details for client certificate failures
Previous Message Bharath Rupireddy 2022-06-30 09:27:32 Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row