Re: Use streaming read API in ANALYZE

From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Use streaming read API in ANALYZE
Date: 2024-02-28 11:42:34
Message-ID: CAN55FZ1T=YUhVbq6i9No76RY+4APZ-uVnwJhUzko_3wW30ReJw@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Mon, 19 Feb 2024 at 18:13, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com> wrote:
>
> I worked on using the currently proposed streaming read API [1] in ANALYZE. The patch is attached. 0001 contains the not-yet-merged streaming read API changes, which can be applied on top of master; 0002 is the actual ANALYZE change.
>
> The blocks to analyze are obtained by using the streaming read API now.
>
> - Since the streaming read API already does prefetching, I removed the #ifdef USE_PREFETCH code from acquire_sample_rows().
>
> - Changed 'while (BlockSampler_HasMore(&bs))' to 'while (nblocks)' because the prefetch mechanism in the streaming read API will advance 'bs' before returning buffers.
>
> - Removed BlockNumber and BufferAccessStrategy from the declaration of scan_analyze_next_block(); pgsr (PgStreamingRead) is passed instead.
>
> I counted the syscalls made while analyzing a ~5GB table. The patched version made ~1300 fewer read calls (28594 vs. 29850 pread64 calls).
>
> Patched:
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  39.67    0.012128           0     29809           pwrite64
>  36.96    0.011299           0     28594           pread64
>  23.24    0.007104           0     27611           fadvise64
>
> Master (21a71648d3):
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  38.94    0.016457           0     29816           pwrite64
>  36.79    0.015549           0     29850           pread64
>  23.91    0.010106           0     29848           fadvise64
>
>
> Any kind of feedback would be appreciated.
>
> [1]: https://www.postgresql.org/message-id/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

A new version of the streaming read API [1] has been posted. I updated
the streaming read API changes patch (0001) accordingly; the patch that
uses the streaming read API in ANALYZE (0002) remains the same. This
should make it easier to review, as the set can be applied on top of
master.

[1]: https://www.postgresql.org/message-id/CA%2BhUKGJtLyxcAEvLhVUhgD4fMQkOu3PDaj8Qb9SR_UsmzgsBpQ%40mail.gmail.com
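
To illustrate why the loop condition in the quoted message changed from
'while (BlockSampler_HasMore(&bs))' to 'while (nblocks)', here is a
minimal standalone sketch. It is not the patch code: ToySampler,
ToyStream, LOOKAHEAD and both process_* functions are hypothetical
stand-ins invented for this example, and the real streaming read API
looks different. It only models the idea that a stream which reads
ahead drains the sampler before the caller has consumed every block.

```c
#include <assert.h>
#include <stdbool.h>

#define LOOKAHEAD 4			/* hypothetical look-ahead distance */
#define INVALID_BLOCK (-1)

/* Toy stand-in for a block sampler: yields blocks 0 .. n-1. */
typedef struct ToySampler
{
	int			next;
	int			n;
} ToySampler;

static bool
sampler_has_more(const ToySampler *bs)
{
	return bs->next < bs->n;
}

/* Toy stand-in for a read stream that pulls blocks ahead of the caller. */
typedef struct ToyStream
{
	ToySampler *bs;
	int			queue[LOOKAHEAD];
	int			head;
	int			count;
} ToyStream;

static int
stream_next(ToyStream *s)
{
	int			blk;

	/* Refill the look-ahead queue; this advances the sampler early. */
	while (s->count < LOOKAHEAD && sampler_has_more(s->bs))
	{
		s->queue[(s->head + s->count) % LOOKAHEAD] = s->bs->next++;
		s->count++;
	}
	assert(s->count <= LOOKAHEAD);

	if (s->count == 0)
		return INVALID_BLOCK;

	blk = s->queue[s->head];
	s->head = (s->head + 1) % LOOKAHEAD;
	s->count--;
	return blk;
}

/* Old-style loop: stops as soon as the sampler itself is exhausted. */
static int
process_while_has_more(int nblocks)
{
	ToySampler	bs = {0, nblocks};
	ToyStream	s = {&bs, {0}, 0, 0};
	int			processed = 0;

	while (sampler_has_more(&bs))
	{
		if (stream_next(&s) == INVALID_BLOCK)
			break;
		processed++;
	}
	return processed;		/* blocks still queued in the stream are lost */
}

/* New-style loop: counts down the number of blocks to be sampled. */
static int
process_while_nblocks(int nblocks)
{
	ToySampler	bs = {0, nblocks};
	ToyStream	s = {&bs, {0}, 0, 0};
	int			processed = 0;

	while (nblocks > 0)
	{
		if (stream_next(&s) == INVALID_BLOCK)
			break;
		processed++;
		nblocks--;
	}
	return processed;
}
```

In this toy model, with 10 blocks and a look-ahead of 4, the first loop
stops after 7 blocks because the sampler is already empty while 3 blocks
are still queued in the stream; the second loop processes all 10.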

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachment                                                        Content-Type  Size
v2-0001-Streaming-read-API-changes-that-are-not-committed.patch   text/x-diff   55.0 KB
v2-0002-Use-streaming-read-API-in-ANALYZE.patch                   text/x-diff   8.1 KB
