Re: smgrzeroextend clarification

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: smgrzeroextend clarification
Date: 2023-05-10 12:58:09
Message-ID: CALj2ACU=wyfHFP7_mT75MhNaRAXFNG0390yiWWa7p6A1bGXePA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 10, 2023 at 3:20 PM Peter Eisentraut
<peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>
> I was looking at the new smgrzeroextend() function in the smgr API. The
> documentation isn't very extensive:
>
> /*
> * smgrzeroextend() -- Add new zeroed out blocks to a file.
> *
> * Similar to smgrextend(), except the relation can be extended by
> * multiple blocks at once and the added blocks will be filled with
> * zeroes.
> */
>
> The documentation of smgrextend() is:
>
> /*
> * smgrextend() -- Add a new block to a file.
> *
> * The semantics are nearly the same as smgrwrite(): write at the
> * specified position. However, this is to be used for the case of
> * extending a relation (i.e., blocknum is at or beyond the current
> * EOF). Note that we assume writing a block beyond current EOF
> * causes intervening file space to become filled with zeroes.
> */
>
> So if you want to understand what smgrzeroextend() does, you need to
> mentally combine the documentation of three different functions. Could
> we write documentation for each function that stands on its own? And
> document the function arguments, like what does blocknum and nblocks mean?

Why not?

> Moreover, the text "except the relation can be extended by multiple
> blocks at once and the added blocks will be filled with zeroes" doesn't
> make much sense as a differentiation, because smgrextend() does that as
> well.

Not exactly. smgrextend() doesn't write a zero-ed block on its own, it
writes the content that's passed to it via 'buffer'. It's just that
some of smgrextend() callers pass in a zero buffer. Whereas,
smgrzeroextend() writes zero-ed blocks on its own, something
smgrextend() called in a loop with zero-ed 'buffer'. Therefore, the
existing wording seems fine to me.

> AFAICT, the differences between smgrextend() and smgrzeroextend() are:
>
> 1. smgrextend() writes a payload block in addition to extending the
> file, smgrzeroextend() just extends the file without writing a payload.

I think how smgrzeroextend() extends the zeroed blocks is internal to
it, and mdzeroextend() happens to use fallocate (if available). I
think the existing wording around smgrzeroextend() seems fine to me.

> 2. smgrzeroextend() uses various techniques (posix_fallocate() etc.) to
> make sure the extended space is actually reserved on disk, smgrextend()
> does not.

It's not smgrzeroextend() per se, it is mdzeroextend() that uses
fallocate() if available.

Overall, +0.5 from me if we want to avoid comment traversals to
understand what these functions do and be more descriptive; we might
end up duplicating comments. But, I'm fine with ""except the relation
can be extended by multiple...." before smgrzeroextend().

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Lakhin 2023-05-10 13:00:01 Re: benchmark results comparing versions 15.2 and 16
Previous Message Jehan-Guillaume de Rorthais 2023-05-10 12:24:19 Re: Memory leak from ExecutorState context?