Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Zhang <david(dot)zhang(at)highgo(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths
Date: 2023-04-06 01:54:43
Message-ID: ZC4mY0EG7Homf5sU@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 05, 2023 at 04:35:37PM +0200, Matthias van de Meent wrote:
> I thought that the plan was to use int64 to skip checking for most
> overflows and to do a single check at the end in XLogRecordAssemble,
> so that the checking has minimal overhead in the performance-critical
> log record assembling path and reduced load on the branch predictor.

And that's the reason why your v11-0002 is better and simpler than the
v10-0001 I posted a few days ago.

+ if (regbuf->rdata_len + len > UINT16_MAX || len > UINT16_MAX)
+ ereport(ERROR,
+ (errmsg_internal("too much WAL data"),
+ errdetail_internal("Registering more than max %u bytes total to block %u: current %uB, adding %uB",
+ UINT16_MAX, block_id, regbuf->rdata_len, len)));

I was wondering for a few minutes about the second part of this
check.. But you are worried about the case where len is too large
that it would overflow rdata_len if calling XLogRegisterBufData() more
than once on the same block, if len is between
(UINT32_MAX-UINT16_MAX,UINT32_MAX) on the second call.

The extra errdetail_internal() could be tweaked a bit more, but I'm
also OK with your proposal, overall. One thing is "current %uB,
adding %uB" would be better using "bytes".

> One more issue that Andres was suggesting we'd fix was to allow XLog
> assembly separate from the actual XLog insertion:
> Currently you can't pre-assemble a record outside a critical section
> if the record must be inserted in a critical section, which makes e.g.
> commit records problematic due to the potentially oversized data
> resulting in ERRORs during record assembly. This would crash postgres
> because commit xlog insertion happens in a critical section. Having a
> pre-assembled record would greatly improve the ergonomics in that path
> and reduce the length of the critical path.
>
> I think it was something along the lines of the attached; 0001
> contains separated Commit/Abort record construction and insertion like
> Andres suggested,

I am honestly not sure whether we should complicate xloginsert.c this
way, but we could look at that for v17.

> 0002 does the size checks with updated error messages.

0002 can also be done before 0001, so I'd like to get that part
applied on HEAD before the feature freeze and close this thread. If
there are any objections, please feel free..
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-04-06 01:54:54 Re: cataloguing NOT NULL constraints
Previous Message Andres Freund 2023-04-06 01:46:16 Re: refactoring relation extension and BufferAlloc(), faster COPY