Re: Avoiding smgrimmedsync() during nbtree index builds

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Avoiding smgrimmedsync() during nbtree index builds
Date: 2022-01-17 17:22:07
Message-ID: 20220117172207.GB14051@telsasoft.com
Lists: pgsql-hackers

On Sun, Jan 16, 2022 at 02:25:59PM -0600, Justin Pryzby wrote:
> On Thu, Jan 13, 2022 at 09:52:55AM -0600, Justin Pryzby wrote:
> > This is failing on windows CI when I use initdb --data-checksums, as attached.
> >
> > https://cirrus-ci.com/task/5612464120266752
> > https://api.cirrus-ci.com/v1/artifact/task/5612464120266752/regress_diffs/src/test/regress/regression.diffs
> >
> > +++ c:/cirrus/src/test/regress/results/bitmapops.out 2022-01-13 00:47:46.704621200 +0000
> > ..
> > +ERROR: could not read block 0 in file "base/16384/30310": read only 0 of 8192 bytes
>
> The failure isn't consistent, so I double checked my report. I have some more
> details:
>
> The problem occurs maybe only ~25% of the time.
>
> The issue is in the 0001 patch.
>
> data-checksums isn't necessary to hit the issue.
>
> errlocation says: LOCATION: mdread, md.c:686 (the only place the error
> exists)
>
> With Andres' windows crash patch, I obtained a backtrace - attached.
> https://cirrus-ci.com/task/5978171861368832
> https://api.cirrus-ci.com/v1/artifact/task/5978171861368832/crashlog/crashlog-postgres.exe_0fa8_2022-01-16_02-54-35-291.txt
>
> Maybe it's a race condition or synchronization problem that tends not to be
> hit anywhere else.

I meant to say that I had not seen this issue anywhere but Windows.

But now, by chance, I still had the 0001 patch in my tree, and hit the same
issue on Linux:

https://cirrus-ci.com/task/4550618281934848
+++ /tmp/cirrus-ci-build/src/bin/pg_upgrade/tmp_check/regress/results/tuplesort.out 2022-01-17 16:06:35.759108172 +0000
EXPLAIN (COSTS OFF)
SELECT id, noabort_increasing, noabort_decreasing FROM abbrev_abort_uuids ORDER BY noabort_increasing LIMIT 5;
+ERROR: could not read block 0 in file "base/16387/t3_36794": read only 0 of 8192 bytes
