| From: | "Matheus Alcantara" <matheusssilv97(at)gmail(dot)com> |
|---|---|
| To: | "Glauber Batista" <glauberrbatista(at)gmail(dot)com>, <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Autoprewarm workers terminated due to a segmentation fault |
| Date: | 2026-06-09 21:06:09 |
| Message-ID: | DJ4TOYZGE7M8.1U3BNIRMQVXQX@gmail.com |
| Lists: | pgsql-bugs |
Hi,
On Tue Jun 9, 2026 at 3:37 PM -03, Glauber Batista wrote:
> I have an issue with the autoprewarm workers segfaulting during the service
> restart. Sometimes, it successfully restarts after a few tries, but usually
> I need to remove the autoprewarm.blocks file. My setup consists of a
> primary server with two replicas and all of them present the same issue. I
> have been using this setup for several years with no issues, but since I
> upgraded to Postgres 18 I'm having it. This is a production database.
>
> Details:
>
> [ ... ]
>
> All that said, it seems there's a missing guard-clause at line 649. I
> didn't spend much time reading the code, but it's clearly accessing a
> position in the array that is not allocated.
>
Thank you for the report!
I've managed to reproduce the issue with the following:
create table test_warm (id int, data text);
-- insert enough rows for the index to span many pages
insert into test_warm select g, repeat('a', 100) from generate_series(1, 5000000) g;
create index warm_idx on test_warm(id);
-- force read the index entirely into shared_buffers
select count(*) from test_warm where id > 0;
After that, running pg_ctl stop followed by pg_ctl start fails with the following logs:
2026-06-09 17:47:40.924 -03 [23025] LOG: shutting down
2026-06-09 17:47:40.925 -03 [23025] LOG: checkpoint starting: shutdown fast
2026-06-09 17:47:41.033 -03 [23022] LOG: database system is shut down
2026-06-09 17:47:49.830 -03 [23172] LOG: starting PostgreSQL 19beta1 on aarch64-darwin, compiled by clang-17.0.0, 64-bit
2026-06-09 17:47:49.842 -03 [23172] LOG: database system is ready to accept connections
2026-06-09 17:47:49.917 -03 [23172] LOG: background worker "autoprewarm worker" (PID 23182) was terminated by signal 11: Segmentation fault: 11
2026-06-09 17:47:49.917 -03 [23172] LOG: terminating any other active server processes
2026-06-09 17:47:49.918 -03 [23172] LOG: all server processes terminated; reinitializing
Would the following be enough?
     /* Advance i past all the blocks just prewarmed. */
     i = p.pos;
+    if (i >= apw_state->prewarm_stop_idx)
+        break;
+
     blk = block_info[i];
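To make the idea behind the guard concrete, here is a minimal standalone sketch in C. It is not the autoprewarm.c code: DemoBlockInfo, demo_consume_run and demo_count_runs are hypothetical names, and the "prewarm" step is reduced to skipping over consecutive entries of the same file. It only shows the pattern, assuming the helper can return a position equal to the stop index.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the dumped block list. */
typedef struct DemoBlockInfo
{
    unsigned int filenumber;    /* which relation file the block belongs to */
    unsigned int blocknum;      /* block number within that file */
} DemoBlockInfo;

/*
 * Hypothetical stand-in for the prewarm step: skip every consecutive entry
 * belonging to "file", never moving past stop_idx, and return the new
 * position. The result may be equal to stop_idx.
 */
static size_t
demo_consume_run(const DemoBlockInfo *blocks, size_t pos,
                 size_t stop_idx, unsigned int file)
{
    while (pos < stop_idx && blocks[pos].filenumber == file)
        pos++;
    return pos;
}

/*
 * Walk blocks[0 .. stop_idx) and count the runs of entries that share a
 * file. The check after advancing i is the point of the sketch: without it,
 * blocks[i] would be read with i == stop_idx, one element past the range.
 */
static size_t
demo_count_runs(const DemoBlockInfo *blocks, size_t nblocks, size_t stop_idx)
{
    size_t        i = 0;
    size_t        runs = 0;
    DemoBlockInfo blk;

    assert(stop_idx <= nblocks);
    if (stop_idx == 0)
        return 0;

    blk = blocks[0];
    for (;;)
    {
        /* Advance i past all the entries just consumed. */
        i = demo_consume_run(blocks, i, stop_idx, blk.filenumber);
        runs++;

        /* Guard: stop before touching an entry outside the range. */
        if (i >= stop_idx)
            break;

        blk = blocks[i];
    }
    return runs;
}
```

If stop_idx equals the array length, the guard is what keeps the final blk = blocks[i] from reading past the end of the array.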
With the guard applied, after a fresh start I got very similar shared-buffer
hit counts on the first and second executions of the query, so I think
autoprewarm.blocks was reloaded successfully:
postgres=# explain(analyze, costs off, timing off, summary off) select count(*) from test_warm where id > 0;
QUERY PLAN
-----------------------------------------------------------------------------------
Finalize Aggregate (actual rows=1.00 loops=1)
Buffers: shared hit=15965 read=70242
-> Gather (actual rows=3.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15965 read=70242
-> Partial Aggregate (actual rows=1.00 loops=3)
Buffers: shared hit=15965 read=70242
-> Parallel Seq Scan on test_warm (actual rows=1666666.67 loops=3)
Filter: (id > 0)
Buffers: shared hit=15965 read=70242
postgres=# explain(analyze, costs off, timing off, summary off) SELECT count(*) FROM test_warm WHERE id > 0;
QUERY PLAN
-----------------------------------------------------------------------------------
Finalize Aggregate (actual rows=1.00 loops=1)
Buffers: shared hit=15970 read=70237
-> Gather (actual rows=3.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=15970 read=70237
-> Partial Aggregate (actual rows=1.00 loops=3)
Buffers: shared hit=15970 read=70237
-> Parallel Seq Scan on test_warm (actual rows=1666666.67 loops=3)
Filter: (id > 0)
Buffers: shared hit=15970 read=70237
Thoughts?
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
| Attachment | Content-Type | Size |
|---|---|---|
| 0001-Fix-out-of-bounds-access-in-autoprewarm-worker.patch | text/plain | 1.6 KB |