Re: pg15b3: crash in paralell vacuum

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Amit Kapila <akapila(at)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: pg15b3: crash in paralell vacuum
Date: 2022-08-18 14:04:15
Message-ID: 20220818140415.GN26426@telsasoft.com
Lists: pgsql-hackers

On Thu, Aug 18, 2022 at 08:34:06AM -0500, Justin Pryzby wrote:
> Unfortunately, it looks like the RPM packages are compiled with -O2, so this is
> of limited use. So I'll be back shortly with more...

#3 0x00000000006874f1 in parallel_vacuum_process_all_indexes (pvs=0x25bdce0, num_index_scans=0, vacuum=vacuum@entry=false) at vacuumparallel.c:611
611 Assert(indstats->status == PARALLEL_INDVAC_STATUS_INITIAL);

(gdb) p *pvs
$1 = {pcxt = 0x25bc1e0, indrels = 0x25bbf70, nindexes = 8, shared = 0x7fc5184393a0, indstats = 0x7fc5184393e0, dead_items = 0x7fc5144393a0, buffer_usage = 0x7fc514439280, wal_usage = 0x7fc514439240,
will_parallel_vacuum = 0x266d818, nindexes_parallel_bulkdel = 5, nindexes_parallel_cleanup = 0, nindexes_parallel_condcleanup = 5, bstrategy = 0x264f120, relnamespace = 0x0, relname = 0x0, indname = 0x0,
status = PARALLEL_INDVAC_STATUS_INITIAL}

(gdb) p *indstats
$2 = {status = 11, parallel_workers_can_process = false, istat_updated = false, istat = {num_pages = 0, estimated_count = false, num_index_tuples = 0, tuples_removed = 0, pages_newly_deleted = 0, pages_deleted = 1,
pages_free = 0}}

(gdb) bt f
...
#3 0x00000000006874f1 in parallel_vacuum_process_all_indexes (pvs=0x25bdce0, num_index_scans=0, vacuum=vacuum@entry=false) at vacuumparallel.c:611
indstats = 0x7fc5184393e0
i = 0
nworkers = 2
new_status = PARALLEL_INDVAC_STATUS_NEED_CLEANUP
__func__ = "parallel_vacuum_process_all_indexes"
#4 0x0000000000687ef0 in parallel_vacuum_cleanup_all_indexes (pvs=<optimized out>, num_table_tuples=num_table_tuples@entry=409149, num_index_scans=<optimized out>, estimated_count=estimated_count@entry=true)
at vacuumparallel.c:486
No locals.
#5 0x00000000004f80b8 in lazy_cleanup_all_indexes (vacrel=vacrel@entry=0x25bc510) at vacuumlazy.c:2679
reltuples = 409149
estimated_count = true
#6 0x00000000004f884a in lazy_scan_heap (vacrel=vacrel@entry=0x25bc510) at vacuumlazy.c:1278
rel_pages = 67334
blkno = 67334
next_unskippable_block = 67334
next_failsafe_block = 0
next_fsm_block_to_vacuum = 0
dead_items = 0x7fc5144393a0
vmbuffer = 1300
next_unskippable_allvis = true
skipping_current_range = false
initprog_index = {0, 1, 5}
initprog_val = {1, 67334, 11184809}
__func__ = "lazy_scan_heap"
#7 0x00000000004f925f in heap_vacuum_rel (rel=0x7fc52df6b820, params=0x7ffd74f74620, bstrategy=0x264f120) at vacuumlazy.c:534
vacrel = 0x25bc510
verbose = true
instrument = <optimized out>
aggressive = false
skipwithvm = true
frozenxid_updated = false
minmulti_updated = false
OldestXmin = 32759288
FreezeLimit = 4277726584
OldestMxact = 157411
MultiXactCutoff = 4290124707
orig_rel_pages = 67334
new_rel_pages = <optimized out>
new_rel_allvisible = 4
ru0 = {tv = {tv_sec = 1660830451, tv_usec = 473980}, ru = {ru_utime = {tv_sec = 0, tv_usec = 317891}, ru_stime = {tv_sec = 1, tv_usec = 212372}, {ru_maxrss = 74524, __ru_maxrss_word = 74524}, {ru_ixrss = 0,
__ru_ixrss_word = 0}, {ru_idrss = 0, __ru_idrss_word = 0}, {ru_isrss = 0, __ru_isrss_word = 0}, {ru_minflt = 18870, __ru_minflt_word = 18870}, {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 0,
__ru_nswap_word = 0}, {ru_inblock = 1124750, __ru_inblock_word = 1124750}, {ru_oublock = 0, __ru_oublock_word = 0}, {ru_msgsnd = 0, __ru_msgsnd_word = 0}, {ru_msgrcv = 0, __ru_msgrcv_word = 0}, {ru_nsignals = 0,
__ru_nsignals_word = 0}, {ru_nvcsw = 42, __ru_nvcsw_word = 42}, {ru_nivcsw = 35, __ru_nivcsw_word = 35}}}
starttime = 714145651473980
startreadtime = 0
startwritetime = 0
startwalusage = {wal_records = 2, wal_fpi = 0, wal_bytes = 421}
StartPageHit = 50
StartPageMiss = 0
StartPageDirty = 0
errcallback = {previous = 0x0, callback = 0x4f5f41 <vacuum_error_callback>, arg = 0x25bc510}
indnames = 0x266d838
__func__ = "heap_vacuum_rel"

This is a qemu VM which (full disclosure) has crashed a few times recently due
to OOM. This is probably a postgres bug, but conceivably it's being tickled by
bad data (although a VM crash shouldn't cause that either, once recovery has
completed). This instance was also pg_upgraded from v14 (and earlier
versions) to v15b1 and then b2, so it's conceivable that there are odd data
pages that beta3 itself wouldn't have written. But that doesn't seem to be the
issue here anyway.

--
Justin
