Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: ienieghapheoghaiwida(at)xff(dot)cz, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
Date: 2019-11-21 10:39:40
Message-ID: 20191121103940.gpadc7xmssl63sad@development
Lists: pgsql-bugs

On Thu, Nov 21, 2019 at 01:14:18AM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference: 16129
>Logged by: Ondrej Jirman
>Email address: ienieghapheoghaiwida(at)xff(dot)cz
>PostgreSQL version: 12.1
>Operating system: Arch Linux
>Description:
>
>Hello,
>
>I've upgraded my main PostgreSQL cluster from 11.5 to 12.1 via the pg_dumpall
>method, and after a while I started getting a segfault in the logical
>replication worker.
>
>My setup is fairly vanilla, non-default options:
>
>shared_buffers = 256MB
>work_mem = 512MB
>temp_buffers = 64MB
>maintenance_work_mem = 4GB
>effective_cache_size = 16GB
>max_logical_replication_workers = 30
>max_replication_slots = 30
>max_worker_processes = 30
>wal_level = logical
>
>I have several databases that I subscribe to from this database cluster
>using logical replication.
>
>Replication of one of my databases (running on an ARMv7 machine) started
>segfaulting on the subscriber side (x86_64) like this:
>
>#0 0x00007fc259739917 in __memmove_sse2_unaligned_erms () from
>/usr/lib/libc.so.6
>#1 0x000055d033e93d44 in memcpy (__len=620701425, __src=<optimized out>,
>__dest=0x55d0356da804) at /usr/include/bits/string_fortified.h:34
>#2 tts_virtual_materialize (slot=0x55d0356da3b8) at execTuples.c:235
>#3 0x000055d033e94d32 in ExecFetchSlotHeapTuple
>(slot=slot@entry=0x55d0356da3b8, materialize=materialize@entry=true,
>shouldFree=shouldFree@entry=0x7fff0e7cf387) at execTuples.c:1624

Hmmm, so it's failing on this memcpy() in tts_virtual_materialize:

    else
    {
        Size data_length = 0;

        data = (char *) att_align_nominal(data, att->attalign);
        data_length = att_addlength_datum(data_length, att->attlen, val);

        memcpy(data, DatumGetPointer(val), data_length);

        slot->tts_values[natt] = PointerGetDatum(data);
        data += data_length;
    }

The question is, which of the pointers is bogus. You seem to already
have a core file, so can you inspect the variables in frame #2? I think
especially

p *slot
p natt
p val
p *att

would be interesting to see.
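
To illustrate why val and *att matter: the number of bytes handed to
memcpy() is derived from att->attlen and, for variable-length types, from
whatever val points at. Below is a rough, simplified sketch of that idea. It
is not the real att_addlength_datum() macro; sketch_datum_length() is a
hypothetical helper, and it assumes the length header is a plain uint32,
which is not PostgreSQL's actual varlena encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified model of how a by-reference datum's length is
 * determined. NOT the PostgreSQL macro: the "header" here is assumed to be
 * a plain uint32 byte count, unlike the real varlena format.
 */
static size_t
sketch_datum_length(int attlen, const void *ptr)
{
    /* attlen follows the pg_attribute convention: >0, -1 or -2 */
    assert(attlen > 0 || attlen == -1 || attlen == -2);

    if (attlen > 0)
        return (size_t) attlen;     /* fixed-width type: pointer not read */

    if (attlen == -2)
        return strlen((const char *) ptr) + 1;  /* C string plus terminator */

    /* attlen == -1: the length comes from the bytes stored at ptr */
    uint32_t hdr;

    memcpy(&hdr, ptr, sizeof(hdr));
    return (size_t) hdr;
}
```

With a fixed-width attribute the pointer is never dereferenced to compute the
length, but with attlen = -1 a pointer into garbage yields a garbage length,
which would be consistent with the huge __len seen in frame #1. Equally, a
wrong attlen in *att would send a valid datum down the wrong branch.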

Also, what does the replicated schema look like? Can we see the table
definitions?

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
