Re: BUG #14758: Segfault with logical replication on a function index

From: Scott Milliken <scott(at)deltaex(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14758: Segfault with logical replication on a function index
Date: 2017-07-28 19:35:27
Message-ID: CAM+Hv14R2aDmT45q8WWAspsVUC2+h+N9E=Pa-wr2aA+-QqVKfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Thank you Masahiko! I've tested and confirmed that this patch fixes the
problem.

On Fri, Jul 28, 2017 at 3:07 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Mon, Jul 24, 2017 at 4:22 PM, <scott(at)deltaex(dot)com> wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference: 14758
> > Logged by: Scott Milliken
> > Email address: scott(at)deltaex(dot)com
> > PostgreSQL version: 10beta2
> > Operating system: Linux 4.10.0-27-generic #30~16.04.2-Ubuntu S
> > Description:
> >
> > I'm testing logical replication on 10beta2, and found a segfault that I
> can
> > reliably reproduce with an index on a not-actually immutable function.
> >
> > Here's the function in question:
> >
> > ```
> > CREATE OR REPLACE FUNCTION public.immutable_random(integer)
> > RETURNS double precision
> > LANGUAGE sql
> > IMMUTABLE
> > AS $function$SELECT random();
> > $function$;
> > ```
> >
> > It's not actually immutable since it's calling random (a hack to get an
> > efficient random sort on a table).
> >
> > (Aside: I'd understand if it errored on creation of the index, but would
> > really prefer to keep using this instead of tablesample because it's
> fast,
> > deterministic, and doesn't have sampling biases like the SYSTEM
> sampling.)
> >
> >
> > Here's full reproduction instructions:
> >
> >
> > Primary:
> > ```
> > mkdir -p /tmp/test-seg0
> > PGPORT=5301 initdb -D /tmp/test-seg0
> > echo "wal_level = logical" >> /tmp/test-seg0/postgresql.conf
> > PGPORT=5301 pg_ctl -D /tmp/test-seg0 start
> > for (( ; ; )); do if pg_isready -d postgres -p 5301; then break; fi;
> sleep
> > 1; done
> > psql -p 5301 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER
> > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
> > createdb -p 5301 -E utf8 test
> >
> > psql -p 5301 -U test test -c "CREATE TABLE testtbl (id int, name text);"
> > psql -p 5301 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
> > testtbl_pkey PRIMARY KEY (id);"
> > psql -p 5301 -U test test -c "CREATE PUBLICATION testpub FOR TABLE
> > testtbl;"
> > psql -p 5301 -U test test -c "INSERT INTO testtbl (id, name) VALUES (1,
> > 'a');"
> > ```
> >
> > Secondary:
> > ```
> > mkdir -p /tmp/test-seg1
> > PGPORT=5302 initdb -D /tmp/test-seg1
> > PGPORT=5302 pg_ctl -D /tmp/test-seg1 start
> > for (( ; ; )); do if pg_isready -d postgres -p 5302; then break; fi;
> sleep
> > 1; done
> > psql -p 5302 postgres -c "CREATE USER test WITH PASSWORD 'test' SUPERUSER
> > CREATEDB CREATEROLE LOGIN REPLICATION BYPASSRLS;"
> > createdb -p 5302 -E utf8 test
> >
> > psql -p 5302 -U test test -c "CREATE TABLE testtbl (id int, name text);"
> > psql -p 5302 -U test test -c "ALTER TABLE testtbl ADD CONSTRAINT
> > testtbl_pkey PRIMARY KEY (id);"
> > psql -p 5302 -U test test -c 'CREATE FUNCTION
> > public.immutable_random(integer) RETURNS double precision LANGUAGE sql
> > IMMUTABLE AS $function$ SELECT random(); $function$'
> > psql -p 5302 -U test test -c "CREATE INDEX ix_testtbl_random ON testtbl
> > USING btree (immutable_random(id));"
> > psql -p 5302 -U test test -c "CREATE SUBSCRIPTION test0_testpub
> CONNECTION
> > 'port=5301 user=test dbname=test' PUBLICATION testpub;"
> > ```
> >
> > The secondary crashes with a segfault:
> >
> > ```
> > 2017-07-23 23:55:37.961 PDT [4823] LOG: logical replication table
> > synchronization worker for subscription "test0_testpub", table "testtbl"
> > has started
> > 2017-07-23 23:55:38.244 PDT [4758] LOG: worker process: logical
> replication
> > worker for subscription 16396 sync 16386 (PID 4823) was terminated by
> signal
> > 11: Segmentation fault
> > 2017-07-23 23:55:38.244 PDT [4758] LOG: terminating any other active
> server
> > processes
> > 2017-07-23 23:55:38.245 PDT [4763] WARNING: terminating connection
> because
> > of crash of another server process
> > 2017-07-23 23:55:38.245 PDT [4763] DETAIL: The postmaster has commanded
> > this server process to roll back the current transaction and exit,
> because
> > another server process exited
> > abnormally and possibly corrupted shared memory.
> > 2017-07-23 23:55:38.245 PDT [4763] HINT: In a moment you should be able
> to
> > reconnect to the database and repeat your command.
> > 2017-07-23 23:55:38.247 PDT [4758] LOG: all server processes terminated;
> > reinitializing
> > 2017-07-23 23:55:38.256 PDT [4826] LOG: database system was interrupted;
> > last known up at 2017-07-23 23:55:36 PDT
> > 2017-07-23 23:55:38.809 PDT [4826] LOG: database system was not properly
> > shut down; automatic recovery in progress
> > 2017-07-23 23:55:38.812 PDT [4826] LOG: redo starts at 0/173AEA0
> > 2017-07-23 23:55:38.815 PDT [4826] LOG: invalid record length at
> 0/17B50B0:
> > wanted 24, got 0
> > 2017-07-23 23:55:38.815 PDT [4826] LOG: redo done at 0/17B5070
> > 2017-07-23 23:55:38.815 PDT [4826] LOG: last completed transaction was
> at
> > log time 2017-07-23 23:55:37.962957-07
> > ```
> >
>
> Thank you for the reporting and precise reproducing steps!
> I could reproduced this issue and it seems to me that the cause of
> this is that the table sync worker didn't get a snapshot before
> starting table copy. Attached patch fixes this problem.
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Rick Otten 2017-07-30 12:01:16 Re: signal 11 segfaults with parallel workers
Previous Message Tom Lane 2017-07-28 19:15:27 Re: behavior difference in character literal vs national character literal

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Sharma 2017-07-28 19:47:38 Re: pl/perl extension fails on Windows
Previous Message Tom Lane 2017-07-28 19:26:25 Re: map_partition_varattnos() and whole-row vars