Re: Minimal logical decoding on standbys

From: tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimal logical decoding on standbys
Date: 2019-03-13 15:10:03
Message-ID: 53656d0c-1868-86e8-54a6-216ce74b353a@enterprisedb.com
Lists: pgsql-hackers

Hi,

I am getting a server crash on the standby while executing the
pg_logical_slot_get_changes function. Steps to reproduce:

Master cluster (./initdb -D master):
set wal_level='hot_standby' in the master/postgresql.conf file,
start the server, connect with psql and create a physical
replication slot (SELECT * from
pg_create_physical_replication_slot('p1');)

Standby: run pg_basebackup using --slot 'p1' (./pg_basebackup -D slave/ -R
--slot p1 -v),
set wal_level='logical', hot_standby_feedback=on and
primary_slot_name='p1' in the slave/postgresql.conf file,
start the server, connect with psql and create a logical
replication slot (SELECT * from
pg_create_logical_replication_slot('t','test_decoding');)

Run pgbench (./pgbench -i -s 10 postgres) on the master, then select from
pg_logical_slot_get_changes on the slave database.
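
For convenience, the steps above can be sketched as a shell script. This is
only a sketch under assumptions that are not part of the report: the
PostgreSQL binaries are on PATH, the master uses the default port 5432, and
the standby is given port 5555 so both can run on one host. The helper names
(write_master_conf, write_standby_conf, run_repro) are made up for
illustration.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Assumed, not from the report:
# binaries on PATH, master on port 5432, standby on port 5555.
set -eu

# Append the master-side setting to DIR/postgresql.conf.
write_master_conf() {
    printf "wal_level = 'hot_standby'\n" >> "$1/postgresql.conf"
}

# Append the standby-side settings to DIR/postgresql.conf. The base backup
# copies the master's file, so these later entries take precedence.
write_standby_conf() {
    {
        printf "wal_level = 'logical'\n"
        printf "hot_standby_feedback = on\n"
        printf "primary_slot_name = 'p1'\n"
        printf "port = 5555\n"   # assumed port for the standby
    } >> "$1/postgresql.conf"
}

run_repro() {
    # Master: initialize, configure, start, create the physical slot.
    initdb -D master
    write_master_conf master
    pg_ctl -D master -w start
    psql -d postgres -c "SELECT * FROM pg_create_physical_replication_slot('p1');"

    # Standby: base backup through slot p1, configure, start,
    # create the logical slot.
    pg_basebackup -D slave/ -R --slot p1 -v
    write_standby_conf slave
    pg_ctl -D slave -w start
    psql -p 5555 -d postgres -c "SELECT * FROM pg_create_logical_replication_slot('t', 'test_decoding');"

    # Generate WAL on the master, then decode it on the standby.
    pgbench -i -s 10 postgres
    psql -p 5555 -d postgres -c "SELECT * FROM pg_logical_slot_get_changes('t', NULL, NULL);"
}

# Only touch a real cluster when explicitly asked to.
if [ "${1:-}" = "run" ]; then
    run_repro
fi
```

Invoking the script with the single argument "run" from an empty working
directory performs the steps; without it, only the helpers are defined.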

postgres=# select * from pg_logical_slot_get_changes('t',null,null);
2019-03-13 20:34:50.274 IST [26817] LOG:  starting logical decoding for
slot "t"
2019-03-13 20:34:50.274 IST [26817] DETAIL:  Streaming transactions
committing after 0/6C000060, reading WAL from 0/6C000028.
2019-03-13 20:34:50.274 IST [26817] STATEMENT:  select * from
pg_logical_slot_get_changes('t',null,null);
2019-03-13 20:34:50.275 IST [26817] LOG:  logical decoding found
consistent point at 0/6C000028
2019-03-13 20:34:50.275 IST [26817] DETAIL:  There are no running
transactions.
2019-03-13 20:34:50.275 IST [26817] STATEMENT:  select * from
pg_logical_slot_get_changes('t',null,null);
TRAP: FailedAssertion("!(data == tupledata + tuplelen)", File:
"decode.c", Line: 977)
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: 2019-03-13
20:34:50.276 IST [26809] LOG:  server process (PID 26817) was terminated
by signal 6: Aborted

Stack trace -

(gdb) bt
#0  0x00007f370e673277 in raise () from /lib64/libc.so.6
#1  0x00007f370e674968 in abort () from /lib64/libc.so.6
#2  0x0000000000a30edf in ExceptionalCondition (conditionName=0xc36090
"!(data == tupledata + tuplelen)", errorType=0xc35f5c "FailedAssertion",
fileName=0xc35d70 "decode.c",
    lineNumber=977) at assert.c:54
#3  0x0000000000843c6f in DecodeMultiInsert (ctx=0x2ba1ac8,
buf=0x7ffd7a5136d0) at decode.c:977
#4  0x0000000000842b32 in DecodeHeap2Op (ctx=0x2ba1ac8,
buf=0x7ffd7a5136d0) at decode.c:375
#5  0x00000000008424dd in LogicalDecodingProcessRecord (ctx=0x2ba1ac8,
record=0x2ba1d88) at decode.c:125
#6  0x000000000084830d in pg_logical_slot_get_changes_guts
(fcinfo=0x2b95838, confirm=true, binary=false) at logicalfuncs.c:307
#7  0x000000000084846a in pg_logical_slot_get_changes (fcinfo=0x2b95838)
at logicalfuncs.c:376
#8  0x00000000006e5b9f in ExecMakeTableFunctionResult
(setexpr=0x2b93ee8, econtext=0x2b93d98, argContext=0x2b99940,
expectedDesc=0x2b97970, randomAccess=false) at execSRF.c:233
#9  0x00000000006fb738 in FunctionNext (node=0x2b93c80) at
nodeFunctionscan.c:94
#10 0x00000000006e52b1 in ExecScanFetch (node=0x2b93c80,
accessMtd=0x6fb67b <FunctionNext>, recheckMtd=0x6fba77
<FunctionRecheck>) at execScan.c:93
#11 0x00000000006e5326 in ExecScan (node=0x2b93c80, accessMtd=0x6fb67b
<FunctionNext>, recheckMtd=0x6fba77 <FunctionRecheck>) at execScan.c:143
#12 0x00000000006fbac1 in ExecFunctionScan (pstate=0x2b93c80) at
nodeFunctionscan.c:270
#13 0x00000000006e3293 in ExecProcNodeFirst (node=0x2b93c80) at
execProcnode.c:445
#14 0x00000000006d8253 in ExecProcNode (node=0x2b93c80) at
../../../src/include/executor/executor.h:241
#15 0x00000000006daa4e in ExecutePlan (estate=0x2b93a28,
planstate=0x2b93c80, use_parallel_mode=false, operation=CMD_SELECT,
sendTuples=true, numberTuples=0,
    direction=ForwardScanDirection, dest=0x2b907e0, execute_once=true)
at execMain.c:1643
#16 0x00000000006d8865 in standard_ExecutorRun (queryDesc=0x2afff28,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:362
#17 0x00000000006d869b in ExecutorRun (queryDesc=0x2afff28,
direction=ForwardScanDirection, count=0, execute_once=true) at
execMain.c:306
#18 0x00000000008ccef1 in PortalRunSelect (portal=0x2b36168,
forward=true, count=0, dest=0x2b907e0) at pquery.c:929
#19 0x00000000008ccb90 in PortalRun (portal=0x2b36168,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x2b907e0, altdest=0x2b907e0, completionTag=0x7ffd7a513e90 "")
    at pquery.c:770
#20 0x00000000008c6b58 in exec_simple_query (query_string=0x2adc1e8
"select * from pg_logical_slot_get_changes('t',null,null);") at
postgres.c:1215
#21 0x00000000008cae88 in PostgresMain (argc=1, argv=0x2b06590,
dbname=0x2b063d0 "postgres", username=0x2ad8da8 "centos") at postgres.c:4256
#22 0x0000000000828464 in BackendRun (port=0x2afe3b0) at postmaster.c:4399
#23 0x0000000000827c42 in BackendStartup (port=0x2afe3b0) at
postmaster.c:4090
#24 0x0000000000824036 in ServerLoop () at postmaster.c:1703
#25 0x00000000008238ec in PostmasterMain (argc=3, argv=0x2ad6d00) at
postmaster.c:1376
#26 0x0000000000748aab in main (argc=3, argv=0x2ad6d00) at main.c:228
(gdb)

regards,

On 03/07/2019 09:03 PM, tushar wrote:
> There is another issue, where I am getting an error while executing
> "pg_logical_slot_get_changes" on SLAVE
>
> Master (running on port=5432) - run "make installcheck" from the
> regress/ folder, after setting PATH=<installation>/bin:$PATH and
> export PGDATABASE=postgres
> Slave (running on port=5555) - connect to the regression database and
> select pg_logical_slot_get_changes
>
> [centos(at)mail-arts bin]$ ./psql postgres -p 5555 -f t.sql
> You are now connected to database "regression" as user "centos".
>  slot_name |    lsn
> -----------+-----------
>  m61       | 1/D437AD8
> (1 row)
>
> psql:t.sql:3: ERROR:  could not resolve cmin/cmax of catalog tuple
>
> [centos(at)mail-arts bin]$ cat t.sql
> \c regression
> SELECT * from   pg_create_logical_replication_slot('m61',
> 'test_decoding');
> select * from pg_logical_slot_get_changes('m61',null,null);
>
> regards,
>
> On 03/04/2019 10:57 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2019-03-04 16:54:32 +0530, tushar wrote:
>>> On 03/01/2019 11:16 PM, Andres Freund wrote:
>>>> So, if I understand correctly you do *not* have a physical replication
>>>> slot for this standby? For the feature to work reliably that needs to
>>>> exist, and you need to have hot_standby_feedback enabled. Does having
>>>> that fix the issue?
>>> Ok, This time around  - I performed like this -
>>>
>>> .)Master cluster (set wal_level=logical and hot_standby_feedback=on in
>>> postgresql.conf) , start the server and create a physical
>>> replication slot
>> Note that hot_standby_feedback=on needs to be set on a standby, not on
>> the primary (although it doesn't do any harm there).
>>
>> Thanks,
>>
>> Andres
>>
>

--
regards,tushar
EnterpriseDB https://www.enterprisedb.com/
The Enterprise PostgreSQL Company
