Re: logical decoding - reading a user catalog table

From: Steve Singer <steve(at)ssinger(dot)info>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical decoding - reading a user catalog table
Date: 2014-10-29 02:38:29
Message-ID: BLU437-SMTP22260383F3D4568DF9860DDC9C0@phx.gbl
Lists: pgsql-hackers

On 10/28/2014 01:31 PM, Andres Freund wrote:
> On 2014-10-25 18:18:07 -0400, Steve Singer wrote:
>> My logical decoding plugin is occasionally getting this error
>>
>> "could not resolve cmin/cmax of catalog tuple"
>>
>> I get this when my output plugin is trying to read one of the user defined
>> catalog tables (user_catalog_table=true)
> Hm. That should obviously not happen.
>
> Could you describe how that table is modified? Does that bug happen
> initially, or only after a while?

It doesn't happen right away; in this case it was maybe 4 minutes after
creating the slot.
The error also doesn't always happen when I run this test workload,
but it is reproducible with some trying.
I don't do anything special to that table: it gets created, then I do
inserts on it. I don't do an alter table or anything fancy like that.
I was running the slony failover test (all nodes under the same
postmaster) which involves the occasional dropping and recreating of
databases along with normal query load + replication.

I'll send you a tar of the data directory off-list with things in this state.

> Do you have a testcase that would allow me to easily reproduce the
> problem?

I don't have an isolated test case that does this. The test that I'm
hitting this with does lots of stuff and doesn't even always hit this.

>> I am not sure if this is a bug in the time-travel support in the logical
>> decoding support of if I'm just using it wrong (ie not getting a sufficient
>> lock on the relation or something).
> I don't know yet...
>
>> This is the interesting part of the stack trace
>>
>> #4 0x000000000091bbc8 in HeapTupleSatisfiesHistoricMVCC
>> (htup=0x7fffcf42a900,
>> snapshot=0x7f786ffe92d8, buffer=10568) at tqual.c:1631
>> #5 0x00000000004aedf3 in heapgetpage (scan=0x28d7080, page=0) at
>> heapam.c:399
>> #6 0x00000000004b0182 in heapgettup_pagemode (scan=0x28d7080,
>> dir=ForwardScanDirection, nkeys=0, key=0x0) at heapam.c:747
>> #7 0x00000000004b1ba6 in heap_getnext (scan=0x28d7080,
>> direction=ForwardScanDirection) at heapam.c:1475
>> #8 0x00007f787002dbfb in lookupSlonyInfo (tableOid=91754, ctx=0x2826118,
>> origin_id=0x7fffcf42ab8c, table_id=0x7fffcf42ab88,
>> set_id=0x7fffcf42ab84)
>> at slony_logical.c:663
>> #9 0x00007f787002b7a3 in pg_decode_change (ctx=0x2826118, txn=0x28cbec0,
>> relation=0x7f787a3446a8, change=0x7f786ffe3268) at slony_logical.c:237
>> #10 0x00000000007497d4 in change_cb_wrapper (cache=0x28cbda8, txn=0x28cbec0,
>> relation=0x7f787a3446a8, change=0x7f786ffe3268) at logical.c:704
>>
>>
>>
>> Here is what the code in lookupSlonyInfo is doing
>> ------------------
>>
>> sltable_oid = get_relname_relid("sl_table",slony_namespace);
>>
>> sltable_rel = relation_open(sltable_oid,AccessShareLock);
>> tupdesc=RelationGetDescr(sltable_rel);
>> scandesc=heap_beginscan(sltable_rel,
>> GetCatalogSnapshot(sltable_oid),0,NULL);
>> reloid_attnum = get_attnum(sltable_oid,"tab_reloid");
>>
>> if(reloid_attnum == InvalidAttrNumber)
>> elog(ERROR,"sl_table does not have a tab_reloid column");
>> set_attnum = get_attnum(sltable_oid,"tab_set");
>>
>> if(set_attnum == InvalidAttrNumber)
>> elog(ERROR,"sl_table does not have a tab_set column");
>> tableid_attnum = get_attnum(sltable_oid, "tab_id");
>>
>> if(tableid_attnum == InvalidAttrNumber)
>> elog(ERROR,"sl_table does not have a tab_id column");
>>
>> while( (tuple = heap_getnext(scandesc,ForwardScanDirection) ))
> (Except missing spaces ;)) I don't see anything obviously wrong with
> this.
>
> Greetings,
>
> Andres Freund
>
