From: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
---|---|
To: | "Little, Douglas" <DOUGLAS(dot)LITTLE(at)orbitz(dot)com> |
Cc: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Transaction ID overrun problem on greenplum |
Date: | 2012-05-18 17:01:06 |
Message-ID: | CAOR=d=3p6LzDbd=XDdow6YvOz1mmUis9wm+b55Qtcn+pqAOgAQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
Can you run db wide vacuums? That's what I'd try first.
On Fri, May 18, 2012 at 10:41 AM, Little, Douglas
<DOUGLAS(dot)LITTLE(at)orbitz(dot)com> wrote:
> Hello,
>
> Apologies in advance for asking here and not on the greenplum forum, but
> I’m asking here because I think the PG community understands pg better than
> Greenplum.
>
>
>
> We run Greenplum 4.1 which is based on PG 8.2.15
>
> We recently had a problem where the transaction ID overflowed an internal
> buffer, causing queries to hang and randomly fail.
>
> Greenplum provided a fix where they increased the size of the buffer.
>
> Greenplum has disabled the Autovacuum features. I suspect because they have
> no method of coordinating the autovacuums across the nodes.
>
>
>
> I’ve reviewed the PG doc on transaction id and
>
> I’m trying to understand if there’s something we are doing wrong.
>
> Practices concerning me
>
> 1. We never vacuum the entire db – we’re doing the system catalog
> tables 3x/day and user tables at least 1x/w- although we don’t have a method
> of identifying tables that have not been vacuumed so it’s possible we’re
> missing some table.
>
> 2. In the pg_clog directories we have 25 files dating back to aug
> 2011. When the instances start, the logs are filled with twophase location
> message. Are these expected?
>
> 3. Is the overflow really being caused by transaction wrap around?
>
>
>
> Would someone comment on these concerns.
>
> Thanks
>
>
>
>
>
> Guk setting
>
> vacuum_freeze_min_age 100,000,000 Minimum age at which
> VACUUM should freeze a table row.
>
>
>
>
>
> Log failure message
>
>
>
> 2012-05-10 16:26:24.425750
> CDT,"cognos_cube","p1gp1",p25497,th2121975200,"172.28.8.250","18959",2012-05-10
> 16:25:03
> CDT,100000245,con2170,,seg24,,,x100000245,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 99999948",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-10 16:26:24.739536
> CDT,"infa_mktldr","p1gp1",p27945,th2121975200,"172.28.8.250","27576",2012-05-10
> 16:26:15
> CDT,100000246,con2240,,seg24,,,x100000246,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 100000230",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-10 16:26:33.859255
> CDT,"infa_read","p1gp1",p27981,th2121975200,"172.28.8.250","27866",2012-05-10
> 16:26:18
> CDT,100000256,con2244,,seg24,,,x100000256,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 100000236",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-10 16:26:38.564872
> CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10
> 16:26:06
> CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"LOG","00000","
>
> Unexpected internal error: Segment process 27416 received signal SIGSEGV
>
>
>
> ",,,,,,,0,,,,
>
> 2012-05-10 16:26:41.359615
> CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10
> 16:26:06
> CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"PANIC","XX000","Unexpected
> internal error: Segment process received signal SIGSEGV
> (postgres.c:3384)",,,,,,,0,,"postgres.c",3384,"Stack trace:
>
> 1 0xa39555 postgres errstart (elog.c:454)
>
> 2 0x8e26e5 postgres <symbol not found> (postgres.c:3380)
>
> 3 0x38c70302d0 libc.so.6 <symbol not found> (??:0)
>
> 4 0x8b1578 postgres ProcArrayAdd (procarray.c:149)
>
> 5 0x5258a5 postgres EndPrepare (twophase.c:1240)
>
> 6 0x4f3462 postgres <symbol not found> (xact.c:3094)
>
> 7 0x4f3a75 postgres CommitTransactionCommand (xact.c:3523)
>
> 8 0xb944a5 postgres performDtxProtocolCommand (cdbtm.c:3891)
>
> 9 0x8e6552 postgres PostgresMain (postgres.c:1414)
>
> 10 0x8516b1 postgres <symbol not found> (postmaster.c:6443)
>
> 11 0x8589a5 postgres PostmasterMain (postmaster.c:2272)
>
> 12 0x76713a postgres main (main.c:212)
>
> 13 0x38c701d994 libc.so.6 __libc_start_main (??:0)
>
> 14 0x475089 postgres <symbol not found> (??:0)
>
> "
>
> 2012-05-10 16:26:46.580196
> CDT,,,p25872,th2121975200,,,,0,,,seg-1,,,,,"LOG","00000","server process
> (PID 27416) was terminated by signal 6:
> Aborted",,,,,,,0,,"postmaster.c",5275,
>
>
>
> Log messages regarding twophase
>
> 2012-05-18 00:05:49.849381
> CDT,"infa_write","p1gp1",p15767,th-1760766560,"172.28.8.250","33445",2012-05-18
> 00:05:29
> CDT,104342361,con385826,,seg0,,,x104342361,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342242",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-18 00:05:50.018299
> CDT,"infa_write","p1gp1",p15887,th-1760766560,"172.28.8.250","34675",2012-05-18
> 00:05:32
> CDT,104342362,con385833,,seg0,,,x104342362,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342257",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-18 00:05:50.029302
> CDT,"infa_write","p1gp1",p15839,th-1760766560,"172.28.8.250","34335",2012-05-18
> 00:05:30
> CDT,104342363,con385830,,seg0,,,x104342363,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342250",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-18 00:05:52.274591
> CDT,"infa_write","p1gp1",p15959,th-1760766560,"172.28.8.250","35303",2012-05-18
> 00:05:36
> CDT,104342369,con385840,,seg0,,,x104342369,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342277",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-18 00:05:52.277127
> CDT,"infa_write","p1gp1",p15478,th-1760766560,"172.28.8.250","31184",2012-05-18
> 00:05:24
> CDT,104342370,con385813,,seg0,,,x104342370,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342346",,,,,,,0,,"twophase.c",1368,
>
> 2012-05-18 00:05:52.280285
> CDT,"infa_write","p1gp1",p15535,th-1760766560,"172.28.8.250","31304",2012-05-18
> 00:05:25
> CDT,104342371,con385816,,seg0,,,x104342371,sx1,"LOG","00000","FinishPreparedTransaction:
> found TwoPhaseState entry for 104342347",,,,,,,0,,"twophase.c",1368,
>
>
>
> Thanks
>
> Doug Little
>
>
>
>
--
To understand recursion, one must first understand recursion.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Bruccoleri | 2012-05-18 21:31:49 | Problem with SSL certificate setup |
Previous Message | Little, Douglas | 2012-05-18 16:41:02 | Transaction ID overrun problem on greenplum |