Re: [HACKERS] memory problems in copying large table to STDOUT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martin Weinberg <weinberg(at)osprey(dot)astro(dot)umass(dot)edu>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] memory problems in copying large table to STDOUT
Date: 1999-10-11 01:33:00
Message-ID: 3580.939605580@sss.pgh.pa.us
Lists: pgsql-hackers

Martin Weinberg <weinberg(at)osprey(dot)astro(dot)umass(dot)edu> writes:
> I got the backtrace with "Memory exhausted in
> AllocSetAlloc()" this time.

> #4 0x8119732 in MemoryContextAlloc (context=0x81efbd8, size=875628846)
> at mcxt.c:230
> #5 0x810ebf1 in textout (vlena=0x4106182c) at varlena.c:190
> #6 0x808508c in CopyTo (rel=0x822af08, binary=0 '\000', oids=0 '\000',
> fp=0x0, delim=0x813c829 "\t") at copy.c:421

OK, that shoots down the "memory leak" theory. It sure looks like
what you've got is corrupt data: textout is reading a length word of
875628846 (plus or minus a couple bytes) from what is supposed to be
a text datum. Obviously that's not right. Next question is how
it got that way.
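
One quick check on a bogus length word is to look at its individual bytes: if they are printable ASCII, the field was probably overwritten with text rather than random bits. The Python sketch below does that. The helper name is made up for illustration. The little-endian byte order assumes an x86 machine, and the "+3" adjustment assumes textout asks for the length word minus a 4-byte header plus one byte for a terminating NUL; that fits "plus or minus a couple bytes" but should be checked against varlena.c.

```python
def length_word_bytes(value, byteorder="little"):
    """Return the 4 bytes of a 32-bit length word as they would sit in memory."""
    return value.to_bytes(4, byteorder)


requested = 875628846            # size passed to MemoryContextAlloc in frame #4
assumed_varsize = requested + 3  # ASSUMPTION: alloc size = length word - 4 + 1

for label, value in (("requested size", requested),
                     ("assumed length word", assumed_varsize)):
    raw = length_word_bytes(value)
    print(f"{label}: {value} = 0x{value:08x}, bytes in memory = {raw!r}")
```

Under those assumptions the bytes of the length word come out as printable characters (digits and a tab), which is worth keeping in mind when reading the hex dump of the block.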

I think it's pretty likely that the original cause is the kernel disk
driver or disk hardware flakiness that we already have evidence for.
However, I hate passing the buck like that, so I'm willing to continue
digging if you are.

> Again, I've saved the process in case checking any symbols
> would be helpful.

You should look at the source tuple location info in CopyTo ---
something like
    (gdb) f 6                 -- frame 6, ie, CopyTo
    (gdb) p i                 -- get column number
    (gdb) p *tuple            -- print contents of HeapTupleData
    (gdb) p *tuple->t_data    -- print contents of HeapTupleHeaderData

The last is mainly to find out the tuple's OID for possible future
reference. What we want right now is the tuple location info,
tuple->t_self, which will give us a block number (bi_hi and bi_lo in
that struct are the high and low 16 bits of the block number). Then,
if you can use dd and od to get a hex dump of that block from the
relation's data files, we can see what's really on disk there.
(Remember that the "blocks" are 8K each; also, if you get an offset
beyond 1 gig, then it's going to be in one of the continuation files
"psc.1", "psc.2", etc --- one gig apiece.)
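
The arithmetic for turning bi_hi and bi_lo into a file and a dd command can be sketched in Python as follows. This is only a sketch: the function name and the sample values are hypothetical, "one gig" is assumed to mean exactly 2**30 bytes per segment file, and the first segment is assumed to be the plain file "psc" with no suffix.

```python
BLOCK_SIZE = 8192                 # "blocks" are 8K each
SEGMENT_BYTES = 1 << 30           # ASSUMPTION: "one gig" means exactly 2**30 bytes
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def locate_block(bi_hi, bi_lo, relname="psc"):
    """Map the two 16-bit halves of a block number to a file and a dd command."""
    block = (bi_hi << 16) | bi_lo
    segment, block_in_segment = divmod(block, BLOCKS_PER_SEGMENT)
    # ASSUMPTION: segment 0 is the bare relation file, later ones get ".N"
    filename = relname if segment == 0 else f"{relname}.{segment}"
    command = (f"dd if={filename} bs={BLOCK_SIZE} "
               f"skip={block_in_segment} count=1 | od -x")
    return block, filename, block_in_segment, command


# Hypothetical values, not taken from the actual backtrace
block, filename, skip, command = locate_block(bi_hi=2, bi_lo=5)
print(block, filename, skip)
print(command)
```

Substitute the real bi_hi and bi_lo printed by gdb, and run the resulting dd command in the directory that holds the relation's data files.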

It would also be useful to look at the contents of the disk block as
sitting in memory in the backend, to see if they are the same as what
you read using dd; I would not be too surprised to find they are not.
The t_data pointer should be pointing into a disk buffer in Postgres'
shared memory block, but offhand I'm not sure what's the easiest way to
discover the starting address of that buffer using gdb. (Can any other
hackers lend a hand here?)
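
Once both copies of the block are available as raw bytes, however the in-memory copy is obtained, comparing them is mechanical. The Python sketch below reports the byte ranges where two equal-length dumps differ; the function name and the tiny sample buffers are made up for illustration and merely stand in for two 8K dumps.

```python
def differing_ranges(on_disk, in_memory):
    """Return (start, end) byte ranges, end exclusive, where the two dumps differ."""
    if len(on_disk) != len(in_memory):
        raise ValueError("dumps must be the same length")
    ranges = []
    start = None
    for offset, (a, b) in enumerate(zip(on_disk, in_memory)):
        if a != b and start is None:
            start = offset                  # a run of differing bytes begins
        elif a == b and start is not None:
            ranges.append((start, offset))  # the run ended just before offset
            start = None
    if start is not None:
        ranges.append((start, len(on_disk)))
    return ranges


# Tiny made-up buffers standing in for two 8K dumps
disk = bytes([0, 1, 2, 3, 4, 5, 6, 7])
memory = bytes([0, 1, 9, 9, 4, 5, 6, 8])
print(differing_ranges(disk, memory))
```

An empty list means the two copies match; any reported range points at the offsets within the block to inspect first.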

regards, tom lane
