Re: Vacuum: allow usage of more than 1GB of work mem

From: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-12-27 13:41:45
Message-ID: d7649878-8d73-20eb-dc60-c26ac4e495d1@postgrespro.ru
Lists: pgsql-hackers

23.12.2016 22:54, Claudio Freire:
> On Fri, Dec 23, 2016 at 1:39 PM, Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>> I found the reason. I configure postgres with CFLAGS="-O0" and it causes
>> Segfault on initdb.
>> It works fine and passes tests with default configure flags, but I'm pretty
>> sure that we should fix segfault before testing the feature.
>> If you need it, I'll send a core dump.
> I just ran it with CFLAGS="-O0" and it passes all checks too:
>
> CFLAGS='-O0' ./configure --enable-debug --enable-cassert
> make clean && make -j8 && make check-world
>
> A stacktrace and a thorough description of your build environment
> would be helpful to understand why it breaks on your system.

I ran configure with the following set of flags:
./configure --enable-tap-tests --enable-cassert --enable-debug
--enable-depend CFLAGS="-O0 -g3 -fno-omit-frame-pointer"
Then I ran make check. Here is the stack trace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
1417 tblk =
ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
(gdb) bt
#0 0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
#1 0x0000000000693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9,
vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')
at vacuumlazy.c:1337
#2 0x0000000000691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9,
params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
#3 0x000000000069191f in vacuum_rel (relid=1247, relation=0x0,
options=9, params=0x7ffe0f866310) at vacuum.c:1418
#4 0x0000000000690122 in vacuum (options=9, relation=0x0, relid=0,
params=0x7ffe0f866310, va_cols=0x0, bstrategy=0x1f1c4a8,
isTopLevel=1 '\001') at vacuum.c:320
#5 0x000000000068fd0b in vacuum (options=-1652367447, relation=0x0,
relid=3324614038, params=0x1f11bf0, va_cols=0xb59f63,
bstrategy=0x1f1c620, isTopLevel=0 '\000') at vacuum.c:150
#6 0x0000000000852993 in standard_ProcessUtility (parsetree=0x1f07e60,
queryString=0x1f07468 "VACUUM FREEZE;\n",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0xea5cc0
<debugtupDR>, completionTag=0x7ffe0f866750 "") at utility.c:669
#7 0x00000000008520da in standard_ProcessUtility
(parsetree=0x401ef6cd8, queryString=0x18 <error: Cannot access memory at
address 0x18>,
context=PROCESS_UTILITY_TOPLEVEL, params=0x68, dest=0x9e5d62
<AllocSetFree+60>, completionTag=0x7ffe0f8663f0 "`~\360\001")
at utility.c:360
#8 0x0000000000851161 in PortalRunMulti (portal=0x7ffe0f866750,
isTopLevel=0 '\000', setHoldSnapshot=-39 '\331',
dest=0x851161 <PortalRunMulti+19>, altdest=0x7ffe0f8664f0,
completionTag=0x1f07e60 "\341\002") at pquery.c:1219
#9 0x0000000000851374 in PortalRunMulti (portal=0x1f0a488, isTopLevel=1
'\001', setHoldSnapshot=0 '\000', dest=0xea5cc0 <debugtupDR>,
altdest=0xea5cc0 <debugtupDR>, completionTag=0x7ffe0f866750 "") at
pquery.c:1345
#10 0x0000000000850889 in PortalRun (portal=0x1f0a488,
count=9223372036854775807, isTopLevel=1 '\001', dest=0xea5cc0 <debugtupDR>,
altdest=0xea5cc0 <debugtupDR>, completionTag=0x7ffe0f866750 "") at
pquery.c:824
#11 0x000000000084a4dc in exec_simple_query (query_string=0x1f07468
"VACUUM FREEZE;\n") at postgres.c:1113
#12 0x000000000084e960 in PostgresMain (argc=10, argv=0x1e60a50,
dbname=0x1e823b0 "template1", username=0x1e672a0 "anastasia")
at postgres.c:4091
#13 0x00000000006f967e in init_locale (categoryname=0x100000000000000
<error: Cannot access memory at address 0x100000000000000>,
category=32766, locale=0xa004692f0 <error: Cannot access memory at
address 0xa004692f0>) at main.c:310
#14 0x00007f1e5f463830 in __libc_start_main (main=0x6f93e1 <main+85>,
argc=10, argv=0x7ffe0f866a78, init=<optimized out>,
fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffe0f866a68) at ../csu/libc-start.c:291
#15 0x0000000000469319 in _start ()

The core file is quite big, so I didn't attach it to this mail. You can
download it here: core dump file
<https://drive.google.com/open?id=0B-7gUWL5Lg_gX3VlSXBaZzlKTlk>.

Here are some notes about the first patch:

1. prefetchBlkno = blkno & ~0x1f;
prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;

I don't understand why we need these tricks. How does it differ from:
prefetchBlkno = (blkno > 32) ? blkno - 32 : 0;

2. Why do we decrease prefetchBlkno twice?

Here:
+ prefetchBlkno = (prefetchBlkno > 32) ? prefetchBlkno - 32 : 0;
And here:
if (prefetchBlkno >= 32)
+ prefetchBlkno -= 32;

I'll review the second patch in a few days and send my questions about it.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
