Re: Vacuum: allow usage of more than 1GB of work mem

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Cc: Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2016-12-28 18:41:20
Message-ID: CAGTBQpbev0AjqZE9qw8=kvFYH3-6urkDBPkPjBqRjra9-ASiug@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 28, 2016 at 10:26 AM, Anastasia Lubennikova
<a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
> 27.12.2016 20:14, Claudio Freire:
>
>> On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova
>> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>>
>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>> #0 0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
>>> vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
>>> 1417 tblk =
>>> ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
>>> (gdb) bt
>>> #0 0x00000000006941e7 in lazy_vacuum_heap (onerel=0x1ec2360,
>>> vacrelstats=0x1ef6e00) at vacuumlazy.c:1417
>>> #1 0x0000000000693dfe in lazy_scan_heap (onerel=0x1ec2360, options=9,
>>> vacrelstats=0x1ef6e00, Irel=0x1ef7168, nindexes=2, aggressive=1 '\001')
>>> at vacuumlazy.c:1337
>>> #2 0x0000000000691e66 in lazy_vacuum_rel (onerel=0x1ec2360, options=9,
>>> params=0x7ffe0f866310, bstrategy=0x1f1c4a8) at vacuumlazy.c:290
>>> #3 0x000000000069191f in vacuum_rel (relid=1247, relation=0x0, options=9,
>>> params=0x7ffe0f866310) at vacuum.c:1418
>>
>> Those line numbers don't match my code.
>>
>> Which commit are you based on?
>>
>> My tree is (currently) based on 71f996d22125eb6cfdbee6094f44370aa8dec610
>
> Hm, my branch is based on 71f996d22125eb6cfdbee6094f44370aa8dec610 as well.
> I merely applied patches
> 0001-Vacuum-prefetch-buffers-on-backward-scan-v3.patch
> and 0002-Vacuum-allow-using-more-than-1GB-work-mem-v3.patch
> then ran configure and make as usual.
> Am I doing something wrong?

Doesn't sound wrong. Maybe my tree is the unclean one. I'll have to
do a clean checkout to verify.

> Anyway, I found the problem that had caused segfault.
>
>     for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex = 0, segindex++)
>     {
>         DeadTuplesSegment *seg = &(vacrelstats->dead_tuples.dead_tuples[segindex]);
>         int num_dead_tuples = seg->num_dead_tuples;
>
>         while (tupindex < num_dead_tuples)
>         ...
>
> You rely on the value of tupindex here, but during the very first pass
> the 'tupindex' variable may contain any garbage. And it happened that on
> my system it held a negative value, as I found by inspecting the core dump:
>
> (gdb) info locals
> num_dead_tuples = 5
> tottuples = 0
> tupindex = -1819017215
>
> Which leads to a failure in the next line:
> tblk = ItemPointerGetBlockNumber(&seg->dead_tuples[tupindex]);
>
> The solution is to move this assignment inside the loop.

Good catch. I read that line suspecting that very same thing but
somehow I was blind to it.
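
To make the fix concrete, here's a minimal, self-contained sketch of the
intended loop shape (the structs are simplified stand-ins for illustration,
not the actual vacuumlazy.c definitions): tupindex gets reset inside the
segment loop, so it can never be read uninitialized on the first pass.

/*
 * Illustrative sketch only; simplified stand-in structures.
 */
#include <stdio.h>

typedef struct
{
    int num_dead_tuples;
    int dead_tuples[8];     /* stand-in for the ItemPointerData array */
} DeadTuplesSegment;

typedef struct
{
    int last_seg;
    DeadTuplesSegment dead_tuples[2];
} DeadTuplesMultiArray;

int
main(void)
{
    DeadTuplesMultiArray dt = {
        .last_seg = 1,
        .dead_tuples = {
            {.num_dead_tuples = 3, .dead_tuples = {10, 11, 12}},
            {.num_dead_tuples = 2, .dead_tuples = {20, 21}}
        }
    };
    int segindex;
    int tupindex;

    for (segindex = 0; segindex <= dt.last_seg; segindex++)
    {
        DeadTuplesSegment *seg = &dt.dead_tuples[segindex];
        int num_dead_tuples = seg->num_dead_tuples;

        /*
         * Reset tupindex here, inside the loop body, rather than relying on
         * the comma expression in the for-increment, which never runs before
         * the first iteration.
         */
        for (tupindex = 0; tupindex < num_dead_tuples; tupindex++)
            printf("segment %d, tuple %d -> %d\n",
                   segindex, tupindex, seg->dead_tuples[tupindex]);
    }
    return 0;
}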

> I've read the second patch.
>
> 1. What is the reason to inline vac_cmp_itemptr()?

Performance, mostly. With inlining, the compiler can apply transformations
that wouldn't be possible otherwise, and that matters performance-wise
during the binary search.
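
To illustrate the idea (a sketch with stand-in types, not the patch code):
an inline comparator lets the compiler fold the comparison into a
hand-written binary search loop, which it can't do when the comparison goes
through a function pointer, as with bsearch().

#include <stdint.h>

/* Stand-in for ItemPointerData; illustrative only. */
typedef struct
{
    uint32_t block;
    uint16_t offset;
} ItemPtr;

static inline int
itemptr_cmp(const ItemPtr *a, const ItemPtr *b)
{
    if (a->block != b->block)
        return (a->block < b->block) ? -1 : 1;
    if (a->offset != b->offset)
        return (a->offset < b->offset) ? -1 : 1;
    return 0;
}

/* Hand-rolled binary search; the itemptr_cmp call can be fully inlined. */
static int
itemptr_bsearch(const ItemPtr *key, const ItemPtr *arr, int n)
{
    int lo = 0;
    int hi = n - 1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int c = itemptr_cmp(key, &arr[mid]);

        if (c == 0)
            return mid;
        if (c < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;              /* not found */
}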

> 2. +#define LAZY_MIN_TUPLES Max(MaxHeapTuplesPerPage, (128<<20) / sizeof(ItemPointerData))
> What does 128<<20 mean? Why not 1<<27? I'd ask you to replace it with a
> named constant, or at least add a comment.

I thought it was more readable like that: since 1<<20 is known to be
"MB", it reads as "128 MB".

I guess I can add a comment saying so.
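
Something along these lines, perhaps (illustrative wording, not the final
comment):

/*
 * 128<<20 is 128 MB: the constant is the number of ItemPointerData
 * entries that fit in 128 MB, never less than one heap page's worth
 * of tuples.
 */
#define LAZY_MIN_TUPLES Max(MaxHeapTuplesPerPage, (128 << 20) / sizeof(ItemPointerData))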

> I'll share my results of performance testing it in a few days.

Thanks
