Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)
Date: 2013-01-13 19:43:50
Message-ID: 20130113194350.GF26173@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-01-13 14:17:52 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > the numbers are:
> > old definition: 10393.658ms, 5497912 bytes
> > old definition + unreachable: 10011.102ms, 5469144 bytes
> > stmt, two calls, unreachable: 10036.132ms, 5468792 bytes
> > stmt, one call, unreachable: 9443.612ms, 5462232 bytes
> > stmt, one call, unreachable, save errno: 9615.863ms, 5489688 bytes
>
> I find these numbers pretty hard to credit. Why should replacing two
> calls by one, in code paths that are not being taken, move the runtime
> so much? The argument that a net reduction of code size is a win
> doesn't work, because the last case is more code than any except the
> first.

There are quite some elog(DEBUG*)s in the backend, and those are taken,
so I don't find it unreasonable that doing less work in those cases is
measurable.

And if you look at the disassembly of ERROR codepaths:

saving errno, one function:
Dump of assembler code for function palloc:
639 {
0x0000000000736397 <+7>: mov %rdi,%rsi
0x00000000007363b0 <+32>: push %rbp
0x00000000007363b1 <+33>: mov %rsp,%rbp
0x00000000007363b4 <+36>: sub $0x20,%rsp

640 /* duplicates MemoryContextAlloc to
avoid increased overhead */
641 AssertArg(MemoryContextIsValid(CurrentMemoryContext));
642
643 if (!AllocSizeIsValid(size))
0x0000000000736390 <+0>: cmp $0x3fffffff,%rdi
0x000000000073639a <+10>: ja 0x7363b0 <palloc+32>

644 elog(ERROR, "invalid memory alloc
request size %lu",
0x00000000007363b8 <+40>: mov %rdi,-0x8(%rbp)
0x00000000007363bc <+44>: callq 0x462010 <__errno_location(at)plt>
0x00000000007363c1 <+49>: mov -0x8(%rbp),%rsi
0x00000000007363c5 <+53>: mov $0x8ace30,%r9d
0x00000000007363cb <+59>: mov $0x14,%ecx
0x00000000007363d0 <+64>: mov $0x8acefe,%edx
0x00000000007363d5 <+69>: mov $0x8ace58,%edi
0x00000000007363da <+74>: mov %rsi,(%rsp)
0x00000000007363de <+78>: mov (%rax),%r8d
0x00000000007363e1 <+81>: mov $0x285,%esi
0x00000000007363e6 <+86>: xor %eax,%eax
0x00000000007363e8 <+88>: callq 0x71a0b0 <elog_all>
0x00000000007363ed: nopl (%rax)
645 (unsigned long) size);
646
647 CurrentMemoryContext->isReset = false;
0x000000000073639c <+12>: mov 0x25c6e5(%rip),%rdi
# 0x992a88 <CurrentMemoryContext>
0x00000000007363a7 <+23>: movb $0x0,0x30(%rdi)

648
649 return (*CurrentMemoryContext->methods->alloc)
(CurrentMemoryContext, size);
0x00000000007363a3 <+19>: mov 0x8(%rdi),%rax
0x00000000007363ab <+27>: mov (%rax),%rax
0x00000000007363ae <+30>: jmpq *%rax

End of assembler dump.

vs

Dump of assembler code for function palloc:
639 {
0x00000000007382b0 <+0>: push %rbp
0x00000000007382b1 <+1>: mov %rsp,%rbp
0x00000000007382b4 <+4>: push %rbx
0x00000000007382b5 <+5>: mov %rdi,%rbx
0x00000000007382b8 <+8>: sub $0x8,%rsp

640 /* duplicates MemoryContextAlloc to
avoid increased overhead */
641 AssertArg(MemoryContextIsValid(CurrentMemoryContext));
642
643 if (!AllocSizeIsValid(size))
0x00000000007382bc <+12>: cmp $0x3fffffff,%rdi
0x00000000007382c3 <+19>: jbe 0x7382ed <palloc+61>

644 elog(ERROR, "invalid memory alloc
request size %lu",
0x00000000007382c5 <+21>: mov $0x8aec9e,%edx
0x00000000007382ca <+26>: mov $0x284,%esi
0x00000000007382cf <+31>: mov $0x8aebd0,%edi
0x00000000007382d4 <+36>: callq 0x71bcb0 <elog_start>
0x00000000007382d9 <+41>: mov %rbx,%rdx
0x00000000007382dc <+44>: mov $0x8aec10,%esi
0x00000000007382e1 <+49>: mov $0x14,%edi
0x00000000007382e6 <+54>: xor %eax,%eax
0x00000000007382e8 <+56>: callq 0x71bac0 <elog_finish>

645 (unsigned long) size);
646
647 CurrentMemoryContext->isReset = false;
0x00000000007382ed <+61>: mov 0x25bc54(%rip),%rdi
# 0x993f48 <CurrentMemoryContext>
0x00000000007382fb <+75>: movb $0x0,0x30(%rdi)

648
649 return (*CurrentMemoryContext->methods->alloc)
(CurrentMemoryContext, size);
0x00000000007382f4 <+68>: mov %rbx,%rsi
0x00000000007382f7 <+71>: mov 0x8(%rdi),%rax
0x00000000007382ff <+79>: mov (%rax),%rax
0x0000000000738308 <+88>: jmpq *%rax
0x000000000073830a: nopw 0x0(%rax,%rax,1)

650 }
0x0000000000738302 <+82>: add $0x8,%rsp
0x0000000000738306 <+86>: pop %rbx
0x0000000000738307 <+87>: pop %rbp
End of assembler dump.

If other critical paths look like this its not that surprising.

> I think you're measuring some coincidental effect or other, not a
> reproducible performance improvement. Or there's a bug in the code
> you're using.

Might be, I repeated all of it though.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2013-01-13 20:04:31 Re: Re: logical changeset generation v3 - comparison to Postgres-R change set format
Previous Message Tom Lane 2013-01-13 19:17:52 Re: Re: [PATCH] unified frontend support for pg_malloc et al and palloc/pfree mulation (was xlogreader-v4)