Re: Optimize memory allocation code

From: Li Japin <japinli(at)hotmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize memory allocation code
Date: 2020-09-30 03:42:48
Message-ID: C3945137-31C2-4535-B0C7-4B6B4CEF7FF7@hotmail.com
Lists: pgsql-hackers

On Sep 29, 2020, at 9:30 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> On 2020-Sep-26, Li Japin wrote:
>
>> Thanks! How big is this overhead? Is there any way I can test it?
>
> You could also have a look at the assembly code that your compiler
> generates -- particularly examine how it changes.

Thanks for your advice!

The original assembly code for palloc0 is:

0000000000517690 <palloc0>:
517690: 55 push %rbp
517691: 53 push %rbx
517692: 48 89 fb mov %rdi,%rbx
517695: 48 83 ec 08 sub $0x8,%rsp
517699: 48 81 ff ff ff ff 3f cmp $0x3fffffff,%rdi
5176a0: 48 8b 2d d9 0c 48 00 mov 0x480cd9(%rip),%rbp # 998380 <CurrentMemoryContext>
5176a7: 0f 87 d5 00 00 00 ja 517782 <palloc0+0xf2>
5176ad: 48 8b 45 10 mov 0x10(%rbp),%rax
5176b1: 48 89 fe mov %rdi,%rsi
5176b4: c6 45 04 00 movb $0x0,0x4(%rbp)
5176b8: 48 89 ef mov %rbp,%rdi
5176bb: ff 10 callq *(%rax)
5176bd: 48 85 c0 test %rax,%rax
5176c0: 48 89 c1 mov %rax,%rcx
5176c3: 74 5b je 517720 <palloc0+0x90>
5176c5: f6 c3 07 test $0x7,%bl
5176c8: 75 36 jne 517700 <palloc0+0x70>
5176ca: 48 81 fb 00 04 00 00 cmp $0x400,%rbx
5176d1: 77 2d ja 517700 <palloc0+0x70>
5176d3: 48 01 c3 add %rax,%rbx
5176d6: 48 39 d8 cmp %rbx,%rax
5176d9: 73 35 jae 517710 <palloc0+0x80>
5176db: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
5176e0: 48 83 c0 08 add $0x8,%rax
5176e4: 48 c7 40 f8 00 00 00 movq $0x0,-0x8(%rax)
5176eb: 00
5176ec: 48 39 c3 cmp %rax,%rbx
5176ef: 77 ef ja 5176e0 <palloc0+0x50>
5176f1: 48 83 c4 08 add $0x8,%rsp
5176f5: 48 89 c8 mov %rcx,%rax
5176f8: 5b pop %rbx
5176f9: 5d pop %rbp
5176fa: c3 retq
5176fb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
517700: 48 89 cf mov %rcx,%rdi
517703: 48 89 da mov %rbx,%rdx
517706: 31 f6 xor %esi,%esi
517708: e8 e3 0e ba ff callq b85f0 <memset@plt>
51770d: 48 89 c1 mov %rax,%rcx
517710: 48 83 c4 08 add $0x8,%rsp
517714: 48 89 c8 mov %rcx,%rax
517717: 5b pop %rbx
517718: 5d pop %rbp
517719: c3 retq
51771a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
517720: 48 8b 3d 51 0c 48 00 mov 0x480c51(%rip),%rdi # 998378 <TopMemoryContext>
517727: be 64 00 00 00 mov $0x64,%esi
51772c: e8 1f f9 ff ff callq 517050 <MemoryContextStatsDetail>
517731: 31 f6 xor %esi,%esi
517733: bf 14 00 00 00 mov $0x14,%edi
517738: e8 53 6d fd ff callq 4ee490 <errstart>
51773d: bf c5 20 00 00 mov $0x20c5,%edi
517742: e8 99 9b fd ff callq 4f12e0 <errcode>
517747: 48 8d 3d 07 54 03 00 lea 0x35407(%rip),%rdi # 54cb55 <__func__.7554+0x45>
51774e: 31 c0 xor %eax,%eax
517750: e8 ab 9d fd ff callq 4f1500 <errmsg>
517755: 48 8b 55 38 mov 0x38(%rbp),%rdx
517759: 48 8d 3d 80 11 16 00 lea 0x161180(%rip),%rdi # 6788e0 <__func__.6248+0x150>
517760: 48 89 de mov %rbx,%rsi
517763: 31 c0 xor %eax,%eax
517765: e8 56 a2 fd ff callq 4f19c0 <errdetail>
51776a: 48 8d 15 ff 11 16 00 lea 0x1611ff(%rip),%rdx # 678970 <__func__.7326>
517771: 48 8d 3d 20 11 16 00 lea 0x161120(%rip),%rdi # 678898 <__func__.6248+0x108>
517778: be eb 03 00 00 mov $0x3eb,%esi
51777d: e8 0e 95 fd ff callq 4f0c90 <errfinish>
517782: 31 f6 xor %esi,%esi
517784: bf 14 00 00 00 mov $0x14,%edi
517789: e8 02 6d fd ff callq 4ee490 <errstart>
51778e: 48 8d 3d db 10 16 00 lea 0x1610db(%rip),%rdi # 678870 <__func__.6248+0xe0>
517795: 48 89 de mov %rbx,%rsi
517798: 31 c0 xor %eax,%eax
51779a: e8 91 98 fd ff callq 4f1030 <errmsg_internal>
51779f: 48 8d 15 ca 11 16 00 lea 0x1611ca(%rip),%rdx # 678970 <__func__.7326>
5177a6: 48 8d 3d eb 10 16 00 lea 0x1610eb(%rip),%rdi # 678898 <__func__.6248+0x108>
5177ad: be df 03 00 00 mov $0x3df,%esi
5177b2: e8 d9 94 fd ff callq 4f0c90 <errfinish>
5177b7: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
5177be: 00 00
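
Reading the listing above, the `test $0x7,%bl`, `cmp $0x400,%rbx` and the 8-byte store loop look like a word-wise zeroing loop with a memset fallback. A minimal C sketch of that shape follows; `zero_aligned` and `ZERO_LOOP_LIMIT` are hypothetical names chosen for illustration, not the actual PostgreSQL code, and the sketch assumes the caller passes an 8-byte-aligned pointer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold; mirrors the cmp $0x400 (1024) in the listing. */
#define ZERO_LOOP_LIMIT 1024

/*
 * Zero `size` bytes at `ptr`. The caller guarantees 8-byte alignment.
 * Sizes that are a multiple of 8 and at most ZERO_LOOP_LIMIT use a store
 * loop; everything else falls back to memset.
 */
static void *
zero_aligned(void *ptr, size_t size)
{
    /* Documents the alignment assumption; not a check the listing performs. */
    assert(((uintptr_t) ptr & 7) == 0);

    if ((size & 7) == 0 && size <= ZERO_LOOP_LIMIT)
    {
        uint64_t *p = (uint64_t *) ptr;
        uint64_t *end = (uint64_t *) ((char *) ptr + size);

        /* One 8-byte store per iteration, like the movq $0x0 loop. */
        while (p < end)
            *p++ = 0;
    }
    else
        memset(ptr, 0, size);

    return ptr;
}
```

Both listings contain this same zeroing sequence, so the difference between them lies in how the memory is obtained, not in how it is cleared.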

After the modification, the assembly code for palloc0 is:

0000000000517690 <palloc0>:
517690: 53 push %rbx
517691: 48 89 fb mov %rdi,%rbx
517694: e8 17 ff ff ff callq 5175b0 <palloc>
517699: f6 c3 07 test $0x7,%bl
51769c: 48 89 c1 mov %rax,%rcx
51769f: 75 2f jne 5176d0 <palloc0+0x40>
5176a1: 48 81 fb 00 04 00 00 cmp $0x400,%rbx
5176a8: 77 26 ja 5176d0 <palloc0+0x40>
5176aa: 48 01 c3 add %rax,%rbx
5176ad: 48 39 d8 cmp %rbx,%rax
5176b0: 73 2e jae 5176e0 <palloc0+0x50>
5176b2: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
5176b8: 48 83 c0 08 add $0x8,%rax
5176bc: 48 c7 40 f8 00 00 00 movq $0x0,-0x8(%rax)
5176c3: 00
5176c4: 48 39 c3 cmp %rax,%rbx
5176c7: 77 ef ja 5176b8 <palloc0+0x28>
5176c9: 48 89 c8 mov %rcx,%rax
5176cc: 5b pop %rbx
5176cd: c3 retq
5176ce: 66 90 xchg %ax,%ax
5176d0: 48 89 cf mov %rcx,%rdi
5176d3: 48 89 da mov %rbx,%rdx
5176d6: 31 f6 xor %esi,%esi
5176d8: e8 13 0f ba ff callq b85f0 <memset@plt>
5176dd: 48 89 c1 mov %rax,%rcx
5176e0: 48 89 c8 mov %rcx,%rax
5176e3: 5b pop %rbx
5176e4: c3 retq
5176e5: 90 nop
5176e6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
5176ed: 00 00 00

Now I understand why we need the duplicated code in palloc0.
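
One reading of the two listings: the original palloc0 repeats the allocation steps itself and reaches the context's allocation function through a single indirect call (`callq *(%rax)`), while the modified version puts a direct `callq 5175b0 <palloc>` in front of that. The sketch below shows the two shapes in C with a toy context type; `ToyContext`, `toy_alloc` and the two `toy_alloc0_*` functions are hypothetical stand-ins, not PostgreSQL's real definitions, and error reporting is reduced to an assert:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a memory context: one allocation callback and a counter. */
typedef struct ToyContext
{
    void   *(*alloc) (struct ToyContext *cxt, size_t size);
    size_t  nallocs;
} ToyContext;

static void *
toy_malloc_alloc(ToyContext *cxt, size_t size)
{
    cxt->nallocs++;
    return malloc(size);
}

static ToyContext toy_current = {toy_malloc_alloc, 0};

/* Plain allocation: dispatch through the context callback and check it. */
void *
toy_alloc(size_t size)
{
    void *ret = toy_current.alloc(&toy_current, size);

    assert(ret != NULL);        /* real code would report out-of-memory */
    return ret;
}

/* Variant A: repeats the allocation steps inline, like the first listing. */
void *
toy_alloc0_duplicated(size_t size)
{
    void *ret = toy_current.alloc(&toy_current, size);

    assert(ret != NULL);
    memset(ret, 0, size);
    return ret;
}

/* Variant B: reuses toy_alloc, like the second listing. */
void *
toy_alloc0_via_alloc(size_t size)
{
    void *ret = toy_alloc(size);

    memset(ret, 0, size);
    return ret;
}
```

Both variants return the same zeroed memory. Variant B is shorter to write but can cost one more function call per allocation when the compiler does not inline `toy_alloc`, which is what the second listing shows for palloc.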

--
Best regards
Japin Li
