RE: [Bus error] huge_pages default value (try) not fall back

From: Fan Liu <fan(dot)liu(at)ericsson(dot)com>
To: Odin Ugedal <odin(at)ugedal(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: RE: [Bus error] huge_pages default value (try) not fall back
Date: 2020-06-10 01:28:57
Message-ID: VI1PR0702MB372637DB70D9AAB11124891A9E830@VI1PR0702MB3726.eurprd07.prod.outlook.com
Lists: pgsql-bugs

Thank you so much for the information.

BRs,
Fan Liu
ADP Document Database PG

>>-----Original Message-----
>>From: Odin Ugedal <odin(at)ugedal(dot)com>
>>Sent: 9 June 2020 23:23
>>To: Fan Liu <fan(dot)liu(at)ericsson(dot)com>
>>Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>; PostgreSQL mailing lists
>><pgsql-bugs(at)lists(dot)postgresql(dot)org>
>>Subject: Re: [Bus error] huge_pages default value (try) not fall back
>>
>>Hi,
>>
>>I stumbled upon this issue while working on the related Kubernetes issue
>>that was referenced a few mails back. From what I understand, this issue
>>is (or may be) a result of how the hugetlb cgroup enforces the
>>"limit_in_bytes" limit for huge pages. Under normal circumstances a process
>>should not crash like this when using memory obtained from a successful
>>mmap. The value set in "limit_in_bytes" is only enforced during page
>>allocation, and _not_ when mapping pages with mmap. As a result, an mmap of
>>n huge pages succeeds as long as the system has n free huge pages, even if
>>that size exceeds "limit_in_bytes". The process then has the huge page
>>memory reserved, making it inaccessible to other processes.
>>
>>The real issue arises when postgres tries to write to the memory it received
>>from mmap and the kernel tries to allocate the reserved huge pages. Since the
>>cgroup does not allow the allocation, the process is killed (with the bus
>>error seen in this thread).
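>>
>>For illustration, here is a minimal standalone sketch (not taken from the
>>PostgreSQL sources) of the behaviour described above: the mmap call succeeds
>>as long as free huge pages exist on the system, while the hugetlb cgroup
>>limit is only checked when the pages are actually faulted in by the write.
>>
>>    /* sketch: mmap of huge pages succeeds; faulting them in is what the
>>     * hugetlb cgroup limit_in_bytes actually restricts */
>>    #define _GNU_SOURCE
>>    #include <stdio.h>
>>    #include <string.h>
>>    #include <sys/mman.h>
>>
>>    #define HUGE_PAGE_SIZE (2UL * 1024 * 1024)  /* assumes 2 MB huge pages */
>>    #define NUM_PAGES      16                   /* assumed to exceed the cgroup limit */
>>
>>    int main(void)
>>    {
>>        size_t len = NUM_PAGES * HUGE_PAGE_SIZE;
>>
>>        /* Succeeds even if len is larger than the cgroup's limit_in_bytes,
>>         * provided the system has NUM_PAGES free huge pages to reserve. */
>>        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
>>                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
>>        if (p == MAP_FAILED) {
>>            perror("mmap");
>>            return 1;
>>        }
>>
>>        /* The cgroup limit is enforced here, at fault time; past the limit
>>         * the process is killed with SIGBUS (a bus error). */
>>        memset(p, 0, len);
>>
>>        puts("all pages touched");
>>        munmap(p, len);
>>        return 0;
>>    }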
>>
>>This issue has been fixed in Linux by this patch:
>>https://lkml.org/lkml/2020/2/3/1153, which adds a new control to the hugetlb
>>cgroup that addresses this behaviour. There are, however, no container
>>runtimes that use it yet, and only 5.7+ kernels support it (afaik.);
>>progress can be tracked here:
>>https://github.com/opencontainers/runtime-spec/issues/1050. The fix for the
>>upstream Kubernetes issue
>>(https://github.com/opencontainers/runtime-spec/issues/1050), which made
>>kubernetes set the wrong value for the top-level "limit_in_bytes" when the
>>pre-allocated huge page count increased after kubernetes (kubelet) startup,
>>will hopefully land in Kubernetes 1.19 (or 1.20). Fingers crossed!
>>
>>Hopefully this makes some sense, and gives some insights into the issue...
>>
>>Best regards,
>>Odin Ugedal
