Re: PSA: New intel MDS vulnerability mitigations cause measurable slowdown

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PSA: New intel MDS vulnerability mitigations cause measurable slowdown
Date: 2019-05-15 01:13:10
Message-ID: 20190515011310.vv6m2ek647imqo3k@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-05-14 15:30:52 -0700, Andres Freund wrote:
> There's a new set of CPU vulnerabilities, so far only affecting Intel
> CPUs. Cribbing from the linux-kernel announcement I'm referring to
> https://xenbits.xen.org/xsa/advisory-297.html
> for details.
>
> The "fix" is for the OS to perform some extra mitigations:
> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
> https://www.kernel.org/doc/html/latest/x86/mds.html#mds
>
> *And* SMT/hyperthreading needs to be disabled, to be fully safe.
>
> Fun.
>
> I've run a quick pgbench benchmark:
>
> *Without* disabling SMT, for readonly pgbench, I'm seeing regressions
> of 7-11%, depending on the size of shared_buffers (and some runtime
> variations). That's just on my laptop, with an i7-6820HQ / Skylake CPU.
> I'd be surprised if there weren't adversarial loads with bigger
> slowdowns - what gets more expensive with the mitigations is syscalls.
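
For anyone wanting to gauge the raw syscall cost on their own machine, here is a rough sketch. It is not what produced the pgbench numbers above. It reads the kernel's reported MDS and SMT state from sysfs and times a loop of cheap syscalls. The helper names read_sysfs and time_syscalls are made up for illustration, and the sysfs files are only present on sufficiently recent Linux kernels.

```python
import os
import time
from pathlib import Path

# Standard Linux sysfs locations; absent on older kernels and other OSes.
MDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/mds")
SMT_ACTIVE = Path("/sys/devices/system/cpu/smt/active")


def read_sysfs(path):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def time_syscalls(iterations=1_000_000):
    """Return the average wall-clock nanoseconds per os.getppid() call.

    The figure includes Python call overhead, so only the difference between
    two runs (mitigations on vs. off) is meaningful, not the absolute value.
    """
    start = time.perf_counter_ns()
    for _ in range(iterations):
        os.getppid()  # one getppid syscall per iteration
    return (time.perf_counter_ns() - start) / iterations


if __name__ == "__main__":
    print("mds:", read_sysfs(MDS_STATUS))
    print("smt active:", read_sysfs(SMT_ACTIVE))
    print(f"{time_syscalls():.1f} ns per getppid() call")
```

Running it once on a kernel booted normally and once with mds=off (see the admin-guide link above) should give a per-syscall delta, assuming nothing else changes between the two boots.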

The profile after the mitigations looks like:

+    3.62%  postgres  [kernel.vmlinux]  [k] do_syscall_64
+    2.99%  postgres  postgres          [.] _bt_compare
+    2.76%  postgres  postgres          [.] hash_search_with_hash_value
+    2.33%  postgres  [kernel.vmlinux]  [k] entry_SYSCALL_64
+    1.69%  pgbench   [kernel.vmlinux]  [k] do_syscall_64
+    1.61%  postgres  postgres          [.] AllocSetAlloc
     1.41%  postgres  postgres          [.] PostgresMain
+    1.22%  pgbench   [kernel.vmlinux]  [k] entry_SYSCALL_64
+    1.11%  postgres  postgres          [.] LWLockAcquire
+    0.86%  postgres  postgres          [.] PinBuffer
+    0.80%  postgres  postgres          [.] LockAcquireExtended
+    0.78%  postgres  [kernel.vmlinux]  [k] psi_task_change
     0.76%  pgbench   pgbench           [.] threadRun
     0.69%  postgres  postgres          [.] LWLockRelease
+    0.69%  postgres  postgres          [.] SearchCatCache1
     0.66%  postgres  postgres          [.] LockReleaseAll
+    0.65%  postgres  postgres          [.] GetSnapshotData
+    0.58%  postgres  postgres          [.] hash_seq_search
     0.54%  postgres  postgres          [.] hash_search
+    0.53%  postgres  [kernel.vmlinux]  [k] __switch_to
+    0.53%  postgres  postgres          [.] hash_any
     0.52%  pgbench   libpq.so.5.12     [.] pqParseInput3
     0.50%  pgbench   [kernel.vmlinux]  [k] do_raw_spin_lock

where do_syscall_64 shows this instruction profile:

       │     static __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch)
       │     {
       │             asm_volatile_goto("1:"
  1.58 │     ↓ jmpq   bd
       │     mds_clear_cpu_buffers():
       │              * Works with any segment selector, but a valid writable
       │              * data segment is the fastest variant.
       │              *
       │              * "cc" clobber is required because VERW modifies ZF.
       │              */
       │             asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
 77.38 │       verw   0x13fea53(%rip)        # ffffffff82400ee0 <ds.4768>
       │     do_syscall_64():
       │             }
       │
       │             syscall_return_slowpath(regs);
       │     }
 13.18 │ bd:   pop    %rbx
  0.08 │       pop    %rbp
       │     ← retq
       │                     nr = syscall_trace_enter(regs);
       │ c0:   mov    %rbp,%rdi
       │     → callq  syscall_trace_enter

Here verw is the instruction that was repurposed to have the
side effect of flushing CPU buffers.
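
As a back-of-the-envelope estimate, the two profiles can be combined to guess how much of the total sample count lands on verw. The sketch below assumes the top-level percentages are self time and that the 77.38% figure applies equally to the postgres and pgbench samples of do_syscall_64. Neither is confirmed by the output itself, and sample skid may shift some weight onto neighbouring instructions. The names are made up for illustration.

```python
# Self-time percentages of do_syscall_64, copied from the profile above.
do_syscall_64_share = {"postgres": 3.62, "pgbench": 1.69}

# Percent of do_syscall_64 samples attributed to the verw instruction.
verw_share_of_function = 77.38


def verw_share_of_total(function_share):
    """Percent of all samples landing on verw, given the function's share."""
    return function_share * verw_share_of_function / 100.0


# Assumption: the same per-instruction distribution holds for both processes.
estimates = {
    process: verw_share_of_total(share)
    for process, share in do_syscall_64_share.items()
}

if __name__ == "__main__":
    for process, percent in estimates.items():
        print(f"{process}: {percent:.2f}% of all samples")
    print(f"combined: {sum(estimates.values()):.2f}% of all samples")
```

Under those assumptions this comes to roughly 2.8% of all samples for postgres and 1.3% for pgbench, which is only the part visible inside do_syscall_64.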

Greetings,

Andres Freund
