Re: BUG #7494: WAL replay speed depends heavily on the shared_buffers size

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Valentine Gogichashvili <valgog(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7494: WAL replay speed depends heavily on the shared_buffers size
Date: 2012-08-17 11:07:16
Message-ID: 201208171307.16342.andres@2ndquadrant.com
Lists: pgsql-bugs

On Friday, August 17, 2012 12:51:44 PM Valentine Gogichashvili wrote:
> Hello Andreas,
>
> here are the results of perf profiling:
> https://gist.github.com/3b8cb0c15661da439632
> Also attached the files to that mail.
>
> For a problematic case of big shared_buffers:
>
> # Events: 320K cycles
> #
> # Overhead Command Shared Object Symbol
> # ........ ........ ................. ..................................
> #
> 98.70% postgres postgres [.] DropRelFileNodeBuffers
> 0.18% postgres postgres [.] RecordIsValid
> 0.11% postgres [kernel.kallsyms] [k] native_write_msr_safe
> 0.07% postgres [kernel.kallsyms] [k] dyntick_save_progress_counter
> 0.06% postgres [kernel.kallsyms] [k] scheduler_tick
> 0.03% postgres [kernel.kallsyms] [k] _spin_lock
> 0.03% postgres [kernel.kallsyms] [k] __do_softirq
> 0.03% postgres [kernel.kallsyms] [k] rcu_process_callbacks
> 0.03% postgres postgres [.] hash_search_with_hash_value
> 0.03% postgres [kernel.kallsyms] [k] native_read_msr_safe
> 0.02% postgres libc-2.12.so [.] memcpy
> 0.02% postgres [kernel.kallsyms] [k] rcu_process_gp_end
> 0.02% postgres [kernel.kallsyms] [k] apic_timer_interrupt
> 0.02% postgres [kernel.kallsyms] [k] run_timer_softirq
> 0.02% postgres [kernel.kallsyms] [k] system_call

Ok, that explains it. Are you frequently dropping/truncating tables? That
currently requires walking through all of shared buffers and discarding every
buffered page that belongs to the table. The cost obviously scales linearly
with the size of shared_buffers and is particularly expensive on multi-socket
machines.

Unless you're running 9.3, that will even lock every single page, which is a
relatively expensive and slow operation. Depending on how adventurous you are,
you could try backporting e8d029a30b5a5fb74b848a8697b1dfa3f66d9697 and see how
big the benefit is for you.
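
To make the cost model concrete, here is a rough sketch of the two behaviours described above. It is a toy model, not the actual bufmgr.c code: `SketchBuffer`, `drop_rel_lock_all` and `drop_rel_precheck` are hypothetical names, the "lock" is only a counter, and the precheck variant is an assumption about what avoiding the per-page lock could look like, not a description of the referenced commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a shared buffer header. */
typedef struct {
    unsigned rel_id;  /* relation the cached page belongs to */
    bool     valid;   /* true while the slot holds a usable page */
} SketchBuffer;

/*
 * Variant 1: visit every buffer and take its (simulated) header lock
 * before looking at it. Returns the number of lock acquisitions, which
 * is always nbuffers, no matter how few pages belong to the relation.
 */
size_t drop_rel_lock_all(SketchBuffer *bufs, size_t nbuffers, unsigned rel_id)
{
    size_t locks = 0;

    for (size_t i = 0; i < nbuffers; i++) {
        locks++;                      /* simulated lock of the header */
        if (bufs[i].valid && bufs[i].rel_id == rel_id)
            bufs[i].valid = false;    /* discard the cached page */
        /* simulated unlock */
    }
    return locks;
}

/*
 * Variant 2: still visit every buffer (the scan stays linear in the
 * number of buffers), but peek at the tag without locking and only
 * lock the buffers that appear to belong to the relation.
 */
size_t drop_rel_precheck(SketchBuffer *bufs, size_t nbuffers, unsigned rel_id)
{
    size_t locks = 0;

    for (size_t i = 0; i < nbuffers; i++) {
        if (bufs[i].rel_id != rel_id)
            continue;                 /* unlocked precheck: skip cheaply */
        locks++;                      /* simulated lock of the header */
        if (bufs[i].valid && bufs[i].rel_id == rel_id)  /* recheck under lock */
            bufs[i].valid = false;
        /* simulated unlock */
    }
    return locks;
}
```

Both variants still touch every buffer, so the scan remains proportional to shared_buffers either way; only the number of lock acquisitions differs.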

Greetings,

Andres
