Re: Parallel Select query performance and shared buffers

From: Metin Doslu <metin(at)citusdata(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Parallel Select query performance and shared buffers
Date: 2013-12-04 16:43:35
Message-ID: CAL1dPceswxJMRWR0DqoMas3be147o_Z-nW5siVMEdNBbod=cYA@mail.gmail.com
Lists: pgsql-hackers pgsql-performance

> I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
> perf report" run to check what's eating up the time.
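
For reference, a concrete form of that suggested run might be the following
sketch (the 30-second window and the output file name are arbitrary choices;
exact flags depend on the perf version installed):

    # record call graphs (-g) system-wide (-a) for ~30 seconds while the
    # parallel queries are running, then summarize where CPU time went
    perf record -g -a -o perf.data sleep 30
    perf report -i perf.data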

Here is one example:

+ 38.87% swapper [kernel.kallsyms] [k] hypercall_page
+ 9.32% postgres [kernel.kallsyms] [k] hypercall_page
+ 6.80% postgres [kernel.kallsyms] [k] xen_set_pte_at
+ 5.83% postgres [kernel.kallsyms] [k] copy_user_generic_string
+ 2.06% postgres [kernel.kallsyms] [k] file_read_actor
+ 1.89% postgres postgres [.] heapgettup_pagemode
+ 1.83% postgres postgres [.] hash_search_with_hash_value
+ 1.33% postgres [kernel.kallsyms] [k] get_phys_to_machine
+ 1.25% postgres [kernel.kallsyms] [k] find_get_page
+ 1.00% postgres postgres [.] heapgetpage
+ 0.99% postgres [kernel.kallsyms] [k] radix_tree_lookup_element
+ 0.98% postgres postgres [.] advance_aggregates
+ 0.96% postgres postgres [.] ExecProject
+ 0.94% postgres postgres [.] advance_transition_function
+ 0.88% postgres postgres [.] ExecScan
+ 0.87% postgres postgres [.] HeapTupleSatisfiesMVCC
+ 0.86% postgres postgres [.] LWLockAcquire
+ 0.82% postgres [kernel.kallsyms] [k] put_page
+ 0.82% postgres postgres [.] MemoryContextReset
+ 0.80% postgres postgres [.] SeqNext
+ 0.78% postgres [kernel.kallsyms] [k] pte_mfn_to_pfn
+ 0.69% postgres postgres [.] ExecClearTuple
+ 0.57% postgres postgres [.] ExecProcNode
+ 0.54% postgres postgres [.] heap_getnext
+ 0.53% postgres postgres [.] LWLockRelease
+ 0.53% postgres postgres [.] ExecStoreTuple
+ 0.51% postgres libc-2.12.so [.] __GI___libc_read
+ 0.42% postgres [kernel.kallsyms] [k] xen_spin_lock
+ 0.40% postgres postgres [.] ReadBuffer_common
+ 0.38% postgres [kernel.kallsyms] [k] __do_fault
+ 0.37% postgres [kernel.kallsyms] [k] shmem_fault
+ 0.37% postgres [kernel.kallsyms] [k] unmap_single_vma
+ 0.35% postgres [kernel.kallsyms] [k] __wake_up_bit
+ 0.33% postgres postgres [.] StrategyGetBuffer
+ 0.33% postgres [kernel.kallsyms] [k] set_page_dirty
+ 0.33% postgres [kernel.kallsyms] [k] handle_pte_fault
+ 0.33% postgres postgres [.] ExecAgg
+ 0.31% postgres postgres [.] XidInMVCCSnapshot
+ 0.31% postgres [kernel.kallsyms] [k] __audit_syscall_entry
+ 0.31% postgres postgres [.] CheckForSerializableConflictOut
+ 0.29% postgres [kernel.kallsyms] [k] handle_mm_fault
+ 0.25% postgres [kernel.kallsyms] [k] shmem_getpage_gfp

On Wed, Dec 4, 2013 at 6:33 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2013-12-04 14:27:10 -0200, Claudio Freire wrote:
> > On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu <metin(at)citusdata(dot)com> wrote:
> > >
> > > Here are the results of "vmstat 1" while running 8 parallel TPC-H
> > > Simple (#6) queries: Although there is no need for I/O, "wa"
> > > fluctuates between 0 and 1.
> > >
> > > procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
> > >  r  b swpd     free  buff    cache si so bi bo    in     cs us sy  id wa st
> > >  0  0    0 30093568 84892 38723896  0  0  0  0    22     14  0  0 100  0  0
> > >  8  1    0 30043056 84892 38723896  0  0  0  0 27080  52708 16 14  70  0  0
> > >  8  1    0 30006600 84892 38723896  0  0  0  0 44952 118286 43 44  12  1  0
> > >  8  0    0 29986264 84900 38723896  0  0  0 20 28043  95934 49 42   8  1  0
> > >  7  0    0 29991976 84900 38723896  0  0  0  0  8308  73641 52 42   6  0  0
> > >  0  0    0 30091828 84900 38723896  0  0  0  0  3996  30978 23 24  53  0  0
> > >  0  0    0 30091968 84900 38723896  0  0  0  0    17     23  0  0 100  0  0
> >
> >
> > Notice the huge %sy
>
> My bet is on transparent hugepage defragmentation. Alternatively it's
> scheduler overhead, due to superfluous context switches around the buffer
> mapping locks.
>
> I'd strongly suggest doing a "perf record -g -a <wait a bit, ctrl-c>;
> perf report" run to check what's eating up the time.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
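
One way to test the transparent-hugepage hypothesis above is sketched below.
The sysfs paths are those of mainline Linux kernels (some RHEL 6 kernels use
/sys/kernel/mm/redhat_transparent_hugepage instead); the commands need root,
and the settings do not persist across reboots:

    # show the current THP and defrag modes; the bracketed value is active
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag
    # temporarily disable THP defragmentation, then re-run the benchmark
    echo never > /sys/kernel/mm/transparent_hugepage/defrag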
