Re : Solaris Performance - Profiling

From: Mark kirkwood <markir(at)slingshot(dot)co(dot)nz>
To: pgsql-general(at)postgresql(dot)org
Subject: Re : Solaris Performance - Profiling
Date: 2002-03-29 07:24:30
Message-ID: 1017386671.1274.12.camel@spikey.slithery.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

I am attempting to produce a profile for a slow query run on Solaris -
(c.f previous thread titled "Solaris performance").

I am a bit confused about the the output I am seeing, as the times do not get anywhere close to the 1 hour elapsed time that the query takes....(see "notes"
section at the end for detail on the query)

I will confess to being a bit of a profile newbie... so if I missed something
important, please dont hesitate to put me right.

So anyway I thought I would put down here what I actually had...here goes ...
I tried 2 approaches - tracing via truss and profiling via gprof.

i) Tracing

Before profiling I used truss on the backend process, latter bits of the
output being : (left hand column is the time delta)

0.0003 read(40, "\0\0\0\087 KC4\b\0\0\0\t".., 8192) = 8192
0.0009 lseek(40, 0x14796000, SEEK_SET) = 0x14796000
0.0002 read(40, "\0\0\0\080 t E90\0\0\0\t".., 8192) = 8192
0.0094 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
0.0154 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
0.0155 read(43, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192) = 8192
4992.4387 send(7, " P b l a n k\0 T\002 f 3".., 99, 0) = 99
0.0012 recv(7, " Q S E L E C T c u r r".., 8192, 0) = 22
0.0011 lseek(36, 0, SEEK_END) = 8192
0.0013 send(7, " P b l a n k\0 T\001 t i".., 60, 0) = 60
recv(7, 0x002929A8, 8192, 0) (sleeping...)
46994.9585 recv(7, " X", 8192, 0) = 1

Including more librarys in the truss'ing showed thousands of loops, consisting
mainly of memcpy+memcmp, unfortunatly this file is _huge_ ... so I did not take
a copy of it (I am not sending this mail from the Solaris box....).

ii) Profiling

Rebuilt Postgresql with :
CFLAGS=-pg ./configure --disable-shared --prefix=/opt/pgsql

started up the database, ran the query and shut it down again.

A raw profile gmon.out was created, which I ran gprof on.

Call Graph for Top 10

granularity: each sample hit covers 4 byte(s) for 4.17% of 0.24 seconds

called/total parents
index %time self descendents called+self name index
called/total children

0.00 0.23 1/1 main [2]
[1] 95.8 0.00 0.23 1 PostmasterMain [1]
0.00 0.22 1/1 pgstat_start [4]
0.00 0.01 1/1 reset_shared [11]
0.00 0.00 1/1 MemoryContextInit [41]
0.00 0.00 1/1 checkDataDir [42]
0.00 0.00 1/1 CreateOptsFile [43]
0.00 0.00 1/12 AllocSetContextCreate [24]
0.00 0.00 1/1 ProcessConfigFile [46]
0.00 0.00 1/1 StreamServerPort [47]
0.00 0.00 11/25 pqsignal [72]
0.00 0.00 2/2 getenv [131]
0.00 0.00 2/2 _getopt [4862]
0.00 0.00 1/1 _umask [4891]
0.00 0.00 1/5 _getpid [4841]
0.00 0.00 1/1 EnableExceptionHandling [149]
0.00 0.00 1/1 MemoryContextSwitchTo [163]
0.00 0.00 1/1 ResetAllOptions [168]
0.00 0.00 1/1 SetDataDir [170]
0.00 0.00 1/1 IgnoreSystemIndexes [155]
0.00 0.00 1/2 FindExec [122]
0.00 0.00 1/1 CreateDataDirLockFile [143]
0.00 0.00 1/1 RemovePgTempFiles [167]
0.00 0.00 1/1 XLOGPathInit [172]
0.00 0.00 1/1 DLNewList [148]
0.00 0.00 1/1 pqinitmask [193]
0.00 0.00 1/1 _sigprocmask [4884]
0.00 0.00 1/1 pgstat_init [190]

-----------------------------------------------

0.00 0.23 1/1 _start [3]
[2] 95.8 0.00 0.23 1 main [2]
0.00 0.23 1/1 PostmasterMain [1]
0.00 0.00 2/5 _geteuid [4840]
0.00 0.00 1/1 save_ps_display_args [194]
0.00 0.00 1/82 malloc [57]
0.00 0.00 1/49 _strdup [4802]
0.00 0.00 1/2 _getuid [4863]
0.00 0.00 1/137 strlen [55]
0.00 0.00 1/15 strcmp [83]

-----------------------------------------------

<spontaneous>
[3] 95.8 0.00 0.23 _start [3]
0.00 0.23 1/1 main [2]
0.00 0.00 2/4 atexit [103]

-----------------------------------------------

0.00 0.22 1/1 PostmasterMain [1]
[4] 91.7 0.00 0.22 1 pgstat_start [4]
0.02 0.19 1/1 pgstat_main [5]
0.00 0.01 1/2 _fork [15]
0.00 0.00 2/4 fflush [104]
0.00 0.00 1/1 on_exit_reset [188]
0.00 0.00 1/1 ClosePostmasterPorts [142]
0.00 0.00 1/1 exit [179]

-----------------------------------------------

0.02 0.19 1/1 pgstat_start [4]
[5] 89.6 0.02 0.19 1 pgstat_main [5]
0.00 0.17 30497/30497 select [6]
0.02 0.00 30499/30556 _memset [9]
0.00 0.01 1/2 _fork [15]
0.00 0.00 1/1 pgstat_read_statsfile [32]
0.00 0.00 1/1 pgstat_recv_bestart [33]
0.00 0.00 1/8 hash_create [20]
0.00 0.00 4/4 pgstat_write_statsfile [38]
0.00 0.00 1/1 set_ps_display [49]
0.00 0.00 14/25 pqsignal [72]
0.00 0.00 6/6 _gettimeofday [4838]
0.00 0.00 3/8 close [91]
0.00 0.00 3/5 read [102]
0.00 0.00 2/45 .div [61]
0.00 0.00 2/17 .rem [75]
0.00 0.00 1/2 _pipe [4864]
0.00 0.00 1/1 init_ps_display [184]
0.00 0.00 1/82 malloc [57]
0.00 0.00 1/1 pgstat_recv_beterm [191]

-----------------------------------------------

0.00 0.17 30497/30497 pgstat_main [5]
[6] 70.8 0.00 0.17 30497 select [6]
0.00 0.17 30497/30497 _select [8]

-----------------------------------------------

0.17 0.00 30497/30497 _select [8]
[7] 70.8 0.17 0.00 30497 _poll [7]

-----------------------------------------------

0.00 0.17 30497/30497 select [6]
[8] 70.8 0.00 0.17 30497 _select [8]
0.17 0.00 30497/30497 _poll [7]
0.00 0.00 6/53 .mul [59]

-----------------------------------------------

0.00 0.00 1/30556 CLOGShmemInit [44]
0.00 0.00 1/30556 StreamServerPort [47]
0.00 0.00 1/30556 pgstat_read_statsfile [32]
0.00 0.00 1/30556 InitShmemAllocation [45]
0.00 0.00 1/30556 LockMethodTableInit [27]
0.00 0.00 1/30556 set_ps_display [49]
0.00 0.00 2/30556 pgstat_add_backend [31]
0.00 0.00 3/30556 XLOGShmemInit [40]
0.00 0.00 9/30556 getiop [28]
0.00 0.00 12/30556 MemoryContextCreate [25]
0.00 0.00 25/30556 seg_alloc [21]
0.02 0.00 30499/30556 pgstat_main [5]
[9] 8.3 0.02 0.00 30556 _memset [9]

-----------------------------------------------

0.00 0.01 1/1 reset_shared [11]
[10] 4.2 0.00 0.01 1 CreateSharedMemoryAndSemaphores [10]
0.00 0.01 1/1 IpcMemoryCreate [13]
0.00 0.00 1/1 InitLockTable [26]
0.00 0.00 1/1 InitShmemIndex [37]
0.00 0.00 1/1 InitBufferPool [35]
0.00 0.00 1/1 InitFreeSpaceMap [36]
0.00 0.00 1/1 XLOGShmemInit [40]
0.00 0.00 1/1 InitShmemAllocation [45]
0.00 0.00 1/1 CLOGShmemInit [44]
0.00 0.00 1/1 BufferShmemSize [139]
0.00 0.00 1/1 LockShmemSize [162]
0.00 0.00 1/1 XLOGShmemSize [173]
0.00 0.00 1/1 CLOGShmemSize [140]
0.00 0.00 1/2 LWLockShmemSize [127]
0.00 0.00 1/2 SInvalShmemSize [128]
0.00 0.00 1/1 FreeSpaceShmemSize [150]
0.00 0.00 1/1 CreateSpinlocks [147]
0.00 0.00 1/1 CreateLWLocks [144]
0.00 0.00 1/1 InitLocks [157]
0.00 0.00 1/1 InitProcGlobal [158]
0.00 0.00 1/1 CreateSharedInvalidationState [145]
0.00 0.00 1/1 PMSignalInit [164]

-----------------------------------------------

Flat Graph

granularity: each sample hit covers 4 byte(s) for 4.17% of 0.24 seconds

% cumulative self self total
time seconds seconds calls ms/call ms/call name
70.8 0.17 0.17 30497 0.01 0.01 _poll [7]
8.3 0.19 0.02 30556 0.00 0.00 _memset [9]
8.3 0.21 0.02 1 20.00 214.98 pgstat_main [5]
4.2 0.22 0.01 2 5.00 5.00 _libc_fork [16]
4.2 0.23 0.01 2 5.00 5.00 _syscall [17]
4.2 0.24 0.01 internal_mcount [14]
0.0 0.24 0.00 30497 0.00 0.01 _select [8]
0.0 0.24 0.00 30497 0.00 0.01 select [6]
0.0 0.24 0.00 8008 0.00 0.00 LWLockAssign [51]
0.0 0.24 0.00 442 0.00 0.00 strcasecmp [52]

Index by function name

[61] .div [4844] __open64 [4830] _xflsbuf
[59] .mul [4809] __sigaction [174] assign_defaultxacti
[75] .rem [4869] __sigfillset [175] assign_xlog_sync_me
[136] .udiv [4870] _atexit_init [103] atexit
[98] .umul [4821] _cerror [176] atfork_alloc
[63] AllocSetAlloc [4871] _chmod [177] atfork_append
[24] AllocSetContextCrea[4872] _cleanup [79] call_hash
[137] AllocSetFree [4817] _close [42] checkDataDir
[62] AllocSetFreeIndex [4859] _closedir [178] check_nlspath_env
[86] AllocSetInit [4813] _doprnt [56] cleanfree
[39] AllocateFile [4873] _doscan [91] close
[138] BasicOpenFile [4801] _dowrite [115] element_alloc
[139] BufferShmemSize [4825] _endopen [179] exit
[44] CLOGShmemInit [4874] _exithandle [92] fclose
[140] CLOGShmemSize [4810] _ferror_unlocked [180] fcntl
[141] ClearDateCache [4819] _fflush_u [104] fflush
[142] ClosePostmasterPort[4860] _fflush_u_iops [93] find_option
[143] CreateDataDirLockFi[4861] _fileno [29] fopen
[144] CreateLWLocks [4834] _findbuf [80] fputc
[121] CreateLockFile [30] _findiop [181] fputs
[43] CreateOptsFile [15] _fork [130] fread
[145] CreateSharedInvalid[4875] _fprintf [73] free
[10] CreateSharedMemoryA[4814] _free_unlocked [116] free_mem
[146] CreateSocketLockFil[4824] _fstat64 [182] free_name_value_lis
[147] CreateSpinlocks [4831] _fwrite_unlocked [183] fscanf
[148] DLNewList [4835] _getc_unlocked [94] fwrite
[70] DynaHashAlloc [4845] _getdents64 [105] get_mem
[149] EnableExceptionHand[4876] _getegid [131] getenv
[122] FindExec [4840] _geteuid [28] getiop
[109] FreeFile [4877] _getgid [20] hash_create
[150] FreeSpaceShmemSize [4862] _getopt [100] hash_estimate_size
[151] GUC_yy_create_buffe[4841] _getpid [81] hash_search
[123] GUC_yy_flex_alloc [4838] _gettimeofday [101] hash_select_dirsize
[124] GUC_yy_flush_buffer[4863] _getuid [89] hash_seq_init
[125] GUC_yy_get_next_buf[4826] _ioctl [76] hash_seq_search
[152] GUC_yy_get_previous[4827] _isatty [95] hdefault
[126] GUC_yy_init_buffer [4878] _libc_cond_init [22] init_htab
[110] GUC_yy_load_buffer_[4853] _libc_fcntl [184] init_ps_display
[53] GUC_yylex [16] _libc_fork [14] internal_mcount
[153] GUC_yyrestart [4815] _libc_mutex_init [185] issock
[154] GUC_yywrap [4822] _libc_open [186] keys_destruct
[155] IgnoreSystemIndexes[4846] _libc_open64 [2] main
[34] InitBufTable [4854] _libc_rwlock_init [57] malloc
[35] InitBufferPool [4811] _libc_sigaction [187] mem_init
[156] InitFreeList [4879] _libc_thr_keycreate [54] mutex_unlock
[36] InitFreeSpaceMap [4880] _lseek [68] my_log2
[26] InitLockTable [4842] _lseek64 [132] number
[157] InitLocks [4799] _malloc_unlocked [67] nvmatch
[158] InitProcGlobal [4881] _memccpy [188] on_exit_reset
[45] InitShmemAllocation[4798] _memcpy [117] on_proc_exit
[37] InitShmemIndex [9] _memset [118] on_shmem_exit
[12] InternalIpcMemoryCr[4818] _morecore [106] open
[111] InternalIpcSemaphor[4797] _mutex_lock [96] parse_int
[159] IpcInitKeyAssignmen[4847] _opendir [189] pfree
[13] IpcMemoryCreate [4864] _pipe [31] pgstat_add_backend
[112] IpcSemaphoreCreate [7] _poll [190] pgstat_init
[113] IpcSemaphoreUnlock [4882] _private_sigprocmas [5] pgstat_main
[84] LWLockAcquire [4883] _profil [32] pgstat_read_statsfi
[51] LWLockAssign [4848] _putc_unlocked [33] pgstat_recv_bestart
[85] LWLockRelease [4832] _read [191] pgstat_recv_beterm
[127] LWLockShmemSize [4828] _readdir [4] pgstat_start
[160] LockMethodInit [4829] _readdir64 [192] pgstat_sub_backend
[27] LockMethodTableInit[4805] _realbufend [38] pgstat_write_statsf
[161] LockMethodTableRena[4849] _rename [193] pqinitmask
[162] LockShmemSize [4806] _return_negone [72] pqsignal
[64] MemoryContextAlloc [4796] _return_zero [82] putc
[25] MemoryContextCreate[4803] _sbrk [102] read
[41] MemoryContextInit [4804] _sbrk_unlocked [60] realfree
[163] MemoryContextSwitch [8] _select [133] realloc
[114] NumLWLocks [4839] _semctl [107] rename
[164] PMSignalInit [4855] _semget [11] reset_shared
[1] PostmasterMain [4856] _semop [194] save_ps_display_arg
[46] ProcessConfigFile [4823] _semsys [21] seg_alloc
[165] ReadControlFile [4836] _setbufend [6] select
[166] RecordSharedMemoryI[4837] _setorientation [97] set_config_option
[167] RemovePgTempFiles [18] _shmat [49] set_ps_display
[168] ResetAllOptions [19] _shmget [74] sigvalid
[169] SIBufferInit [4812] _sigaction [77] snprintf
[128] SInvalShmemSize [4816] _sigdelset [99] sprintf
[170] SetDataDir [4808] _sigemptyset [52] strcasecmp
[65] ShmemAlloc [4865] _sigfillset [134] strcat
[23] ShmemInitHash [4884] _sigprocmask [135] strchr
[87] ShmemInitStruct [4800] _smalloc [83] strcmp
[171] StreamClose [4866] _so_bind [58] strcpy
[47] StreamServerPort [4885] _so_connect [88] string_hash
[129] ValidateBinary [4886] _so_getsockname [55] strlen
[48] ValidatePgVersion [4887] _so_getsockopt [78] strncpy
[172] XLOGPathInit [4888] _so_listen [108] strrchr
[40] XLOGShmemInit [4889] _so_setsockopt [90] strtol
[173] XLOGShmemSize [4857] _stat [195] t_delete
[4807] ___errno [4802] _strdup [119] tag_hash
[4868] __doscan_u [17] _syscall [66] 1:tas
[4851] __fcntl [4890] _sysconfig [71] 1:tas
[4852] __filbuf [4891] _umask [69] thr_main
[4843] __flsbuf [4867] _unlink [120] write
[4858] __mul64 [4833] _write [50] <cycle 1>
[4820] __open [4850] _wrtchk

Object modules

1: /opt/pgsql/bin/postmaster

Notes
-----

i) Background

This query takes 30s on an old, slow Intel box (333Mhz) running Freebsd.
It takes about 1 hour on this Sparc box.
This suggested something not quite right somewhere.

ii) Platform Info

Sun E220 2x450Mhz Utlrasparc CPU + 2G memory.
OS + Postgresql is installed on 2x36G 15000 RPM Seagate Cheetah drives
Remaining IO subsystem is fibre to a Compaq SAN (RAID 5).
The Postgresql database files are located on this SAN system.

iii) Version Info

db1=# SELECT version();
version
----------------------------------------------------------------
PostgreSQL 7.2 on sparc-sun-solaris2.8, compiled by GCC 2.95.2
(1 row)

iv) Tables And Indexes

CREATE TABLE dim0 ( d0key INTEGER,
f1 TIMESTAMP,
f2 VARCHAR(20),
f3 VARCHAR(20)
);

CREATE TABLE fact0 ( d0key INTEGER,
d1key INTEGER,
d2key INTEGER,
val INTEGER,
filler TEXT
);

CREATE UNIQUE INDEX dim0_pk ON dim0 (d0key);
CREATE UNQIUE INDEX dim0_q1 ON dim0 (f1);
CREATE INDEX fact0_q1 ON fact0(d0key) ;

db1=# SELECT relname,tuples FROM pg_class WHERE relname IN
('dim0','fact0');

relname | reltuples
---------+-----------
dim0 | 10080
fact0 | 1e+07
(2 rows)

v) Query Access Plan And Elapsed Time

db1=# EXPLAIN
SELECT
d0.f3,
count(f.val)
FROM dim0 d0,
fact0 f
WHERE d0.d0key = f.d0key
AND d0.f1 between '1999-12-01' AND '2000-02-29'
GROUP BY d0.f3
;

Aggregate (cost=27560.60..27594.10 rows=670 width=18)
-> Group (cost=27560.60..27577.35 rows=6700 width=18)
-> Sort (cost=27560.60..27560.60 rows=6700 width=18)
-> Nested Loop (cost=0.00..27134.84 rows=6700 width=18)
-> Index Scan using dim0_q1 on dim0 d0 (cost=0.00..4.40
rows=87 width=10)
-> Index Scan using fact0_q1 on fact0 f (cost=0.00..310.06
rows=77 width=8)

EXPLAIN
db1=# \i qtype1.sql
timetz
--------------------
18:54:43.366466+12
(1 row)

f3 | count
----+-------
01 | 30000
02 | 28000
12 | 30000
(3 rows)

timetz
--------------------
20:18:01.102664+12
(1 row)

vi) Non Default Postgresql Server Parameters

shared_buffers=4000
sort_mem=10240
wal_buffers=100
wal_files=10

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2002-03-29 14:44:28 Re: Re : Solaris Performance - Profiling
Previous Message Martin Dillard 2002-03-29 01:10:29 Great Bridge RDBMS comparisons

Browse pgsql-hackers by date

  From Date Subject
Next Message Teodor Sigaev 2002-03-29 10:27:54 Bug or feature in 7.3?
Previous Message Bruce Momjian 2002-03-29 05:54:09 Re: Mailing List Question