
Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ...

From: Sean Chittenden <sean(at)chittenden(dot)org>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>,PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [COMMITTERS] pgsql-server/ /configure /configure.in rc/incl ...
Date: 2003-03-07 06:04:12
Message-ID: 20030307060412.GA19138@perrin.int.nxad.com
Lists: pgsql-committers, pgsql-performance
> > I don't have my copy of Stevens handy (it's some 700mi away atm
> > otherwise I'd cite it), but if Tom or someone else has it handy, look
> > up the example re: the performance gain from read()'ing an mmap()'ed
> > file versus a non-mmap()'ed file.  The difference is non-trivial and
> > _WELL_ worth the time given the speed increase.
> 
> Can anyone confirm this? If so, one easy step we could take in this
> direction would be adapting COPY FROM to use mmap().

Weeee!  Alright, so I got to have some fun writing out some simple
tests with mmap() and friends tonight.  Are the results interesting?
Absolutely!  Is this a simple benchmark?  Yup.  Do I think it
simulates PostgreSQL?  Eh, not particularly.  Does it demonstrate that
mmap() is a win and something worth implementing?  I sure hope so.  Is
this a test program to demonstrate the ideal use of mmap() in
PostgreSQL?  No.  Is it a place to start a factual discussion?  I hope
so.

I have here four tests that are conditionalized by cpp.

# The first one uses read() and write() but with the buffer size set
# to the same size as the file.
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops  -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file:              services

Page size:                              4096
File read size is the same as the file size
Number of iterations:                   100000
Start time:                             1047013002.412516
Time:                                   82.88178

Completed tests
       82.09 real         2.13 user        68.98 sys

# The second one uses read() and write() with the default buffer size:
# 65536
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops  -DDEFAULT_READSIZE=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file:              services

Page size:                              4096
File read size is default read size:    65536
Number of iterations:                   100000
Start time:                             1047013085.16204
Time:                                   18.155511

Completed tests
       18.16 real         0.90 user        14.79 sys
# Please note this is significantly faster, but that's expected
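For readers without the test program at hand, the read()/write() variants above can be pictured roughly as follows. This is only a minimal sketch, not the code from test-mmap.c: the name copy_with_read and its bufsize parameter are hypothetical, with bufsize standing in for the two cases measured (the file size, or the 65536 default).

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from test-mmap.c): copy the file at `path` to
 * `out_fd` with read()/write(), using a caller-chosen buffer size.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_with_read(const char *path, int out_fd, size_t bufsize)
{
    ssize_t total = 0;
    ssize_t n;
    char *buf;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    buf = malloc(bufsize);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    /* Each read() copies data from the kernel into buf; each write()
     * copies it back out again. */
    while ((n = read(fd, buf, bufsize)) > 0) {
        ssize_t off = 0;
        while (off < n) {               /* handle short writes */
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                total = -1;
                goto done;
            }
            off += w;
        }
        total += n;
    }
    if (n < 0)
        total = -1;
done:
    free(buf);
    close(fd);
    return total;
}
```

Calling copy_with_read("services", STDOUT_FILENO, 65536) in a loop would correspond loosely to the second test; passing the file's size as bufsize would correspond to the first.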

# The third test uses mmap() + madvise() + write()
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops  -DDEFAULT_READSIZE=1 -DDO_MMAP=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file:              services

Page size:                              4096
File read size is the same as the file size
Number of iterations:                   100000
Start time:                             1047013103.859818
Time:                                   8.4294203644

Completed tests
        7.24 real         0.41 user         5.92 sys
# Faster still, and about 2.5x as fast as the default read() case
# (7.24 vs. 18.16 real)

# The last test calls mmap() only once, when the file is opened, and
# msync()s, munmap()s, and close()s the file only once, at exit.
gcc -O3 -finline-functions -fkeep-inline-functions -funroll-loops  -DDEFAULT_READSIZE=1 -DDO_MMAP=1 -DDO_MMAP_ONCE=1 -o test-mmap test-mmap.c
/usr/bin/time ./test-mmap > /dev/null
Beginning tests with file:              services

Page size:                              4096
File read size is the same as the file size
Number of iterations:                   100000
Start time:                             1047013111.623712
Time:                                   1.174076

Completed tests
        1.18 real         0.09 user         0.92 sys
# Substantially faster: roughly 6x over the per-iteration mmap() case
# (1.18 vs. 7.24 real)
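The map-once idea in the last test can be sketched as a small handle that is opened once, reused on every iteration, and torn down at exit. The mapped_file type and its three functions are hypothetical names for illustration, not taken from test-mmap.c; the sketch also skips msync(), which only matters for a writable mapping.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle for a file that is mapped once and then reused. */
typedef struct {
    char  *base;    /* start of the mapping, or NULL when not mapped */
    size_t len;     /* length of the mapping in bytes */
} mapped_file;

/* Map the whole file read-only.  Returns 0 on success, -1 on error
 * (an empty file is treated as an error in this sketch). */
int mapped_file_open(mapped_file *mf, const char *path)
{
    struct stat st;
    void *p;
    int fd;

    mf->base = NULL;
    mf->len = 0;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;
    mf->base = p;
    mf->len = (size_t)st.st_size;
    return 0;
}

/* Per-iteration work: just write() from the existing mapping. */
ssize_t mapped_file_emit(const mapped_file *mf, int out_fd)
{
    size_t off = 0;

    while (off < mf->len) {
        ssize_t w = write(out_fd, mf->base + off, mf->len - off);
        if (w < 0)
            return -1;
        off += (size_t)w;
    }
    return (ssize_t)mf->len;
}

/* Tear the mapping down once, at exit. */
void mapped_file_close(mapped_file *mf)
{
    if (mf->base != NULL)
        munmap(mf->base, mf->len);
    mf->base = NULL;
    mf->len = 0;
}
```

A caller would run mapped_file_open() once, call mapped_file_emit() inside the iteration loop, and finish with mapped_file_close(), so the per-iteration cost is a single write().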


Obviously this isn't perfect, but reading and writing data through
mmap() is faster (specifically, moving pages through the VM/OS).
Doing partial writes from mmap()'ed data should be faster, along with
scanning through mmap()'ed portions of - or completely mmap()'ed -
files, because the pages are already loaded in the VM.  PostgreSQL's
LRU file descriptor cache could easily be adjusted to add mmap()'ing
of frequently accessed files (specifically, system catalogs come to
mind).  It's not hard to figure out how often particular files are
accessed and to either _avoid_ mmap()'ing a file that isn't accessed
often, or to mmap() files that _are_ accessed often.  mmap() does have
a cost, but I'd wager that mmap()'ing the same file a second or third
time from a different process would be cheaper than the first
mapping.  However, the speedup of searching through an mmap()'ed file
may make it worth mmap()'ing all files, as long as the system stays
under a tunable resource limit (max_mmaped_bytes?).
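To make the policy above concrete, here is one possible shape for the decision, as a sketch only. Nothing like this exists in PostgreSQL: the mmap_budget type, the min_accesses threshold, and both functions are hypothetical, and max_mmaped_bytes is just the tunable floated above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a global cap on mapped bytes. */
typedef struct {
    size_t   max_mmaped_bytes;  /* tunable limit on total mapped bytes */
    size_t   mmaped_bytes;      /* bytes currently mapped */
    unsigned min_accesses;      /* accesses needed before mapping a file */
} mmap_budget;

/* Decide whether a file is worth mapping: it must have been accessed
 * often enough, and mapping it must not push the total over the limit. */
bool should_mmap(const mmap_budget *b, size_t file_size,
                 unsigned access_count)
{
    if (access_count < b->min_accesses)
        return false;           /* rarely used: plain read() is fine */
    if (b->mmaped_bytes > b->max_mmaped_bytes)
        return false;           /* already over the limit */
    return file_size <= b->max_mmaped_bytes - b->mmaped_bytes;
}

/* Record that a file of `file_size` bytes has been mapped. */
void note_mmaped(mmap_budget *b, size_t file_size)
{
    b->mmaped_bytes += file_size;
}
```

Setting min_accesses to 0 gives the "mmap() everything while under the limit" variant; a higher value restricts mapping to hot files such as the system catalogs.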

If someone is so inclined or there's enough interest, I can reverse
this test case so that data is written to an mmap()'ed file, but the
same performance difference should hold true (assuming this isn't a
write to a tape drive ::grin::).
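For reference, the reversed case (storing data through a mapping instead of calling write()) would look roughly like the sketch below. This has not been benchmarked here and is not part of test-mmap.c; write_with_mmap is a hypothetical name.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store `len` bytes from `data` into the file at
 * `path` through a shared, writable mapping.
 * Returns 0 on success, -1 on error. */
int write_with_mmap(const char *path, const void *data, size_t len)
{
    void *p;
    int rc;
    int fd;

    if (len == 0)
        return -1;                      /* mmap() rejects a zero length */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* The file must be grown to its final size before it is mapped. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, data, len);               /* the "write" is a memory copy */
    rc = msync(p, len, MS_SYNC);        /* flush dirty pages to the file */
    if (munmap(p, len) < 0)
        rc = -1;
    return rc;
}
```

As with the map-once read test, a real benchmark would presumably keep the mapping open across iterations and msync() only at the end.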

The program used to generate the above tests is at:

http://people.freebsd.org/~seanc/mmap_test/


Please ask if you have questions.  -sc

-- 
Sean Chittenden
