This thread covers several performance ideas. First is the idea that
more parameters should be configurable. While this seems like a noble
goal, we try to make parameters auto-tuning, or if users have to
configure it, the parameter should be useful for a significant number of
In the commercial software world, if you can convince your boss that a
feature/knob is useful, it usually gets into the product.
Unfortunately, this leads to the golden doorknob on a shack, where some
features are out of sync with the rest of the product in terms of
usefulness and utility. With open source, if a feature can not be
auto-tuned, or has significant overhead, the features has to be
implemented and then proven to be a benefit.
In terms of adding async I/O, threading, and other things, it might make
sense to explore how these could be implemented in a way that fits the
Jignesh K. Shah wrote:
> Hi Jim,
> | How many of these things are currently easy to change with a recompile?
> | I should be able to start testing some of these ideas in the near
> | future, if they only require minor code or configure changes.
> The following
> * Data File Size 1GB
> * WAL File Size of 16MB
> * Block Size of 8K
> Are very easy to change with a recompile.. A Tunable will be greatly
> prefered as it will allow one binary for different tunings
> * MultiBlock read/write
> Is not available but will greatly help in reducing the number of system
> calls which will only increase as the size of the database increases if
> something is not done about i.
> * Pregrown files... maybe not important at this point since TABLESPACE
> can currently work around it a bit (Just need to create a different file
> system for each tablespace
> But if you really think hardware & OS is the answer for all small
> things...... I think we should now start to look on how to make Postgres
> Multi-threaded or multi-processed for each connection. With the influx
> of "Dual-Core" or "Multi-Core" being the fad.... Postgres can have the
> cutting edge if somehow exploiting cores is designed.
> Somebody mentioned that adding CPU to Postgres workload halved the
> average CPU usage...
> YEAH... PostgreSQL uses only 1 CPU per connection (assuming 100%
> usage) so if you add another CPU it is idle anyway and the system will
> report only 50% :-) BUT the importing to measure is.. whether the query
> time was cut down or not? ( No flames I am sure you were talking about
> multi-connection multi-user environment :-) ) But my point is then this
> approach is worth the ROI and the time and effort spent to solve this
> I actually vote for a multi-threaded solution for each connection while
> still maintaining seperate process for each connections... This way the
> fundamental architecture of Postgres doesn't change, however a
> multi-threaded connection can then start to exploit different cores..
> (Maybe have tunables for number of threads to read data files who
> knows.. If somebody is interested in actually working a design ..
> contact me and I will be interested in assisting this work.
> Jim C. Nasby wrote:
> >On Tue, Aug 23, 2005 at 06:09:09PM -0400, Chris Browne wrote:
> >>J(dot)K(dot)Shah(at)Sun(dot)COM (Jignesh Shah) writes:
> >>>>Does that include increasing the size of read/write blocks? I've
> >>>>noticedthat with a large enough table it takes a while to do a
> >>>>sequential scan, even if it's cached; I wonder if the fact that it
> >>>>takes a million read(2) calls to get through an 8G table is part of
> >>>Actually some of that readaheads,etc the OS does already if it does
> >>>some sort of throttling/clubbing of reads/writes. But its not enough
> >>>for such types of workloads.
> >>>Here is what I think will help:
> >>>* Support for different Blocksize TABLESPACE without recompiling the
> >>>code.. (Atlease support for a different Blocksize for the whole
> >>>database without recompiling the code)
> >>>* Support for bigger sizes of WAL files instead of 16MB files
> >>>WITHOUT recompiling the code.. Should be a tuneable if you ask me
> >>>(with checkpoint_segments at 256.. you have too many 16MB files in
> >>>the log directory) (This will help OLTP benchmarks more since now
> >>>they don't spend time rotating log files)
> >>>* Introduce a multiblock or extent tunable variable where you can
> >>>define a multiple of 8K (or BlockSize tuneable) to read a bigger
> >>>chunk and store it in the bufferpool.. (Maybe writes too) (Most
> >>>devices now support upto 1MB chunks for reads and writes)
> >>>*There should be a way to preallocate files for TABLES in
> >>>TABLESPACES otherwise with multiple table writes in the same
> >>>filesystem ends with fragmented files which causes poor "READS" from
> >>>the files.
> >>>* With 64bit 1GB file chunks is also moot.. Maybe it should be
> >>>tuneable too like 100GB without recompiling the code.
> >>>Why recompiling is bad? Most companies that will support Postgres
> >>>will support their own binaries and they won't prefer different
> >>>versions of binaries for different blocksizes, different WAL file
> >>>sizes, etc... and hence more function using the same set of binaries
> >>>is more desirable in enterprise environments
> >>Every single one of these still begs the question of whether the
> >>changes will have a *material* impact on performance.
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
In response to
pgsql-performance by date
|Next:||From: Alexandre Barros||Date: 2005-08-24 14:43:05|
|Subject: performance drop on RAID5|
|Previous:||From: Donald Courtney||Date: 2005-08-24 13:21:12|
|Subject: Re: Caching by Postgres|