Re: Are 50 million rows a problem for postgres ?

From: Sam Barnett-Cormack <s(dot)barnett-cormack(at)lancaster(dot)ac(dot)uk>
To: Vasilis Ventirozos <vendi(at)cosmoline(dot)com>
Cc: Sam Barnett-Cormack <s(dot)barnett-cormack(at)lancaster(dot)ac(dot)uk>, PostgreSQL Mail List <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Are 50 million rows a problem for postgres ?
Date: 2003-09-08 11:43:03
Message-ID: Pine.LNX.4.50.0309081236590.421-100000@short.lancs.ac.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Mon, 8 Sep 2003, Vasilis Ventirozos wrote:

This bit is simple and probably wants leaving alone, except for very
specific changes as you need them

> # Connection Parameters
> #
> tcpip_socket = true
> #ssl = false
>
> #max_connections = 32
> #superuser_reserved_connections = 2
>
> port = 5432
> #hostname_lookup = false
> #show_source_port = false
>
> #unix_socket_directory = ''
> #unix_socket_group = ''
> #unix_socket_permissions = 0777 # octal
>
> #virtual_host = ''
>
> #krb_server_keyfile = ''

The next two sections are most important:

> #
> # Shared Memory Size
> #
> #shared_buffers = 64 # min max_connections*2 or 16, 8KB each
> #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes
> #max_fsm_pages = 10000 # min 1000, fsm is free space map, ~6 bytes
> #max_locks_per_transaction = 64 # min 10
> #wal_buffers = 8 # min 4, typically 8KB each

These probably want upping, quite a lot for your system

max_fsm_relations at least 5000, I'd say, max_fsm_pages 50000,
shared_buffers can be made to be huge

> #
> # Non-shared Memory Sizes
> #
> #sort_mem = 1024 # min 64, size in KB
> #vacuum_mem = 8192 # min 1024, size in KB# Write-ahead log
> (WAL)

These want upping, quite a lot. Like:

sort_mem = 65536
vacuum_mem = 32768

Are what I use, and I have a much lower-power box

> #checkpoint_segments = 3 # in logfile segments, min 1, 16MB each
> #checkpoint_timeout = 300 # range 30-3600, in seconds
> #
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
> #
> #fsync = true
> #wal_sync_method = fsync # the default varies across platforms:
> # # fsync, fdatasync, open_sync, or
> open_datasync
> #wal_debug = 0 # range 0-16

Less sure about that lot.

The next section is only used by the query planner, debug it if you have
silly plans being generated. You probably want to change
random_page_cost

> #
> # Optimizer Parameters
> #
> #enable_seqscan = true
> #enable_indexscan = true
> #enable_tidscan = true
> #enable_sort = true
> #enable_nestloop = true
> #enable_mergejoin = true
> #enable_hashjoin = true
>
> #effective_cache_size = 1000 # typically 8KB each
> #random_page_cost = 4 # units are one sequential page fetch cost
> #cpu_tuple_cost = 0.01 # (same)
> #cpu_index_tuple_cost = 0.001 # (same)
> #cpu_operator_cost = 0.0025 # (same)
>
> #default_statistics_target = 10 # range 1-1000
> # GEQO Optimizer Parameters
> #
> #geqo = true
> #geqo_selection_bias = 2.0 # range 1.5-2.0
> #geqo_threshold = 11
> #geqo_pool_size = 0 # default based on tables in statement,
> # range 128-1024
> #geqo_effort = 1
> #geqo_generations = 0
> #geqo_random_seed = -1 # auto-compute seed

A lot of system stuff now, scroll down...

> #
> # Message display
> #
> #server_min_messages = notice # Values, in order of decreasing detail:
> # debug5, debug4, debug3, debug2, debug1,
> # info, notice, warning, error, log, fatal,
> # panic
> #client_min_messages = notice # Values, in order of decreasing detail:
> # debug5, debug4, debug3, debug2, debug1,
> # log, info, notice, warning, error
> #silent_mode = false
>
> log_connections = true
> log_pid = true
> log_statement = true
> log_duration = true
> log_timestamp = true
>
> #log_min_error_statement = panic # Values in order of increasing severity:
> # debug5, debug4, debug3, debug2, debug1,
> # info, notice, warning, error, panic(off)
>
> #debug_print_parse = false
> #debug_print_rewritten = false
> #debug_print_plan = false
> #debug_pretty_print = false
>
> #explain_pretty_print = true
>
> # requires USE_ASSERT_CHECKING
> #debug_assertions = true
>
>
> #
> # Syslog
> #
> #syslog = 0 # range 0-2
> #syslog_facility = 'LOCAL0'
> #syslog_ident = 'postgres'
>
>
> #
> # Statistics
> #
> #show_parser_stats = false
> #show_planner_stats = false
> #show_executor_stats = false
> #show_statement_stats = false
>
> # requires BTREE_BUILD_STATS
> #show_btree_build_stats = false
>
>
> #
> # Access statistics collection
> #
> #stats_start_collector = true
> #stats_reset_on_server_start = true
>
> #stats_command_string = false
> #stats_row_level = false
> #stats_block_level = false
>
>
> #
> # Lock Tracing
> #
> #trace_notify = false
>
> # requires LOCK_DEBUG
> #trace_locks = false
> #trace_userlocks = false
> #trace_lwlocks = false
> #debug_deadlocks = false
> #trace_lock_oidmin = 16384
> #trace_lock_table = 0

Some useful stuff in Misc

deadlock timeout probably wants increasing, for safety's sake. Other
things can be played with.

> #
> # Misc
> #
> #autocommit = true
> #dynamic_library_path = '$libdir'
> #search_path = '$user,public'
> #datestyle = 'iso, us'
> #timezone = unknown # actually, defaults to TZ environment setting
> #australian_timezones = false
> #client_encoding = sql_ascii # actually, defaults to database encoding
> #authentication_timeout = 60 # 1-600, in seconds
> #deadlock_timeout = 1000 # in milliseconds
> #default_transaction_isolation = 'read committed'
> #max_expr_depth = 10000 # min 10
> #max_files_per_process = 1000 # min 25
> #password_encryption = true
> #sql_inheritance = true
> #transform_null_equals = false
> #statement_timeout = 0 # 0 is disabled, in milliseconds
> #db_user_namespace = false
>
>
>
> #
> # Locale settings
> #
> # (initialized by initdb -- may be changed)
> LC_MESSAGES = 'C'
> LC_MONETARY = 'C'
> LC_NUMERIC = 'C'
> LC_TIME = 'C'
>
>

Hope some of that helps.

--

Sam Barnett-Cormack
Software Developer | Student of Physics & Maths
UK Mirror Service (http://www.mirror.ac.uk) | Lancaster University

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Christopher Browne 2003-09-08 12:14:57 Re: Performance Issues
Previous Message Vasilis Ventirozos 2003-09-08 11:31:30 Re: Are 50 million rows a problem for postgres ?