Re: Configuration Recommendations

From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Configuration Recommendations
Date: 2012-05-15 17:53:57
Message-ID: 66570518e99910b4b4509564331e87c2@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

>>> Is it established practice in the Postgres world to separate indexes
>>> from tables? I would assume that the reasoning of Richard Foote -
>>> albeit for Oracle databases - is also true for Postgres:
>
>> Yes, it's an established practice. I'd call it something just short of
>> a best practice though, as it really depends on your situation.
>
> What are the benefits?

Disk seeks, basically. Yes, there are a lot of complications regarding
all the various hardware/OS/PG level cachings, but at the end of the
day, it's less work to have each drive concentrate on a single area
(especially as we always require a heap scan at the moment).

>> I also find his examples a bit contrived, and the whole "multi-user"
>> argument irrelevant for common cases.
>
> Why is that?

Because most Postgres servers are dedicated to serving the same data
or sets of data, and the number of "other users" calling ad-hoc queries
against lots of different tables (per his example) is small. So this
sentence just doesn't ring true to me:

" ... by the time weve read the index leaf block, processed and
read all the associated table blocks referenced by the index leaf
block, the chances of there being no subsequent physical activity
in the index tablespace due to another user session is virtually
nil. We would still need to re-scan the disk to physically access
the next index leaf block (or table block) anyways."

That's certainly not true for Postgres servers, and I doubt if it
is quite that bad on Oracle either.

>> I lean towards using separate tablespaces in Postgres, as the
>> performance outweighs the additional>> complexity.

> What about his argument with regards to access patterns (i.e.
> interleaving index and table access during an index scan)? Also,
> Shaun's advice to have more spindles available sounds convincing to
> me, too.

I don't buy his arguments. To do so, you'd have to buy a key point:

"when most physical I/Os in both index and table segments are
effectively random, single block reads"

They are not; hence, the rest of his argument falls apart. Certainly,
if things were as truly random and uncached as he states, there would
be no benefit to separation.

As far as spindles, yes: like RAM, it's seldom the case to have
too litte :) But as with all things, one should get some benchmarks
on your specific workload before making hardware changes. (Well, RAM
may be an exception to that, up to a point).

>> It's down on the tuning list however: much more important
>> is getting your kernel/volumes configured correctly, allocating
>> shared_buffers sanely, separating pg_xlog, etc.

> That does make a lot of sense. Separating pg_xlog would probably the
> first thing I'd do especially since the IO pattern is so dramatically
> different from tablespace IO access patterns.

Yep - moving pg_xlog to something optimized for small, constantly
written files is one of the biggest and easiest wins. Other than
fsync = off ;)

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201205151351
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk+yl8YACgkQvJuQZxSWSshB+QCghfweMspFIqmP4rLv6/tcGPot
jscAn1SZAP1/KBcu/FEpWXilSnWjlA6Z
=FX7j
-----END PGP SIGNATURE-----

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rosser Schwarz 2012-05-15 18:16:25 Re: SSD selection
Previous Message Merlin Moncure 2012-05-15 17:22:20 Re: SSD selection