Re: understanding postgres issues/bottlenecks

From: david(at)lang(dot)hm
To: Jean-David Beyer <jeandavid8(at)verizon(dot)net>
Cc: pgsql performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: understanding postgres issues/bottlenecks
Date: 2009-01-16 08:59:53
Message-ID: alpine.DEB.1.10.0901160049330.16879@asgard.lang.hm
Lists: pgsql-performance

On Thu, 15 Jan 2009, Jean-David Beyer wrote:

> M. Edward (Ed) Borasky wrote:
> | Luke Lonergan wrote:
> |> Not to mention the #1 cause of server faults in my experience: OS
> |> kernel bug causes a crash. Battery backup doesn't help you much there.
> |>
> |
> | Well now ... that very much depends on where you *got* the server OS and
> | how you administer it. If you're talking a correctly-maintained Windows
> | 2003 Server installation, or a correctly-maintained Red Hat Enterprise
> | Linux installation, or any other "branded" OS from Novell, Sun, HP, etc.,
> | I'm guessing such crashes are much rarer than what you've experienced.
> |
> | And you're probably in pretty good shape with Debian stable and the RHEL
> | respins like CentOS. I can't comment on Ubuntu server or any of the BSD
> | family -- I've never worked with them. But you should be able to keep a
> | "branded" server up for months, with the exception of applying security
> | patches that require a reboot. And *those* can be *planned* outages!
> |
> | Where you *will* have some major OS risk is with testing-level software
> | or "bleeding edge" Linux distros like Fedora. Quite frankly, I don't know
> | why people run Fedora servers -- if it's Red Hat compatibility you want,
> | there's CentOS.
> |
> Linux kernels seem to be pretty good these days. I ran Red Hat Linux 7.3
> 24/7 for over 6 months, and it was discontinued years ago. I recognize that
> this is by no means a record. It did not crash after 6 months, but I
> upgraded that box to CentOS 4 and it has been running that a long time. That
> box has minor hardware problems that do not happen often enough to find the
> real cause. But it stays up months at a time. All that box does is run BOINC
> and a printer server (CUPS).
>
> This machine does not crash, but it gets rebooted whenever a new kernel
> comes out, and has been up almost a month. It runs RHEL5.
>
> I would think Fedora's kernel would probably be OK, but the other bleeding
> edge stuff I would not risk a serious server on.

I have been running kernel.org kernels in production for about 12 years
now (on what has now grown to a couple hundred servers), and I routinely
run from upgrade to upgrade with no crashes (I tend to upgrade every year
or so).

that being said, things happen. I have a set of firewalls running the
Checkpoint Secure Platform Linux distribution that locked up solidly a
couple weeks after putting them in place (the iptables firewalls that they
replaced had been humming along just fine under much heavier loads for
months).

the more mainstream your hardware is the safer you are (unfortunately very
few RAID cards are mainstream), but I've also found that compiling a
minimal kernel that only supports the stuff that I need contributes to
reliability.

but even with my experience, I would never architect anything with the
expectation that system crashes don't happen. I actually see more crashes
due to overheating (fans fail, AC units fail, etc.) than I do from kernel
crashes.

not everything needs reliability. I am getting ready to build a pair of
postgres servers that will have all safety disabled. I will get the
redundancy I need by replicating between the pair, and if they both go
down (datacenter outage) it is very appropriate to lose the entire
contents of the system and reinitialize from scratch (in fact, every boot
of the system will do this).

but you need to think carefully about what you are doing when you disable
the protection.
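
as a rough illustration only (this message does not say which settings are
meant, so the choice below is an assumption), "all safety disabled" could
look something like this in postgresql.conf, using the standard durability
settings:

```
# postgresql.conf sketch: trades crash safety for write speed.
# Assumption: these are the settings meant by "all safety disabled".

fsync = off               # do not force writes to disk; an OS crash or
                          # power loss can leave the cluster corrupted
synchronous_commit = off  # report commit before its WAL is flushed; the
                          # most recent transactions can be lost in a crash
full_page_writes = off    # skip full-page images in WAL; a partially
                          # written page after a crash may not be repairable
```

with settings like these, an OS crash or power loss can leave the data
directory unusable, so this only makes sense when the data can be rebuilt
from the replica or reinitialized from scratch.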

David Lang
