Re: Slow count(*) again...

From: Dan Harris <fbsd(at)drivefaster(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow count(*) again...
Date: 2010-10-13 00:00:11
Message-ID: 4CB4F68B.1010709@drivefaster.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On 10/12/10 4:33 PM, Neil Whelchel wrote:
> On Tuesday 12 October 2010 08:39:19 Dan Harris wrote:
>> On 10/11/10 8:02 PM, Scott Carey wrote:
>>> would give you a 1MB read-ahead. Also, consider XFS and its built-in
>>> defragmentation. I have found that a longer lived postgres DB will get
>>> extreme file fragmentation over time and sequential scans end up mostly
>>> random. On-line file defrag helps tremendously.
>> We just had a corrupt table caused by an XFS online defrag. I'm scared
>> to use this again while the db is live. Has anyone else found this to
>> be safe? But, I can vouch for the fragmentation issue, it happens very
>> quickly in our system.
>>
>> -Dan
> I would like to know the details of what was going on that caused your
> problem. I have been using XFS for over 9 years, and it has never caused any
> trouble at all in a production environment. Sure, I had many problems with it
> on the test bench, but in most cases the issues were very clear and easy to
> avoid in production. There were some (older) XFS tools that caused some
> problems, but that is in the past, and as time goes on, it seems take less and
> less planning to make it work properly.
> -Neil-
>
There were roughly 50 transactions/sec going on at the time I ran it.
xfs_db reported 99% fragmentation before it ran ( we haven't been
running it via cron ). The operation completed in about 15 minutes (
360GB of used data on the file system ) with no errors. Everything
seemed fine until the next morning when a user went to query a table we
got a message about a "missing" file inside the pg cluster. We were
unable to query the table at all via psql. It was a bit of a panic
situation so we restored that table from backup immediately and the
problem was solved without doing more research.

This database has been running for years with no problem ( and none
since ), that was the first time I tried to do an on-line defrag and
that was the only unusual variable introduced into the system at that
time so it was a strong enough correlation for me to believe that caused
it. Hopefully this was just a corner case..

-Dan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Darren Duncan 2010-10-13 00:08:07 Re: SQL command to edit postgresql.conf, with comments
Previous Message Tom Lane 2010-10-12 23:49:28 Re: Review: Fix snapshot taking inconsistencies

Browse pgsql-performance by date

  From Date Subject
Next Message Ogden 2010-10-13 00:23:24 Re: Query much faster with enable_seqscan=0
Previous Message Neil Whelchel 2010-10-12 23:19:33 Re: Slow count(*) again...