Quick Links

Re: Slow count(*) again...

From:	Samuel Gendler <sgendler(at)ideasculptor(dot)com>
To:	Scott Carey <scott(at)richrelevance(dot)com>
Cc:	Greg Smith <greg(at)2ndquadrant(dot)com>, Joshua Tolley <eggyknap(at)gmail(dot)com>, Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, "mladen(dot)gogala(at)vmsinfo(dot)com" <mladen(dot)gogala(at)vmsinfo(dot)com>, Neil Whelchel <neil(dot)whelchel(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Slow count(*) again...
Date:	2010-10-12 16:23:06
Message-ID:	AANLkTinwiEWnTLNJ7e++KdxynQyt4p2_s7Vq_PQtYbZ8@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-performance

On Tue, Oct 12, 2010 at 9:02 AM, Scott Carey <scott(at)richrelevance(dot)com>wrote:
>
>
> However, for large reporting queries and sequential scans, XFS will win in
> the long run if you use the online defragmenter. Otherwise, your sequential
> scans won't be all that sequential on any file system over time if your
> tables aren't written once, forever, serially. Parallel restore will
> result in a system that is fragmented -- ext4 will do best at limiting this
> on the restore, but only xfs has online defragmentation. We schedule ours
> daily and it noticeably improves sequential scan I/O.
>
>
Our reporting tables are written sequentially and left unmodified until
entire partitions are dropped. However, equivalent partitions tend to get a
little bit larger over time, so newer partitions won't necessarily fit into
the gaps left by prior partition drops, so it is possible that partitions
will be split into two sections, but should still be very sequential, if not
perfectly so. It would seem that we stumbled into an ideal architecture for
doing this kind of work - mostly by virtue of starting with 8.2.x and having
huge problems with autovacuum and vacuum taking forever and dragging the db
to halt, which caused us to move to an architecture which aggregates and
then drops older data in entire partitions instead of updating aggregates
individually and then deleting rows. Partitions are sized such that most
reporting queries run over entire partitions, too (which was completely
accidental since I had not yet delved into individual query optimization at
the time), so even though we are doing sequential scans, we at least run as
few of them as possible and are able to keep hot data in memory.

--sam

In response to

Re: Slow count(*) again... at 2010-10-12 16:02:39 from Scott Carey

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Kevin Grittner	2010-10-12 16:23:56	Re: [JDBC] Support for JDBC setQueryTimeout, et al.
Previous Message	Chris Browne	2010-10-12 16:03:57	Re: Slow count(*) again...

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Scott Carey	2010-10-12 16:35:46	Re: Slow count(*) again...
Previous Message	Chris Browne	2010-10-12 16:13:09	Re: large dataset with write vs read clients