From: Jon Lewison <jlewison1(at)gmail(dot)com>
To: Dave Crooke <dcrooke(at)gmail(dot)com>
Cc: rama <rama(dot)rama(at)tiscali(dot)it>, pgsql-performance(at)postgresql(dot)org
Subject: Re: perf problem with huge table
Date: 2010-02-10 23:30:54
Message-ID: 7ac84c51002101530q59fa87c1w205fdf5243d9d69a@mail.gmail.com
Lists: pgsql-performance

On Wed, Feb 10, 2010 at 4:16 PM, Dave Crooke <dcrooke(at)gmail(dot)com> wrote:
> Hi Rama
>
> I'm actually looking at going in the other direction ....
>
> I have an app using PG where we have a single table where we just added a
> lot of data, and I'm ending up with many millions of rows, and I'm finding
> that the single table schema simply doesn't scale.
>
> In PG, table partitioning is only handled by the database for reads;
> for insert/update you need to do quite a lot of DIY (setting up triggers,
> etc.), so I am planning to just use named tables and generate the necessary
> DDL / DML in vanilla SQL the same way that your older code does.
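
The DIY routing you're describing usually looks something like the
sketch below on 8.x - inheritance child tables with CHECK constraints
plus an insert trigger on the parent. All table and column names here
are invented for illustration:

CREATE TABLE foo (
    id         bigint,
    created_at timestamptz NOT NULL
);

-- month-sized children; the CHECK constraints are what let the
-- planner skip irrelevant children on reads
CREATE TABLE foo_2010_01 (
    CHECK (created_at >= '2010-01-01' AND created_at < '2010-02-01')
) INHERITS (foo);

CREATE TABLE foo_2010_02 (
    CHECK (created_at >= '2010-02-01' AND created_at < '2010-03-01')
) INHERITS (foo);

-- route inserts aimed at the parent into the right child
CREATE OR REPLACE FUNCTION foo_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.created_at >= '2010-02-01' THEN
        INSERT INTO foo_2010_02 VALUES (NEW.*);
    ELSE
        INSERT INTO foo_2010_01 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- already routed, so skip the insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_insert
    BEFORE INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();

Each new month means a new child table and an updated trigger
function, which is exactly the sort of generated DDL you mention.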
>
> My experience is mostly with Oracle, which is not MVCC, so I've had to
> relearn some stuff:
>
Just a nit, but Oracle does implement MVCC, as do most of the major
databases out there.
> - Oracle often answers simple queries (e.g. counts and max / min) using
> only the index, which is of course pre-sorted. PG has to go out and fetch
> the rows to see if they are still in scope, and if they are stored all over
> the place on disk it means an 8K random page fetch for each row. This means
> that adding an index to PG is not nearly the silver bullet that it can be
> with some non-MVCC databases.
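
To make that concrete, even queries like these have to touch the heap
for every candidate row, because PG keeps tuple visibility information
in the table pages rather than in the index (names invented again):

-- the index supplies candidate rows, but each one still costs a heap
-- fetch to check that the tuple is visible to our snapshot
SELECT max(created_at) FROM foo;
SELECT count(*) FROM foo WHERE created_at >= '2010-01-01';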
>
> - PG's indexes seem to be quite a bit larger than Oracle's, but that's gut
> feel, I haven't been doing true comparisons ... however, for my app I have
> limited myself to only two indexes on that table, and each index is larger
> (in disk space) than the table itself ... I have 60GB of data and 140GB of
> indexes :-)
>
> - There is a lot of row turnover in my big table (I age out data) .... a
> big delete (millions of rows) in PG seems a bit more expensive to process
> than in Oracle; however, PG is not nearly as sensitive to transaction sizes
> as Oracle is, so you can cheerfully throw out one big "DELETE from FOO where
> ..." and let the database chew on it.
I find partitioning pretty useful in this scenario if the data allows
it. Aging out data just means dropping a partition rather than running
a big delete statement.
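
With an inheritance setup like the sketch above, retiring the oldest
month is one nearly instantaneous DDL statement instead of a
multi-million-row delete plus vacuum:

-- drops the child and its indexes outright: no dead rows left behind,
-- nothing for vacuum to clean up afterwards
DROP TABLE foo_2010_01;

-- reads only prune children when constraint exclusion is enabled;
-- it is off by default in older releases, so check your setting
SET constraint_exclusion = on;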