From: Jon Lewison <jlewison1(at)gmail(dot)com>
To: Dave Crooke <dcrooke(at)gmail(dot)com>
Cc: rama <rama(dot)rama(at)tiscali(dot)it>, pgsql-performance(at)postgresql(dot)org
Subject: Re: perf problem with huge table
Date: 2010-02-10 23:30:54
Message-ID: 7ac84c51002101530q59fa87c1w205fdf5243d9d69a@mail.gmail.com
Lists: pgsql-performance

On Wed, Feb 10, 2010 at 4:16 PM, Dave Crooke <dcrooke(at)gmail(dot)com> wrote:
> Hi Rama
>
> I'm actually looking at going in the other direction ....
>
> I have an app using PG where we have a single table where we just added a
> lot of data, and I'm ending up with many millions of rows, and I'm finding
> that the single table schema simply doesn't scale.
>
> In PG, table partitioning is only handled by the database for reads;
> for insert/update you need to do quite a lot of DIY (setting up triggers,
> etc.), so I am planning to just use named tables and generate the necessary
> DDL / DML in vanilla SQL the same way that your older code does.
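
The DIY routing you're describing usually looks something like the
sketch below on 8.x - inheritance child tables with CHECK constraints
plus an insert trigger on the parent. All table and column names here
are invented for illustration:

CREATE TABLE foo (
    id         bigint,
    created_at timestamptz NOT NULL
);

-- month-sized children; the CHECK constraints are what let the
-- planner skip irrelevant children on reads
CREATE TABLE foo_2010_01 (
    CHECK (created_at >= '2010-01-01' AND created_at < '2010-02-01')
) INHERITS (foo);

CREATE TABLE foo_2010_02 (
    CHECK (created_at >= '2010-02-01' AND created_at < '2010-03-01')
) INHERITS (foo);

-- route inserts aimed at the parent into the right child
CREATE OR REPLACE FUNCTION foo_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.created_at >= '2010-02-01' THEN
        INSERT INTO foo_2010_02 VALUES (NEW.*);
    ELSE
        INSERT INTO foo_2010_01 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- already routed, so skip the insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER foo_insert
    BEFORE INSERT ON foo
    FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();

Each new month means a new child table and an updated trigger
function, which is exactly the sort of generated DDL you mention.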
>
> My experience is mostly with Oracle, which is not MVCC, so I've had to
> relearn some stuff:
>
Just a nit, but Oracle does implement MVCC, as do most of the major
databases out there.
> - Oracle often answers simple queries (e.g. counts and max / min) using
> only the index, which is of course pre-sorted. PG has to go out and fetch
> the rows to see if they are still in scope, and if they are stored all over
> the place on disk it means an 8K random page fetch for each row. This means
> that adding an index to PG is not nearly the silver bullet that it can be
> with some non-MVCC databases.
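
To make that concrete, even queries like these have to touch the heap
for every candidate row, because PG keeps tuple visibility information
in the table pages rather than in the index (names invented again):

-- the index supplies candidate rows, but each one still costs a heap
-- fetch to check that the tuple is visible to our snapshot
SELECT max(created_at) FROM foo;
SELECT count(*) FROM foo WHERE created_at >= '2010-01-01';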
>
> - PG's indexes seem to be quite a bit larger than Oracle's, but that's gut
> feel, I haven't been doing true comparisons ... however, for my app I have
> limited myself to only two indexes on that table, and each index is larger
> (in disk space) than the table itself ... I have 60GB of data and 140GB of
> indexes :-)
>
> - There is a lot of row turnover in my big table (I age out data) .... a
> big delete (millions of rows) in PG seems a bit more expensive to process
> than in Oracle; however, PG is not nearly as sensitive to transaction sizes
> as Oracle is, so you can cheerfully throw out one big "DELETE from FOO where
> ..." and let the database chew on it.
I find partitioning pretty useful in this scenario if the data allows
it. Aging out data just means dropping a partition rather than running
a big delete statement.
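
With an inheritance setup like the sketch above, retiring the oldest
month is one nearly instantaneous DDL statement instead of a
multi-million-row delete plus vacuum:

-- drops the child and its indexes outright: no dead rows left behind,
-- nothing for vacuum to clean up afterwards
DROP TABLE foo_2010_01;

-- reads only prune children when constraint exclusion is enabled;
-- it is off by default in older releases, so check your setting
SET constraint_exclusion = on;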