Skip site navigation (1) Skip section navigation (2)

Re: database size growing continously

From: Jeremy Harris <jgh(at)wizmail(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: database size growing continously
Date: 2009-10-30 20:18:45
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
On 10/30/2009 08:01 PM, Greg Stark wrote:
> On Fri, Oct 30, 2009 at 12:53 PM, Anj Adu<fotographs(at)gmail(dot)com>  wrote:
>> Any relational database worth its salt has partitioning for a reason.
>> 1. Maintenance.  You will need to delete data at some
>> point.(cleanup)...Partitions are the only way to do it effectively.
> This is true and it's unavoidably a manual process. The database will
> not know what segments of the data you intend to load and unload en
> masse.
>> 2. Performance.  Partitioning offer a way to query smaller slices of
>> data automatically (i.e the query optimizer will choose the partition
>> for you) ...very large tables are a no-no in any relational
>> database.(sheer size has limitations)
> This I dispute. Databases are designed to be scalable and very large
> tables should perform just as well as smaller tables.
> Where partitions win for performance is when you know something about
> how your data is accessed and you can optimize the access by
> partitioning along the same keys. For example if you're doing a
> sequential scan of just one partition or doing a merge join of two
> equivalently partitioned tables and the partitions can be sorted in
> memory.
> However in these cases it is possible the database will become more
> intelligent and be able to achieve the same performance gains
> automatically. Bitmap index scans should perform comparably to the
> sequential scan of individual partitions for example.

So, on the becoming more intelligent front:  PostgreSQL already does
some operations as background maintenance (autovacuum).  Extending
this to de-bloat indices does not seem conceptually impossible, nor for
the collection of table-data statistics for planner guidance (also, why
could a full-table-scan not collect stats as a side-effect?).  Further out,
how about the gathering of statistics on queries to guide the automatic
creation of indices?  Or to set up a partitioning scheme on a previously
monolithic table?

- Jeremy

In response to


pgsql-performance by date

Next:From: Greg StarkDate: 2009-10-30 22:16:44
Subject: Re: database size growing continously
Previous:From: Anj AduDate: 2009-10-30 20:11:06
Subject: Re: database size growing continously

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group