Re: TB-sized databases

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: TB-sized databases
Date: 2007-11-30 08:40:19
Message-ID: 1196412019.4246.1476.camel@ebony.site
Lists: pgsql-performance
On Fri, 2007-11-30 at 17:41 +1100, Russell Smith wrote:
> Simon Riggs wrote:
> > On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote:
> >   
> >> Simon Riggs wrote:
> >>     
> >>> All of those responses have cooked up quite a few topics into one. Large
> >>> databases might mean text warehouses, XML message stores, relational
> >>> archives and fact-based business data warehouses.
> >>>
> >>> The main thing is that TB-sized databases are performance critical. So
> >>> it all depends upon your workload really as to how well PostgreSQL, or
> >>> any other RDBMS, can handle them.
> >>>
> >>>
> >>> Anyway, my reason for replying to this thread is that I'm planning
> >>> changes for PostgreSQL 8.4+ that will allow us to get bigger and
> >>> faster databases. If anybody has specific concerns then I'd like to hear
> >>> them so I can consider those things in the planning stages.
> >>>       
> >> It would be nice to do something with selects so we can retrieve a rowset
> >> on huge tables using indexed criteria without falling back to a full
> >> scan.
> >>
> >> In my opinion, by definition, a huge database sooner or later will have
> >> tables far bigger than the RAM available (same for their indexes). I think
> >> the queries need to be solved using indexes smart enough to be fast on disk.
> >>     
> >
> > OK, I agree with this one. 
> >
> > I'd thought that index-only plans were only for OLTP, but now I see they
> > can also make a big difference with DW queries. So I'm very interested
> > in this area now.
> >
> >   
> If that's true, then you want to get behind the work Gokulakannan
> Somasundaram has done in relation to thick indexes
> (http://archives.postgresql.org/pgsql-hackers/2007-10/msg00220.php).
> I would have thought that concept particularly useful in DW.  Only
> having to scan indexes on a number of join tables would be a huge win
> for some of these types of queries.

Hmm, well I proposed that in Jan/Feb, but I'm sure others have also.

I don't think it's practical to add visibility information to *all*
indexes, but I like Heikki's Visibility Map proposal much better.

> My tiny point of view would say that is a much better investment than 
> setting up the proposed parameter.  

They are different things entirely, with dissimilar dev costs also. We
can have both.

> I can see the use of the parameter 
> though. 

Good

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

