Skip site navigation (1) Skip section navigation (2)

Re: Caching (was Re: choosing the right platform)

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Ron Mayer" <ron(at)intervideo(dot)com>,"Matthew Nuzum" <cobalt(at)bearfruit(dot)org>,"'Pgsql-Performance'" <pgsql-performance(at)postgresql(dot)org>
Cc: <ron(at)intervideo(dot)com>
Subject: Re: Caching (was Re: choosing the right platform)
Date: 2003-04-11 16:15:15
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance

> In a data warehouse we have here, we load about 3 million rows
> each week; load time improved from about 9 to 7 hours
> by breaking up such queries into expressions that only require
> one sort at a time, and surrounding the expressions with
> "set sort_mem=something_big" statements to give it enough
> space to not hit the disk.
>   SET SORT_MEM=300000;
>   CREATE TEMPORARY TABLE potential_new_values AS
>      SELECT DISTINCT val FROM import_table;
>   ...
>   SET SORT_MEM=1000;
> Anyone else have similar experience, or am I doing something
> wrong to need so much SORT_MEM?

No, this sounds very reasonable to me.  I do a similar operation on one of my 
systems as part of a nightly data transformation for reporting.  Since I 
haven't had to do those on tables over 150,000 rows, I haven't seen the kind 
of RAM usage you experience.

> Below is an example of another real-world query from the same
> reporting system that benefits from a sort_mem over 32M.
> Explain analyze (below) shows a 40% improvement by having
> the sort fit in memory.

Cool!  That's a perfect example of sizing sort_mem for the query.   Mind if I 
steal it for an article at some point?

Josh Berkus
Aglio Database Solutions
San Francisco

In response to


pgsql-performance by date

Next:From: Ron MayerDate: 2003-04-11 17:25:24
Subject: Re: Caching (was Re: choosing the right platform)
Previous:From: scott.marloweDate: 2003-04-11 15:14:00
Subject: Re: Caching (was Re: choosing the right platform)

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group