Skip site navigation (1) Skip section navigation (2)

Re: seq scan cache vs. index cache smackdown

From: "Iain" <iain(at)mst(dot)co(dot)jp>
To: <pgsql-performance(at)postgresql(dot)org>,"Mark Aufflick" <mark(at)pumptheory(dot)com>
Subject: Re: seq scan cache vs. index cache smackdown
Date: 2005-02-15 03:55:33
Message-ID: 00a001c51312$35fb5510$7201a8c0@mst1x5r347kymb (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

I think there was some discussion about seq scans messing up the cache, and 
talk about doing something about it but I don't think it has been addressed 
yet. Maybe worth a troll through the archives.

It is certainly true that in many situations, a seq scan is preferable to 
using an index. I have been testing a situation here on two versions of the 
same database, one of the databases is much bigger than the other 
(artificially bloated for testing purposes). Some of the query plans change 
to use seq scans on the big database, where they used indexes on the little 
database - but either way, in *single user* testing the performance is fine. 
My concern is that this kind of testing has very little relevance to the 
real world of multiuser processing where contention for the cache becomes an 
issue.  It may be that, at least in the current situation, postgres is 
giving too much weight to seq scans based on single user, straight line 
performance comparisons. If your assumption is correct, then addressing that 
might help, though it's bound to have it's compromises too...

regards
Iain




----- Original Message ----- 
From: "Mark Aufflick" <mark(at)pumptheory(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Sent: Tuesday, February 15, 2005 8:34 AM
Subject: [PERFORM] seq scan cache vs. index cache smackdown


> Hi All,
>
> I have boiled my situation down to the following simple case: (postgres 
> version 7.3)
>
> * Query 1 is doing a sequential scan over a table (courtesy of field ILIKE 
> 'foo%') and index joins to a few others
> * Query 2 is doing a functional index scan over the same table 
> (lower(field) LIKE 'foo%') and index joins to a few others
> * neither query has an order by clause
> * for the purpose of testing, both queries are designed to return the same 
> result set
>
> Obviously Q2 is faster than Q1, but if I ever run them both at the same 
> time (lets say I run two of Q1 and one of Q2 at the same time) then Q2 
> consistently returns WORSE times than Q1 (explain analyze confirms that it 
> is using the index).
>
> My assumption is that the sequential scan is blowing the index from any 
> cache it might live in, and simultaneously stealing all the disk IO that 
> is needed to access the index on disk (the table has 200,000 rows).
>
> If I simplify the case to not do the index joins (ie. operate on the one 
> table only) the situation is not as dramatic, but similar.
>
> My thoughts are:
>
> 1) kill the sequential scan - but unfortunately I don't have direct 
> control over that code
> 2) change the way the server allocates/prioritizes different caches - i 
> don't know enough about how postgres caches work to do this (if it's 
> possible)
> 3) try it on postgres 7.4 - possible, but migrating the system to 7.4 in 
> production will be hard because the above code that I am not responsible 
> for has a lot of (slightly wacky) implicit date casts
> 4) ask the fine people on the mailing list for other suggestions!
> -- 
> Mark Aufflick
>   e  mark(at)pumptheory(dot)com
>   w  www.pumptheory.com (work)
>   w  mark.aufflick.com (personal)
>   p  +61 438 700 647
>   f  +61 2 9436 4737
>
>
> ========================================================================
> iBurst Wireless Broadband from $34.95/month   www.platformnetworks.net
> Forward undetected SPAM to:                   spam(at)mailsecurity(dot)net(dot)au
> ========================================================================
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings 


In response to

Responses

pgsql-performance by date

Next:From: Rod TaylorDate: 2005-02-15 04:20:51
Subject: Re: seq scan cache vs. index cache smackdown
Previous:From: IainDate: 2005-02-15 02:31:14
Subject: Re: VACCUM FULL ANALYZE PROBLEM

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group