Skip site navigation (1) Skip section navigation (2)

Re: Question about disk IO an index use and seeking advice

From: PFC <lists(at)peufeu(dot)com>
To: "Nikolas Everett" <nik9000(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Question about disk IO an index use and seeking advice
Date: 2008-04-24 20:02:40
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
> An index scan looks through the index and pulls in each pages as it sees  
> it.
> A bitmap index scan looks through the index and makes a sorted list of  
> all
> the pages it needs and then the bitmap heap scan reads all the pages.
> If your data is scattered then you may as well do the index scan, but if
> your data is sequential-ish then you should do the bitmap index scan.
> Is that right?  Where can I learn more?  I've read

	That's about it, yes.
	If your bitmap has large holes, it will seek, but if it has little holes,  
readahead will work. Hence, fast, and good.
	On indexscan, readahead doesn't help since the hits are pretty random. If  
you have N rows in the index with the same date, in which order whill they  
get scanned ? There is no way to know that, and no way to be sure this  
order corresponds to physical order on disk.

> About clustering:  I know that CLUSTER takes an exclusive lock on the
> table.  At present, users can query the table at any time, so I'm not
> allowed to take an exclusive lock for more than a few seconds.

	Then, CLUSTER is out.

> Could I
> achieve the same thing by creating a second copy of the table and then
> swapping the first copy out for the second?  I think something like that
> would fit in my time frames

	If the archive table is read-only, then yes, you can do this.
> About partitioning:  I can definitely see how having the data in more
> manageable chunks would allow me to do things like clustering.  It will
> definitely make vacuuming easier.
> About IO speeds:  The db is always under some kind of load.  I actually  
> get
> scared if the load average isn't at least 2.  Could I try to run  
> something
> like bonnie++ to get some real load numbers?  I'm sure that would cripple
> the system while it is running, but if it only takes a few seconds that
> would be ok.
> There were updates running while I was running the test.  The WAL log is  
> on
> the hardware raid 10.  Moving it from the software raid 5 almost doubled  
> our
> insert performance.

	Normal ; fsync on a RAID5-6 is bad, bad.
	You have battery backed up cache ?

> Thanks again,
> --Nik

In response to

pgsql-performance by date

Next:From: Marc G. FournierDate: 2008-04-24 22:00:11
Subject: Re: off-topic: SPAM
Previous:From: Alvaro HerreraDate: 2008-04-24 18:24:04
Subject: Re: off-topic: SPAM

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group