Re: Prefetch

From: Sam Vilain <sam(at)vilain(dot)net>
To: molson(at)oceanconsulting(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Prefetch
Date: 2005-05-11 04:33:50
Message-ID: 42818B2E.8060403@vilain.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Matt Olson wrote:
> Other databases like Oracle and DB2 implement some sort of row prefetch. Has
> there been serious consideration of implementing something like a prefetch
> subsystem? Does anyone have any opinions as to why this would be a bad idea
> for postgres?
> Postges is great for a multiuser environment and OLTP applications. However,
> in this set up, a data warehouse, the observed performance is not what I
> would hope for.

Oracle doesn't pre-fetch data to get its fast results in this case.
pre-fetching doesn't give you the 100 times speed increases.

Bitmap indexes are very important for data mining. You might want to see

http://www.it.iitb.ac.in/~rvijay/dbms/proj/

I have no idea how well developed this is, but this is often the biggest
win with Data Warehousing. If it works, you'll get results back in seconds,
if it doesn't you'll have plenty of time while your queries run to reflect on
the possibility that commercial databases might actually have important features
that haven't even penetrated the awareness of most free database developers.

Another trick you can use with large data sets like this when you want results
back in seconds is to have regularly updated tables that aggregate the data
along each column normally aggregated against the main data set.

Of couse, Pg doesn't have the nice features that make this just work and make
queries against your data source faster (called "OLAP Query rewrite" in
Oracle), so you'll have to put a lot of work into changing your application to
figure out when to use the summary tables. As far as I know it doesn't have
materialized views, either, so updating these summary tables is also a more
complex task than just a single REFRESH command.

Maybe some bright person will prove me wrong by posting some working
information about how to get these apparently absent features working.

You might also want to consider ditching RAID 5 and switching to plain
mirroring. RAID 5 is a helluva performance penalty (by design). This is
why they said RAID - fast, cheap, reliable - pick any two. RAID 5 ain't
fast. But that's probably not your main problem.

Sam.

>
> Regards,
>
> Matt Olson
> Ocean Consulting
> http://www.oceanconsulting.com/
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org

In response to

  • Prefetch at 2005-05-10 13:52:51 from Matt Olson

Responses

  • Re: Prefetch at 2005-05-11 04:53:05 from Christopher Kings-Lynne

Browse pgsql-performance by date

  From Date Subject
Next Message Neil Conway 2005-05-11 04:39:22 Re: Partitioning / Clustering
Previous Message Bruno Wolff III 2005-05-11 04:12:24 Re: Partitioning / Clustering