From: | Manfred Koizar <mkoi-pg(at)aon(dot)at> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [GENERAL] Large DB |
Date: | 2004-04-03 00:40:39 |
Message-ID: | 8ftr60l1ebgcable559ogr2tlb6nuujllq@email.aon.at |
Lists: | pgsql-general pgsql-hackers |
On Fri, 02 Apr 2004 18:06:12 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>You should not need to use the Vitter algorithm for the block-level
>selection, since you can know the number of blocks in the table in
>advance. You can just use the traditional method of choosing each block
>or not with probability (k/K), where k = number of sample blocks still
>needed, K = number of blocks from here to the end.
Sounds reasonable. I have to play around a bit more to get a feeling
for where the Vitter method becomes more efficient.
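A minimal Python sketch of the traditional method described in the quoted text, for illustration only. The function name `select_blocks` and its parameters are hypothetical and are not taken from any PostgreSQL source. It assumes the total number of blocks is known in advance, and it visits blocks in order, keeping each with probability k/K.

```python
import random


def select_blocks(total_blocks, sample_blocks, rng=None):
    """Pick block numbers by sequential selection sampling.

    Hypothetical helper for illustration: each block is chosen with
    probability k/K, where k is the number of sample blocks still needed
    and K is the number of blocks from here to the end.
    """
    rng = rng or random.Random()
    needed = min(sample_blocks, total_blocks)  # k
    chosen = []
    for block in range(total_blocks):
        if needed == 0:
            break  # sample is complete, no need to look further
        remaining = total_blocks - block  # K
        # random() is in [0, 1), so this is true with probability k/K
        # and always true once k == K.
        if rng.random() * remaining < needed:
            chosen.append(block)
            needed -= 1
    return chosen


if __name__ == "__main__":
    print(select_blocks(total_blocks=1000, sample_blocks=30))
```

Because the probability reaches 1 as soon as k equals K, the sketch always returns exactly the requested number of blocks (or every block, if the table is smaller than the sample), and the block numbers come out in ascending order.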
> You'd run the Vitter
>algorithm separately to decide whether to keep or discard each live row
>you find in the blocks you read.
You mean once a block is sampled we inspect it in any case? This was
not the way I had planned to do it, but I'll keep this idea in mind.
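For the row-level step, here is a hedged sketch of plain reservoir sampling (often called Algorithm R). This is the simple textbook variant, not Vitter's optimized algorithm, which skips over rows instead of drawing a random number for every row. The name `reservoir_sample` and its parameters are assumptions made for illustration.

```python
import random


def reservoir_sample(rows, sample_size, rng=None):
    """Keep a uniform random sample of rows without knowing the row count.

    Hypothetical helper for illustration: simple reservoir sampling,
    one random draw per row after the reservoir has filled up.
    """
    rng = rng or random.Random()
    reservoir = []
    for seen, row in enumerate(rows, start=1):
        if len(reservoir) < sample_size:
            # The first sample_size rows fill the reservoir directly.
            reservoir.append(row)
        else:
            # Keep the new row with probability sample_size / seen,
            # replacing a uniformly chosen earlier entry.
            slot = rng.randrange(seen)
            if slot < sample_size:
                reservoir[slot] = row
    return reservoir


if __name__ == "__main__":
    live_rows = range(100000)  # stand-in for live rows found in sampled blocks
    print(reservoir_sample(live_rows, sample_size=10))
```

The point of a reservoir here is that the number of live rows in the sampled blocks is not known up front, unlike the number of blocks in the table.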
>Question: if the table size is less than N blocks, are you going to read
>every block or try to reduce the number of blocks sampled?
Don't know yet.
>people are setting the stats target to 100 which means a sample size of
>30000 --- how do the page-access counts look in that case?
 rel size | page reads
----------+------------
      300 |        300
     3000 |       3000
     5000 |       4999
      10K |       9.9K
      30K |      25.8K
     300K |        85K
       1M |       120K
      10M |       190K
     100M |       260K
       1G |       330K
This is exactly the table I posted before (for sample size 3000) with
every entry multiplied by 10. Well, not quite exactly, but the
differences show up only far to the right of the decimal point. So for
our purposes, for a given relation size the number of pages accessed is
proportional to the sample size.
Servus
Manfred