Re: Index Scans become Seq Scans after VACUUM ANALYSE

From: Mark Pritchard <mark(at)tangent(dot)net(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Luis Alberto Amigo Navarro <lamigo(at)atc(dot)unican(dot)es>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date: 2002-04-18 01:49:03
Message-ID: 1019094543.25576.113.camel@set.tangent.net.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I threw together the attached program (compiles fine with gcc 2.95.2 on
Solaris 2.6 and egcs-2.91.66 on RedHat Linux 6.2) and ran it a few
times. Data is below. Usual disclaimers about hastily written code etc
:)

Machine = ghoul (generic intel, 384mb ram, dual p3-800, ide disk running
dma)

Sequential
Bytes Read Time Bytes / Sec
536870912 27.14 19783933.74
536870912 27.14 19783990.60
536870912 27.11 19801872.14
536870912 26.92 19942928.41
536870912 27.31 19657408.43
19794026.66 (avg)

Random
Bytes Read Time Bytes / Sec
1073741824 519.57 2066589.21
1073741824 517.78 2073751.44
1073741824 516.92 2077193.23
1073741824 513.18 2092333.29
1073741824 510.68 2102579.88
2082489.41 (avg)

Machine = jedi (Sun E420, 3gb ram, dual 400s, test on single scsi disk)

Sequential
Bytes Read Time Bytes / Sec
2097152000 65.19 32167675.28
2097152000 65.22 32154114.65
2097152000 65.16 32182561.99
2097152000 65.12 32206105.12
2097152000 64.67 32429463.26
32227984.06 (avg)

Random
Bytes Read Time Bytes / Sec
4194304000 1522.22 2755394.79
4194304000 278.18 15077622.05
4194304000 91.43 45874730.07
4194304000 61.43 68273795.19
4194304000 54.55 76890231.51
41774354.72

If I interpret Tom's "divide" instruction correctly, is that a factor of
10 on the linux box?

On Thu, 2002-04-18 at 01:16, Tom Lane wrote:
> "Luis Alberto Amigo Navarro" <lamigo(at)atc(dot)unican(dot)es> writes:
> > On my own few experience I think this could be solved decreasing
> > random_page_cost, if you would prefer to use indexes than seq scans, then
> > you can lower random_page_cost to a point in which postgres works as you
> > want. So the planner would prefer indexes when in standard conditions it
> > would prefer seq scans.
>
> It's entirely possible that the default value of random_page_cost is too
> high, at least for many modern machines. The experiments I did to get
> the 4.0 figure were done a couple years ago, on hardware that wasn't
> exactly new at the time. I have not heard of anyone else trying to
> measure it though.
>
> I don't think I have the source code I used anymore, but the principle
> is simple enough:
>
> 1. Make a large file (several times the size of your machine's RAM, to
> ensure you swamp out kernel disk buffering effects). Fill with random
> data. (NB: do not fill with zeroes, some filesystems optimize this away.)
>
> 2. Time reading the file sequentially, 8K per read request.
> Repeat enough to get a statistically trustworthy number.
>
> 3. Time reading randomly-chosen 8K pages from the file. Repeat
> enough to get a trustworthy number (the total volume of pages read
> should be several times the size of your RAM).
>
> 4. Divide.
>
> The only tricky thing about this is making sure you are measuring disk
> access times and not being fooled by re-accessing pages the kernel still
> has cached from a previous access. (The PG planner does try to account
> for caching effects, but that's a separate estimate; the value of
> random_page_cost isn't supposed to include caching effects.) AFAIK the
> only good way to do that is to use a large test, which means it takes
> awhile to run; and you need enough spare disk space for a big test file.
>
> It'd be interesting to get some numbers for this across a range of
> hardware, filesystems, etc ...
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>

Attachment Content-Type Size
rndpgcst.c text/x-c 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Pritchard 2002-04-18 01:51:38 Re: Index Scans become Seq Scans after VACUUM ANALYSE
Previous Message Bruce Momjian 2002-04-18 01:41:07 Re: compile bug in HEAD?