Re: Warm-cache prefetching

From: "Min Xu (Hsu)" <xu(at)cs(dot)wisc(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Warm-cache prefetching
Date: 2005-12-09 04:46:54
Message-ID: 20051209044649.GF2189@cs.wisc.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Perhaps because P4 is already doing H/W prefetching?

http://www.tomshardware.com/2000/11/20/intel/page5.html

I ran the test program on an opteron 2.2G:

% ./a.out 10 16
Sum: -951304192: with prefetch on - duration: 81.166 ms
Sum: -951304192: with prefetch off - duration: 79.769 ms
Sum: -951304192: with prefetch on - duration: 81.173 ms
Sum: -951304192: with prefetch off - duration: 79.718 ms
Sum: -951304192: with prefetch on - duration: 83.293 ms
Sum: -951304192: with prefetch off - duration: 79.731 ms
Sum: -951304192: with prefetch on - duration: 81.227 ms
Sum: -951304192: with prefetch off - duration: 79.851 ms
Sum: -951304192: with prefetch on - duration: 81.003 ms
Sum: -951304192: with prefetch off - duration: 79.724 ms
Sum: -951304192: with prefetch on - duration: 81.084 ms
Sum: -951304192: with prefetch off - duration: 79.728 ms
Sum: -951304192: with prefetch on - duration: 81.009 ms
Sum: -951304192: with prefetch off - duration: 79.723 ms
Sum: -951304192: with prefetch on - duration: 81.074 ms
Sum: -951304192: with prefetch off - duration: 79.719 ms
Sum: -951304192: with prefetch on - duration: 81.188 ms
Sum: -951304192: with prefetch off - duration: 79.724 ms
Sum: -951304192: with prefetch on - duration: 81.075 ms
Sum: -951304192: with prefetch off - duration: 79.719 ms

Got slowdown.

I ran it on a PIII 650M:

% ./a.out 10 16
Sum: -951304192: with prefetch on - duration: 284.952 ms
Sum: -951304192: with prefetch off - duration: 291.439 ms
Sum: -951304192: with prefetch on - duration: 290.690 ms
Sum: -951304192: with prefetch off - duration: 299.692 ms
Sum: -951304192: with prefetch on - duration: 295.287 ms
Sum: -951304192: with prefetch off - duration: 290.992 ms
Sum: -951304192: with prefetch on - duration: 285.116 ms
Sum: -951304192: with prefetch off - duration: 294.127 ms
Sum: -951304192: with prefetch on - duration: 286.986 ms
Sum: -951304192: with prefetch off - duration: 291.001 ms
Sum: -951304192: with prefetch on - duration: 283.233 ms
Sum: -951304192: with prefetch off - duration: 401.910 ms
Sum: -951304192: with prefetch on - duration: 297.021 ms
Sum: -951304192: with prefetch off - duration: 307.814 ms
Sum: -951304192: with prefetch on - duration: 287.201 ms
Sum: -951304192: with prefetch off - duration: 303.870 ms
Sum: -951304192: with prefetch on - duration: 286.962 ms
Sum: -951304192: with prefetch off - duration: 352.779 ms
Sum: -951304192: with prefetch on - duration: 283.245 ms
Sum: -951304192: with prefetch off - duration: 294.422 ms

looks like some speedup to me.

On Thu, 08 Dec 2005 Qingqing Zhou wrote :
>
> I found an interesting paper improving index speed by prefetching memory
> data to L1/L2 cache here (there is discussion about prefetching disk
> data to memory several days ago "ice-breaker thread"):
> http://www.cs.cmu.edu/~chensm/papers/index_pf_final.pdf
>
> Also related technique used to speedup memcpy:
> http://people.redhat.com/arjanv/pIII.c
>
> I wonder if we could use it to speed up in-memory scan opertion for heap
> or index. Tom's patch has made scan can handle a page (vs. row) every
> time, which is a basis for this optimization.
>
> I write a program try to simulate it, but I am not good at micro
> optimization, and I just get a very weak but kind-of-stable improvement. I
> wonder if any people here are interested to take a look.
>
> Regards,
> Qingqing
>
> ----------------------------------
>
> Test results
> --------------
> Cache line size: 64
> CPU: P4 2.4G
> $#./prefetch 10 16
> Sum: -951304192: with prefetch on - duration: 42.163 ms
> Sum: -951304192: with prefetch off - duration: 42.838 ms
> Sum: -951304192: with prefetch on - duration: 44.044 ms
> Sum: -951304192: with prefetch off - duration: 42.792 ms
> Sum: -951304192: with prefetch on - duration: 42.324 ms
> Sum: -951304192: with prefetch off - duration: 42.803 ms
> Sum: -951304192: with prefetch on - duration: 42.189 ms
> Sum: -951304192: with prefetch off - duration: 42.801 ms
> Sum: -951304192: with prefetch on - duration: 42.155 ms
> Sum: -951304192: with prefetch off - duration: 42.827 ms
> Sum: -951304192: with prefetch on - duration: 42.179 ms
> Sum: -951304192: with prefetch off - duration: 42.798 ms
> Sum: -951304192: with prefetch on - duration: 42.180 ms
> Sum: -951304192: with prefetch off - duration: 42.804 ms
> Sum: -951304192: with prefetch on - duration: 42.193 ms
> Sum: -951304192: with prefetch off - duration: 42.827 ms
> Sum: -951304192: with prefetch on - duration: 42.164 ms
> Sum: -951304192: with prefetch off - duration: 42.810 ms
> Sum: -951304192: with prefetch on - duration: 42.182 ms
> Sum: -951304192: with prefetch off - duration: 42.826 ms
>
> Test program
> ----------------
>
> /*
> * prefetch.c
> * PostgreSQL warm-cache sequential scan simulator with prefetch
> */
>
> #include <stdio.h>
> #include <stdlib.h>
> #include <memory.h>
> #include <errno.h>
> #include <sys/time.h>
>
> typedef char bool;
> #define true ((bool) 1)
> #define false ((bool) 0)
>
> #define BLCKSZ 8192
> #define CACHESZ 64
> #define NBLCKS 5000
>
> int sum;
>
> int
> main(int argc, char *argv[])
> {
> int i, rounds;
> char *blocks;
> int cpu_cost;
>
> if (argc != 3)
> {
> fprintf(stderr, "usage: prefetch <rounds> <cpu_cost [1, 16]>\n");
> exit(-1);
> }
>
> rounds = atoi(argv[1]);
> cpu_cost = atoi(argv[2]);
> if (cpu_cost > 16)
> exit(-1);
>
> for (i = 0; i < 2*rounds; i++)
> {
> int j, k;
> struct timeval start_t, stop_t;
> bool enable = i%2?false:true;
> char *blck;
>
> blocks = (char *)malloc(BLCKSZ*NBLCKS);
> memset(blocks, 'a', BLCKSZ*NBLCKS);
>
> sum = 0;
> gettimeofday(&start_t, NULL);
>
> for (j = 0; j < NBLCKS; j++)
> {
> blck = blocks + j*BLCKSZ;
> for (k=0; k < BLCKSZ; k+=CACHESZ)
> {
> int *s = (int *)(blck + k);
> int u = cpu_cost;
>
> if (enable)
> {
> /* prefetch ahead */
> __asm__ __volatile__ (
> "1: prefetchnta 128(%0)\n"
> : : "r" (s) : "memory" );
> }
>
> /* pretend to process current tuple */
> while (u--) sum += (*(s+u))*(*(s+u));
> }
> }
> gettimeofday(&stop_t, NULL);
>
> free(blocks);
>
> /* measure the time */
> if (stop_t.tv_usec < start_t.tv_usec)
> {
> stop_t.tv_sec--;
> stop_t.tv_usec += 1000000;
> }
> fprintf (stdout, "Sum: %d: with prefetch %s - duration: %ld.%03ld ms\n",
> sum,
> enable?"on":"off",
> (long) ((stop_t.tv_sec - start_t.tv_sec) * 1000 +
> (stop_t.tv_usec - start_t.tv_usec) / 1000),
> (long) (stop_t.tv_usec - start_t.tv_usec) % 1000);
>
> }
>
> exit(0);
> }
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Luke Lonergan 2005-12-09 04:54:05 Re: Warm-cache prefetching
Previous Message Qingqing Zhou 2005-12-09 04:07:38 Warm-cache prefetching