Re: Join slow on "large" tables

From: Josué Maldonado <josue(at)lamundial(dot)hn>
To: Scott Marlowe <smarlowe(at)qwest(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join slow on "large" tables
Date: 2004-06-08 14:36:02
Message-ID: 40C5CED2.3020002@lamundial.hn
Lists: pgsql-performance

Hello Scott,

On 07/06/2004 4:47 PM, Scott Marlowe wrote in his message:

> OK, you have to ask yourself a question here. Do I have enough memory
> to let both postgresql and the kernel cache this data, or enough
> memory for only one? Then you pick one and try it out. But there are
> some issues here. PostgreSQL's shared buffers are not, and should not
> generally be, thought of as "cache". A cache's job is to hold the whole
> working set, or as much as possible, ready for access. A buffer's job
> is to hold all the data we're tossing around right this second. Once
> we're done with the data, the buffers can and do just drop whatever was
> in them. PostgreSQL does not have caching, in the classical sense.
> That may or may not change.
>
> The kernel, on the other hand, has both cache and buffer. Ever notice
> that a Linux top shows the cache usually being much bigger than the
> buffers? My 512 Meg home box right now has 252968k for cache, and
> 43276k for buffers.

I noticed buffers are much lower than cache, at least as top shows;
correct me if I'm wrong:

8:28am up 1:00, 2 users, load average: 0.40, 0.97, 0.75
65 processes: 64 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.0% user, 0.1% system, 0.0% nice, 99.4% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069596K av, 1882228K used, 187368K free, 0K shrd, 32924K buff
Swap: 2096440K av, 0K used, 2096440K free, 1757220K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
1508 root 13 0 1040 1040 856 R 0.1 0.0 0:00 top
1 root 8 0 476 476 432 S 0.0 0.0 0:04 init

> Now, you're tossing around enough data to actually maybe have a use for
> a huge set of buffers, but this means you'll need to starve your cache
> to get enough buffers. Which means that if one process does this kind
> of join, drops connection, and two seconds later, another process
> connects and does nearly the same thing, it's likely to have to read it
> all from the hard drives again, as it's not in the postgresql buffer,
> and not in the kernel cache.
>
> Starting a separate connection, doing a simple select * from table1;
> select * from table2; dropping the result set returned, and staying
> connected seems to be enough to get 7.4 to hold onto the data.
>
> PostgreSQL's current buffer management algorithm is dirt simple. The
> ones in the kernel's cache are quite good. So you can quickly reach a
> point where PostgreSQL is chasing its tail where the kernel would have
> done OK.
>
> Your numbers show that you are tossing 659M and 314M against each other,
> but I don't know if you're harvesting the whole set at once, or just a
> couple rows of each. Would indexing help, or is this always gonna be a
> big seq scan of 90% of both tables?

Generally only a small set is queried; the biggest record set expected is
about 24,000 rows and does not exceed 10MB. Explain analyze shows the
planner is using the index as expected, but performance is still poor.

> If you are getting the whole thing all the time, and want postgresql to
> buffer the whole thing (I recommend against it, although a very few
> circumstances seem to support it) you need to have 973M of buffer. That
> would be 124544 or we'll just call it 130000. This high of a number
> means you will be getting more than 50% of the RAM for postgreSQL. At
> that point, it seems you might as well go for broke and grab most of it,
> ~200000 or so.
>
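
The 124544 figure above follows from PostgreSQL's default 8 KB buffer
page; a quick sketch of that arithmetic (the page size is an assumption
from the default BLCKSZ, not stated in the message):

```python
# Arithmetic behind the shared_buffers figure quoted above, assuming
# PostgreSQL's default 8 KB buffer page (BLCKSZ = 8192).
working_set_mb = 659 + 314          # the two tables, per the message
pages = working_set_mb * 1024 // 8  # 8 KB pages per MB of data
print(pages)                        # 124544, rounded up to ~130000
```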
> If you're not always mushing the two things against each other, and
> you've got other datasets to interact with, index it.
>
> Oh, in your reply you might want to include an explain analyze of the
> query, and maybe an output of top while the query is running.
>

dbmund=# explain analyze select * from vkardex where kprocode='1013';
Nested Loop (cost=0.00..2248.19 rows=403 width=114) (actual
time=846.318..16030.633 rows=3145 loops=1)
-> Index Scan using pkd_pcode_idx on pkardex (cost=0.00..806.27
rows=403 width=72) (actual time=0.054..87.393 rows=3544 loops=1)
Index Cond: ((pkd_pcode)::text = '1013'::text)
-> Index Scan using pdc_pk_idx on pmdoc (cost=0.00..3.55 rows=1
width=50) (actual time=4.482..4.484 rows=1 loops=3544)
Index Cond: (pmdoc.pdc_pk = "outer".doctofk)
Total runtime: 16033.807 ms
(6 rows)
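
Reading the plan's own numbers suggests where the time goes: the inner
index scan on pmdoc runs once per outer row (loops=3544) at ~4.484 ms
each, which by itself accounts for nearly all of the 16033 ms total.
A rough check of that accounting:

```python
# Rough accounting for the nested loop plan above: per-loop actual
# time multiplied by the number of inner loops.
inner_loops = 3544    # loops=1 outer scan produced 3544 rows
ms_per_loop = 4.484   # actual time per inner index scan on pmdoc
print(inner_loops * ms_per_loop)  # ~15891 ms of the 16033 ms total
```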

At the time the query was running, top returned:
5:11pm up 1:28, 3 users, load average: 0.19, 0.97, 1.41
69 processes: 66 sleeping, 1 running, 2 zombie, 0 stopped
CPU0 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU1 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
CPU2 states: 0.1% user, 0.4% system, 0.0% nice, 98.4% idle
CPU3 states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle
Mem: 2069596K av, 1477784K used, 591812K free, 0K shrd, 2336K buff
Swap: 2096440K av, 9028K used, 2087412K free, 1388372K cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
1225 postgres 17 0 257M 257M 255M S 0.6 12.7 7:14 postmaster
1978 postgres 11 0 1044 1044 860 R 0.2 0.0 0:00 top
1 root 9 0 472 444 428 S 0.0 0.0 0:04 init
2 root 8 0 0 0 0 SW 0.0 0.0 0:00 keventd

and free returned:
/root: free
total used free shared buffers cached
Mem: 2069596 1477832 591764 0 2320 1388372
-/+ buffers/cache: 87140 1982456
Swap: 2096440 9028 2087412
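
For what it's worth, free's "-/+ buffers/cache" line is just the Mem
line with kernel buffers and page cache subtracted out, i.e. what
applications actually hold; checking it against the numbers above:

```python
# Deriving free's "-/+ buffers/cache" used column from the Mem line:
# used minus kernel buffers minus page cache.
used, buffers, cached = 1477832, 2320, 1388372
app_used = used - buffers - cached
print(app_used)  # 87140, matching the -/+ buffers/cache column
```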

Thanks,

--
Sinceramente,
Josué Maldonado.
"True pleasure is in the search, rather than in the explanation." --
Isaac Asimov
