From: | PFC <lists(at)boutiquenumerique(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Cc: | Miroslav Šulc <miroslav(dot)sulc(at)startnet(dot)cz>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: [PERFORM] Avoiding tuple construction/deconstruction during joining |
Date: | 2005-03-15 19:53:13 |
Message-ID: | opsno2uzlzth1vuj@localhost |
Lists: | pgsql-hackers pgsql-performance |
On my machine (Laptop with Pentium-M 1.6 GHz and 512MB DDR333) I get the
following timings :
Big Joins Query with all the fields and no ORDER BY (I just put a SELECT
* on the first table), yielding about 6k rows:
=> 12136.338 ms
Replacing the SELECT * from the table with many fields by just a SELECT
of the foreign key columns :
=> 1874.612 ms
I felt like playing a bit, so I implemented a hash join in Python
(download the file, it works on Miroslav's data):
The join timings below do not include the time to fetch the data from the
database. Fetching all the tables takes about 1.1 secs.
* With something that looks like the current implementation (copying
tuples around) and fetching all the fields from the big table :
=> Fetching all the tables : 1.1 secs.
=> Joining : 4.3 secs
* Fetching only the integer fields
=> Fetching all the tables : 0.4 secs.
=> Joining : 1.7 secs
* A smarter join which copies nothing and updates the rows as they are
processed, adding fields :
=> Fetching all the tables : 1.1 secs.
=> Joining : 0.4 secs
With the just-in-time compiler activated, it goes down to about 0.25
seconds.
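To make the difference between the first and the third variant concrete, here is a minimal sketch of the two join styles. It is not the code from test.py: the function names and the sample column names are made up for illustration, and it assumes each row is a dictionary and that the join key is unique on the looked-up side (a foreign key pointing at a primary key).

```python
def hash_join_copy(left_rows, right_rows, key):
    """Hash join that builds a brand-new row for every match.

    Every field of both input rows is copied into the result row,
    which is roughly the "copying tuples around" variant.
    """
    index = {}
    for r in right_rows:
        index.setdefault(r[key], []).append(r)

    out = []
    for l in left_rows:
        for r in index.get(l[key], ()):
            merged = dict(l)   # copies all fields of the left row
            merged.update(r)   # copies all fields of the right row
            out.append(merged)
    return out


def hash_join_in_place(left_rows, right_rows, key):
    """Hash join that adds the looked-up fields to the existing left row.

    Assumption: `key` is unique in right_rows, so each left row matches
    at most one right row and can be updated in place without copying
    its (possibly many) own fields.
    """
    index = {r[key]: r for r in right_rows}

    out = []
    for l in left_rows:
        r = index.get(l[key])
        if r is not None:
            l.update(r)        # only the right row's fields are written
            out.append(l)      # same object as the input row, no copy
    return out
```

The in-place version mutates its input rows, so it only makes sense when the fetched rows are not needed afterwards in their original form.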
First, this confirms what Tom said.
It also means that doing this query in the application can be a lot
faster than doing it in Postgres, even when the time to fetch all of the
tables is included.
There's a problem somewhere! It should be the other way around! The
Python mappings (dictionaries: { key: value }) are optimized like crazy,
but they store the column names for each row. And it's a dynamic scripting
language! Argh.
Note: run the program like this:
python test.py | less -S
so that the time spent scrolling your terminal does not spoil the
measurements.
Download test program :
http://boutiquenumerique.com/pf/miroslav/test.py