From: | panam <panam(at)gmx(dot)net> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Hash Anti Join performance degradation |
Date: | 2011-05-26 12:33:37 |
Message-ID: | 1306413217406-4428435.post@n5.nabble.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Hi there,
Kevin Grittner wrote:
>
>> Is there a way to determine the values actually used?
> The pg_settings view. Try the query shown here:
> http://wiki.postgresql.org/wiki/Server_Configuration
>
Thanks Kevin, very usful. Here is the output:
"version";"PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit"
"bytea_output";"escape"
"client_encoding";"UNICODE"
"effective_cache_size";"4GB"
"lc_collate";"German_Germany.1252"
"lc_ctype";"German_Germany.1252"
"listen_addresses";"*"
"log_destination";"stderr"
"log_line_prefix";"%t "
"logging_collector";"on"
"max_connections";"100"
"max_stack_depth";"2MB"
"port";"5432"
"server_encoding";"UTF8"
"shared_buffers";"1GB"
"temp_buffers";"4096"
"TimeZone";"CET"
"work_mem";"1GB"
Craig Ringer wrote:
>
> On 05/26/2011 12:42 AM, panam wrote:
> It's a bit beyond me, but I suspect that it'd be best if you could hang
> onto the dump file in case someone has the time and enthusiasm to
> investigate it. I take it you can't distribute the dump file, even
> privately?
>
Fortunately, I managed to reduce it to the absolute minimum (i.e. only
meaningless ids), and the issue is still observable.
You can download it from here:
http://www.zumodrive.com/file/460997770?key=cIdeODVlNz
Some things to try:
* tune your psql settings if you want
* reindex, vaccum analzye if you want
"Patholgical" query:
select
b.id,
(SELECT
m1.id
FROM
message m1
LEFT JOIN
message m2
ON (
m1.box_id = m2.box_id
AND m1.id < m2.id
)
WHERE
m2.id IS NULL
AND m1.box_id = b.id)
from
box b
=> takes almost "forever" (~600 seconds on my system)
Try
delete from message where id > 2550000;
=> deletes 78404 rows
Do the "pathological" query again
=> speed is back (~4 seconds on my system)
Replay the dump
Try
delete from message where id < 1000000;
=> deletes 835844 (10 times than before) rows. Maybe you can delete many
more, I haven't tested this systematically.
Do the "pathological" query again
=> takes almost "forever" (didn't wait...)
Replay the dump
Cluster:
cluster message_pkey on message;
Do the "pathological" query again
=> speed is back (~3 seconds on my system)
Any third party confirmation?
Thanks
panam
--
View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4428435.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
From | Date | Subject | |
---|---|---|---|
Next Message | Pavan Deolasee | 2011-05-26 12:57:28 | Re: Proposal: Another attempt at vacuum improvements |
Previous Message | Peter Eisentraut | 2011-05-26 12:24:44 | Re: Should partial dumps include extensions? |
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2011-05-26 12:36:11 | Re: Speeding up loops in pl/pgsql function |
Previous Message | Merlin Moncure | 2011-05-26 12:11:28 | Re: Speeding up loops in pl/pgsql function |