Re: strange performance problem

From: Linos <info(at)linos(dot)es>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: strange performance problem
Date: 2009-02-28 19:57:12
Message-ID: 49A99718.4060305@linos.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Linos escribió:
> Richard Huxton escribió:
>> Linos wrote:
>>> Richard Huxton escribió:
>>>> Linos wrote:
>>>>> 2009-02-27 13:51:15 CET 127.0.0.1LOG: duración: 4231.045 ms
>>>>> sentencia:
>>>>> SELECT "nombre", "subfamilia_id", "id_familia", "hasta", "foto",
>>>>> "id_seccion", "id_categoria" FROM "modelo_subfamilia"
>>>>> PSQL with \timing:
>>>>> -development: Time: 72,441 ms
>>>>> -server: Time: 78,762 ms
>>>>> but if i load it from QT or from pgadmin i get more than 4 seconds in
>>>>> server and ~100ms in develoment machime, if i try the query without
>>>>> the
>>>>> "foto" column i get 2ms in development and 30ms in server
>>>> OK, so:
>>>> 1. No "foto" - both quick
>>>> 2. psql + "foto" - both slow
>>>> 3. QT + "foto" - slow only on server
>>> 1.No "foto" -both quick but still a noticeable
>>> difference between them 2ms develoment - 30ms server
>>> 2. psql + "foto" -both quick really, they are about 70ms,
>>> not bad giving that foto are bytea with small png images.
>>
>> Ah, sorry - I read the psql timings as 72 thousand ms (72 seconds) - of
>> course you're using European decimal marks.
>>
>>> 3. QT or WXWindows + "foto" -slow only one server yes.
>>>
>>>> The bit that puzzles me is why both are slow in #2 and not in #3.
>>
>> OK - well, the fact that both psql are fast means there's nothing too
>> wrong with your setup. It must be something to do with the application
>> libraries.
>>
>>> After the vacuum full verbose and reindex still the same problem (i had
>>> tried the vacuum before).
>>
>> OK. Worth ruling it out.
>>
>>> 1- The same in the two machines, tcp/ip with localhost.
>>
>> Hmm...
>>
>>> 2- I am exactly the same code in the two machines and the same pgadmin3
>>> version too.
>>
>> Good. We can rule that out.
>>
>>> 3- Ever the entire result set.
>>
>> Good.
>>
>>> 4- I am using es_ES.UTF8 in the two machines
>>
>> Good.
>>
>>> What can be using wxwindows and QT to access postgresql that psql it is
>>> not using, libpq?
>>
>> Well, I'm pretty sure that pgadmin will be using libpq at some level,
>> even if there is other code above it.
>>
>> Either:
>>
>> 1. One machine (the fast one) is actually using unix sockets and not
>> tcp/ip+localhost like you think.
>> 2. The networking setup is different on each.
>> 3. Something your code is doing with the bytea data is slower on one
>> machine than another. I seem to remember that pgadmin used to be quite
>> slow at displaying large amounts of data. They did some work on that,
>> but it might be that your use-case still suffers from it.
>>
>> For #1 try the psql test again, but with "-h localhost" and "-h /tmp"
>> (or whatever directory your unix socket is in - might be
>> /var/run/postgresql or similar too).
>
> Ok, thanks for the trick now i know where to search, after trying with
> -h localhost psql it is slow too in the server from 80,361 with \timing
> to 4259,453 using -h localhost. Any ideas what can be the problem here?
> i am going to make what you suggest and capture analyze the traffic,
> after find the hole i have tried in other debian server with the same
> kernel 2.6.26 and i have the same problem (my development machine it is
> Arch Linux with 2.6.28).
>
> Regards,
> Miguel Angel.
>
>> For #2, you can always try timing "psql -h localhost ... > /dev/null" on
>> both machines. If you capture port 5432 with something like "tcpdump -w
>> ip.dump host localhost and port 5432" you can then use wireshark to see
>> exactly why it's slow.
>>
>> For #3, I guess you'd need to reduce your code to just fetching the data
>> and time that. You may have already done this of course.
>>
>> HTH
>>

I have been testing with tcpdump but i dont see the problem in the traffic
(aside from the fact that it gives big jumps in ms between packets of data, but
i dont know why), i have tested on other debian machines with the same result, i
have upgraded kernel to 2.6.28 and postgresql to 8.3.6 (equal versions of my
Arch Linux Development machine), but still have the same problem:

-query with \timing with "psql -d database" ~110ms
-query with \timing with "psql -d database -h localhost" ~4400ms

Using tcp the cpu of postgresql spike to the max it can borrow within the query.
I have attached the tcpdump logs of a debian machine and the Arch too (maybe
anyone can see anything in them that i can not). How i can test pure speed in
the loopback interface? i have tried iperf but seems to be cpu bound so maybe
the results are misleading.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Linos 2009-03-01 00:50:59 Re: strange performance problem
Previous Message Gauthier, Dave 2009-02-27 22:04:59 Re: max_fsm_relations