Re: very slow when writing query to file

From: boris pezzatti <boris(dot)pezzatti(at)wsl(dot)ch>
To: pgadmin-support(at)postgresql(dot)org
Subject: Re: very slow when writing query to file
Date: 2011-11-02 08:21:40
Message-ID: 4EB0FD94.7010703@wsl.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgadmin-support


... or (I promise this is the last guess) each row is appended to the
file in the the loop, and on some OS's there is a problem keeping open
the file (so that each time the file must be opened again ... adding a
lot of extra time). Maybe creating all the "virtual file" in a variable
and storing all at once (or let say in chunks of 100 Mb) could solve the
problem ...

On 11/02/2011 09:10 AM, boris pezzatti wrote:
>
> or maybe the retrieved data in RAM are somehow lazy bound ... ?
>
>
>
> On 11/02/2011 08:58 AM, boris pezzatti wrote:
>> Thank you Fernando for reproducing this.
>> I suspect there must be some part of code in the
>> * for each row
>> * for each column
>> loops that result inefficient only on some machines or OS's (I'm
>> using Archlinux).
>> In fact the extra time I and Fernando get can not only be attributed
>> to adding commas, " and line feeds. The same file which I retrived in
>> more than one hour (32Mb), could be saved with openoffice in 10
>> seconds, changing column separators, text delimiters and encoding.
>>
>> Could it be possible that the use of e.g. error catching in the loops
>> results in poor performance on some systems? (I'm not a c++
>> programmer ...)
>>
>>
>>
>>
>>
>>
>>
>> On 11/01/2011 11:23 AM, Guillaume Lelarge wrote:
>>> On Mon, 2011-10-31 at 18:26 -0300, Fernando Hevia wrote:
>>>> [...]
>>>> I could reproduce the issue in a fresh Windows 7 install with no other apps
>>>> running other than pgAdmin v1.14.0.
>>>> > From what I could see, the execute-to-file function runs in 2 stages:
>>>> 1. Rows are retrieved from DB server to RAM
>>>> 2. Rows are written from RAM to file
>>>>
>>> That's right.
>>>
>>>> The delay clearly occurs in step 2.
>>> That's also right.
>>>
>>>> While with small datasets (<2000) the writing to disk delay is barely
>>>> perceivable, when the number of rows is incremented (>10k) it is quite
>>>> distinctive how step 1 keeps completing in the expected time frame but step
>>>> 2 takes much much longer. In any case it should be the other way around.
>>>>
>>> Nope, step 2 does a lot of work.
>>>
>>>> With 10k rows (2.3 MB) it took 3.5 seconds to retrieve data from DB and 40
>>>> seconds to write the file to an SATA 7200 disk with write-through cache.
>>>> With 100k rows (23 MB) the DB retrieve went for 35 seconds while the file
>>>> writing part took over 4 minutes (didn't wait for it to finish).
>>>>
>>> I don't have the same numbers:
>>>
>>> * 1000 : 1s
>>> * 10000 : 1s
>>> * 200000 : 7s
>>> * 1000000 : 18s
>>>
>>>> The file is being written at an avg 60 KB per second, which is extremely
>>>> slow.
>>> If the only thing pgAdmin does was writing, I would agree. But,
>>> actually, it does a lot more things:
>>>
>>> * for each row
>>> * for each column
>>> * adds the column separator, if needed
>>> * grabs one cell's value
>>> * quotes the value, if needed (which also means doubling the quote
>>> if it's within the value)
>>> * adds the line separator
>>> * converts it to the encoding, if needed
>>> * writes it to the file
>>>
>>> That could take some time.
>>>
>>> I searched if there were some parts that took much longer than others,
>>> but failed to find one.
>>>
>>>
>>
>> --
>> *Boris Pezzatti
>> Swiss Federal Research Institute WSL
>> *
>> Research unit Community Ecology
>> Team Insubric Ecosystems
>> via Belsoggiorno 22
>> CH-6500 Bellinzona
>> Switzerland phone direct ++41 91 821 52 32
>> phone ++41 91 821 52 30
>> fax ++41 91 821 52 39
>> boris(dot)pezzatti(at)wsl(dot)ch
>> http://www.wsl.ch <http://www.wsl.ch/>
>>
>>
>
> --
> *Boris Pezzatti
> Swiss Federal Research Institute WSL
> *
> Research unit Community Ecology
> Team Insubric Ecosystems
> via Belsoggiorno 22
> CH-6500 Bellinzona
> Switzerland phone direct ++41 91 821 52 32
> phone ++41 91 821 52 30
> fax ++41 91 821 52 39
> boris(dot)pezzatti(at)wsl(dot)ch
> http://www.wsl.ch <http://www.wsl.ch/>
>
>

--
*Boris Pezzatti
Swiss Federal Research Institute WSL
*
Research unit Community Ecology
Team Insubric Ecosystems
via Belsoggiorno 22
CH-6500 Bellinzona
Switzerland phone direct ++41 91 821 52 32
phone ++41 91 821 52 30
fax ++41 91 821 52 39
boris(dot)pezzatti(at)wsl(dot)ch
http://www.wsl.ch <http://www.wsl.ch/>

In response to

Responses

Browse pgadmin-support by date

  From Date Subject
Next Message Guillaume Lelarge 2011-11-02 21:25:32 Re: very slow when writing query to file
Previous Message boris pezzatti 2011-11-02 08:10:21 Re: very slow when writing query to file