Re: parallel restore vs. windows

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: parallel restore vs. windows
Date: 2008-12-17 12:53:50
Message-ID: 4948F65E.4020504@dunslane.net
Lists: pgsql-hackers

ITAGAKI Takahiro wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
>> I did this, but it turned out that the problem was a logic error that I
>> found once I had managed to get a working debugger. However, the Windows
>> thread code should now be more robust, so thanks to Andrew and Magnus
>> for the suggestions.
>>
>
> Hello, I tested parallel restore on Windows.
> I have some random comments about it:
>

Thanks for this.

> * Two compiler warnings.
> pg_backup_custom.c: In function `_PrintTocData':
> pg_backup_custom.c:437: warning: unused variable `ctx'
> pg_backup_custom.c: In function `_ReopenArchive':
> pg_backup_custom.c:849: warning: unused variable `ctx'
>

Will be fixed in code cleanup.

> * No description of the new options in pg_restore --help.
> There are no help messages for the multi-thread (-m) and
> truncate-before-load options.
>

Will fix.

> * The multi-thread option is ignored if --data-only is on.
> Is this intended behavior? Even if so, we had better emit a
> warning message here.
>

Not intended, unless my memory is fading. I will check.
> * Threads, forked processes and connections are disposed of per entry.
> I think this is by design, but there might be room for
> improvement. The present implementation is slower when there
> are many small objects. If we can settle on a thread-based
> implementation, thread pooling and connection pooling are
> typically used in this context. It might be a ToDo item for 8.5.
>

Yes. I only got threading working at all a few days ago. I think
your suggestion is a good one, and we should probably converge on a
threaded implementation and then look at using pooling. However, as you
say, that would be work for the 8.5 timeframe.
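
For illustration only, the pooling idea might look roughly like the
sketch below. It is not pg_restore code: WorkQueue, restore_entry() and
run_pool() are hypothetical names, POSIX threads are assumed rather than
the Windows thread API, and dependency ordering between entries is
ignored. The point is just that each worker is started once and keeps
pulling entries, instead of a thread and connection being set up and
torn down per entry.

```c
/* Hypothetical sketch, not pg_restore code: a fixed pool of workers
 * claims entries from a shared queue, so each thread is created once
 * rather than once per entry. */
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16

typedef struct
{
    pthread_mutex_t lock;
    int next_entry;   /* index of the next unclaimed entry */
    int n_entries;    /* total number of entries to restore */
    int n_done;       /* entries processed so far */
} WorkQueue;

/* Stand-in for restoring one entry; a real worker would issue SQL here. */
static void
restore_entry(int entry)
{
    (void) entry;
}

static void *
worker_main(void *arg)
{
    WorkQueue *q = (WorkQueue *) arg;

    assert(q != NULL);

    /* A real worker would open its database connection once, here. */
    for (;;)
    {
        int entry;

        /* Claim the next entry, or -1 if none are left. */
        pthread_mutex_lock(&q->lock);
        entry = (q->next_entry < q->n_entries) ? q->next_entry++ : -1;
        pthread_mutex_unlock(&q->lock);

        if (entry < 0)
            break;

        restore_entry(entry);

        pthread_mutex_lock(&q->lock);
        q->n_done++;
        pthread_mutex_unlock(&q->lock);
    }
    /* ... and close the connection once, here. */
    return NULL;
}

/* Returns the number of entries processed, or -1 if no thread started. */
int
run_pool(int n_workers, int n_entries)
{
    pthread_t threads[MAX_WORKERS];
    WorkQueue q;
    int i;
    int started = 0;

    if (n_workers < 1)
        n_workers = 1;
    if (n_workers > MAX_WORKERS)
        n_workers = MAX_WORKERS;

    pthread_mutex_init(&q.lock, NULL);
    q.next_entry = 0;
    q.n_entries = n_entries;
    q.n_done = 0;

    for (i = 0; i < n_workers; i++)
    {
        if (pthread_create(&threads[i], NULL, worker_main, &q) != 0)
            break;
        started++;
    }

    /* Workers exit on their own once the queue is drained. */
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    pthread_mutex_destroy(&q.lock);
    return (started > 0) ? q.n_done : -1;
}
```

With something like run_pool(4, 100), thread creation happens four times
in total rather than once per entry, which is where the saving for many
small objects would come from.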

> ----
> I have no idea about performance because I don't have a multi-core
> machine for Windows. Parallel restore seems to be slower than
> serial restore on a single-CPU machine.
>

Not surprising. There is extra connection and worker setup/teardown,
dependency housekeeping and context switching involved. However, I'd be
surprised if the overhead were huge.

cheers

andrew
