Re: parallel pg_restore - WIP patch

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 12:56:24
Message-ID: 48DCDBF8.1020501@pws.com.au
Lists: pgsql-hackers

Andrew Dunstan wrote:
>> Do we know why we experience "tuple concurrently updated" errors if we
>> spawn threads too fast?
>>
>
> No. That's an open item.

Okay, I'll see if I can have a little more of a look into it. No
promises, as the restore isn't playing nicely.
>
>>
>> the memory context is shared across all threads. Which means that it's
>> possible the memory contexts are stomping on each other. My GDB skills
>> are not up to being able to reproduce this in a gdb session as there are
>> forks going on all over the place. And if you process them in a serial
>> fashion, there aren't any errors. I'm not sure of the fix for this.
>> But in a parallel environment it doesn't seem possible to store the
>> memory context in the AH.
>>
>
>
> There are no threads, hence nothing is shared. fork() creates a new
> process, not a new thread, and all they share are file descriptors.
>
> However, there does seem to be something odd happening with the
> compression lib, which I will investigate. Thanks for the report.

I'm sorry, I meant processes there. I'm aware there are no threads.
But my feeling was that when you fork with open files, the child gets all
of the open file properties, including positions, and because the
descriptor is duplicated, every copy shares whatever it points to with
every other copy. My brief research on that shows that in 2005 there was
a kernel mailing list discussion on this issue.
http://mail.nl.linux.org/kernelnewbies/2005-09/msg00479.html was quite
informative for me. Again, I could be wrong, but it's worth a read. If it is
true, then the file needs to be reopened by each child; it can't use the
duplicated descriptor. I haven't had a chance to implement and test this,
as it's late here. But I'd take a stab that it will solve the
compression library problems.
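
To make the point about shared positions concrete, here is a minimal sketch.
It is not taken from the patch: the function names and the scratch file are
made up for illustration, and it only assumes ordinary POSIX fork()
semantics. The child reads 4 bytes, then the parent checks its own offset,
once with the inherited descriptor and once with the child reopening the
file by name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a 10-byte scratch file from a mkstemp()
 * template and return a descriptor positioned at offset 0. */
static int make_scratch(char *path_template)
{
    int fd = mkstemp(path_template);

    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10 || lseek(fd, 0, SEEK_SET) < 0)
    {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical demo: fork a child that reads 4 bytes, then report the
 * PARENT's file offset.
 *   reopen == 0: the child reads through the descriptor it inherited
 *   reopen != 0: the child opens the file again by name
 * Returns (off_t) -1 on any failure. */
off_t parent_offset_after_child_read(int reopen)
{
    char  path[] = "/tmp/fork_offset_XXXXXX";
    char  buf[4];
    int   status;
    off_t pos;
    pid_t pid;
    int   fd = make_scratch(path);

    if (fd < 0)
        return (off_t) -1;

    pid = fork();
    if (pid < 0)
    {
        close(fd);
        unlink(path);
        return (off_t) -1;
    }
    if (pid == 0)
    {
        /* Child: a fresh open() gets its own file position; the
         * inherited descriptor shares the parent's position. */
        int cfd = reopen ? open(path, O_RDONLY) : fd;

        if (cfd < 0 || read(cfd, buf, sizeof buf) != (ssize_t) sizeof buf)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then look at our own offset. */
    waitpid(pid, &status, 0);
    pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    unlink(path);

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return (off_t) -1;
    return pos;
}
```

With the inherited descriptor the parent's offset moves to 4 even though the
parent never read anything; with the reopened file it stays at 0. That is
the behaviour I'd expect to confuse anything that keeps its own idea of
where it is in the archive file.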

I hope this helps, not hinders

Russell.
