Skip site navigation (1) Skip section navigation (2)

Re: parallel pg_restore - WIP patch

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Joshua Drake <jd(at)commandprompt(dot)com>
Cc: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>, Russell Smith <mr-russ(at)pws(dot)com(dot)au>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeffrey Baker <jwbaker(at)gmail(dot)com>
Subject: Re: parallel pg_restore - WIP patch
Date: 2008-09-26 22:23:30
Message-ID: 48DD60E2.5080708@dunslane.net (view raw or flat)
Thread:
Lists: pgsql-hackers

Joshua Drake wrote:
> On Fri, 26 Sep 2008 17:10:44 -0400
> Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>   
>> Yes, there are several funny things going on, including some stuff
>> with dependencies. I'll have a new patch tomorrow with luck. Thanks
>> for testing.
>>     
>
> O.k. I took at look at the patch itself and although I don't understand
> all of it there were a couple of red flags to me:
>
> + if (ropt->create)
> + 		die_horribly(AH,modulename,
> + 					 "parallel restore is
> incompatible with --create\n");
> + 
>   
> This seems like an odd limitation. In my mind, the schema would not be
> restored in parallel. The schema before data would restore as a single
> thread. Even the largest schemas would only take minutes (if that).
> Thus something like --create should never be a problem.
>   


Originally I had everything restoring in parallel. Now I am in fact (as 
the patch should have showed you) restoring the first part in a single 
thread like you say. Thus I probably can relax that restriction. I will 
look and see.

> I also noticed you check if we have zlib? Is it even possible to use
> the c format without it? (that would be new to me).
>
> I noticed this line:
>
>
> + 	while((next_work_item = get_next_work_item(AH)) != NULL)
> + 	{
> + 		/* XXX need to improve this test in case there is no
> table data */
> + 		/* need to test for indexes, FKs, PK, Unique, etc */
> + 		if(strcmp(next_work_item->desc,"TABLE DATA") == 0)
> + 			break;
> + 		(void) _restore_one_te(AH, next_work_item, ropt,
> false);
> + 
> + 		next_work_item->prestored = true;
> + 
> + 		_reduce_dependencies(AH,next_work_item);
> + 	}
>
>
> Intead of the TABLE DATA compare, perhaps it makes sense to back patch
> pg_dump to have a line delimiter in the TOC? That way even if there is
> no TABLE DATA there would be a delimiter that says:
>
> --- BEGIN TABLE DATA
> --- END TABLE DATA
>
> Thus if nothing is there... nothing is there?
>   

The TOC isn't stored as a text file. So we'll need to look by entry 
tags. It's no big deal - there aren't a huge number.

> + 			/* delay just long enough betweek forks to
> give the catalog some
> + 			 * breathing space. Without this sleep I got 
> + 			 * "tuple concurrently updated" errors.
> + 			 */
> + 			pg_usleep(500000);
> + 			continue; /* in case the slots are not yet
> full */
> + 		}
>
> Could that be solved with a lock instead? Once the lock is released....
>   


That sleep is now gone.


> Anyway... just some thoughts. I apologize if I misunderstood the patch.
>
>
>   


No problem. Thanks for looking.

cheers

andrew

In response to

pgsql-hackers by date

Next:From: Tom LaneDate: 2008-09-26 22:23:56
Subject: Re: Updates of SE-PostgreSQL 8.4devel patches
Previous:From: Bruce MomjianDate: 2008-09-26 22:15:46
Subject: Re: Updates of SE-PostgreSQL 8.4devel patches

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group