Re: BUG #5412: test case produced, possible race condition.

From: Rusty Conover <rconover(at)infogears(dot)com>
To:
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5412: test case produced, possible race condition.
Date: 2010-04-14 04:28:00
Message-ID: 99EABEAB-D3C1-4EF7-A958-639317F8778C@infogears.com
Lists: pgsql-bugs, pgsql-hackers
Hi Heikki and everybody else,

It seems like this is a race condition caused by the system catalog cache not being locked properly. I've included a Perl script below that causes the crash on my box consistently.

The script forks two different types of processes:

#1 - begin transaction, create a few temp tables and analyze them in a transaction, commit (running in database foobar_1)
#2 - begin transaction, truncate table, insert records into table from select in a transaction, commit (running in database foobar_2)

I set up the script to run 10 instances of task #1 and 1 instance of task #2.

Running this script causes the crash of postgres within seconds on my box.

If you reduce task #1 to fewer than 6 instances, no crash happens, but with 7 or more instances the crash does happen.

The box that I'm running the script on has 8 cores, so CPU contention and some improper locking might cause some of the problem. 

The specs of the box are:

Fedora release 10 (Cambridge)
Intel(R) Xeon(R) CPU           E5420  @ 2.50GHz
glibc-2.9-3.i686
Linux 2.6.27.30-170.2.82.fc10.i686.PAE #1 SMP Mon Aug 17 08:24:23 EDT 2009 i686 i686 i386 GNU/Linux
PostgreSQL: 8.4.3

I tried to reproduce it on one of my 16-core x64 boxes and the same crash doesn't occur. I also tried on a dual-core box and couldn't get a crash, but I haven't exhaustively tested different numbers of task #1 instances there.

If the script doesn't cause a crash for you, please try changing the variable $total_job_1_children to a number greater than the number of CPU cores on the machine that you're running it on.
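
As a rough aid for picking that value, here is a minimal sketch that counts logical CPUs and prints a starting point. It assumes a Linux machine where /proc/cpuinfo is readable. The helper name count_cpu_cores and the margin of 3 (chosen only to mirror 11 children on the 8-core box) are illustrative assumptions, not part of the test script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count logical CPUs by reading /proc/cpuinfo.
# Assumption: Linux only. Returns undef if the file cannot be opened.
sub count_cpu_cores {
    open(my $fh, '<', '/proc/cpuinfo') or return;
    # grep in scalar context yields the number of matching lines.
    my $count = grep { /^processor\s*:/ } <$fh>;
    close($fh);
    return $count;
}

my $cores = count_cpu_cores();
if (defined $cores && $cores > 0) {
    # Hypothetical margin: a few more children than cores, to force contention.
    my $suggested = $cores + 3;
    print "Detected $cores cores; try \$total_job_1_children = $suggested\n";
} else {
    print "Could not detect the core count; set \$total_job_1_children by hand\n";
}
```

The printed number is only a first guess; the failure threshold may differ from machine to machine.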

Any help would be appreciated and if I can be of further assistance please let me know,

Rusty
--
Rusty Conover
rconover(at)infogears(dot)com
InfoGears Inc / GearBuyer.com / FootwearBuyer.com
http://www.infogears.com
http://www.gearbuyer.com
http://www.footwearbuyer.com


#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use POSIX ":sys_wait_h";


# Number of children for job1, create temp tables and analyze them

# The number of jobs here matters: (on my 8 core box you need to have some contention to get a failure)
# >11=fail
# 10=fail
# 9=fail
# 8=fail
# 7=fail
# <6 works,
my $total_job_1_children = 11;

# Number of children for job 2: run a truncate and insert query loop.
# We really only need one of these jobs running, because the truncate locks the table.
my $total_job_2_children = 1;


# Just need two databases on your machine, foobar_1 and foobar_2 are the defaults.
my $database_1_dsn = ['dbi:Pg:dbname=foobar_1', 'postgres'];
my $database_2_dsn = ['dbi:Pg:dbname=foobar_2', 'postgres'];

# Do some setup transactions.
if(1) {
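 my $dbh = DBI->connect(@$database_2_dsn);
 defined($dbh) || die("Failed to get connection to database");

 # Drop leftovers from a previous run; "if exists" avoids errors on the first run.
 $dbh->do("drop table if exists foo_dest");
 $dbh->do("drop table if exists foobar_source");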
 $dbh->begin_work();
 eval {
   $dbh->do("create table foobar_source (id integer, name text, size integer)") || die("Failed to create foobar_source: " . $dbh->errstr());
   for(my $k = 0; $k < 3500; $k++) {
     $dbh->do("insert into foobar_source (id, name, size) values (?, 'test me', ?)", undef, $k, int(rand(400000))) || die("Failed to insert into foobar_source: " . $dbh->errstr());
   }

   $dbh->do("analyze foobar_source");

   $dbh->do("create table foo_dest (id integer, name text, size integer)");
 };
 if($@) {
   print "Error doing init of tables: " . $@ . "\n";
   $dbh->rollback();
   $dbh->disconnect();
   exit(0);
 }

 $dbh->commit();
 $dbh->disconnect();
}



my @child_pids;

for(my $i =0; $i < $total_job_1_children; $i++) {
 print "Forking\n";
 my $pid = fork();
 if($pid == 0) {
   run_child('job1');
   exit(0);
 } else {
   push @child_pids, $pid;
 }
}

for(my $i =0; $i < $total_job_2_children; $i++) {
 print "Forking\n";
 my $pid = fork();
 if($pid == 0) {
   run_child('job2');
   exit(0);
 } else {
   push @child_pids, $pid;
 }
}


foreach my $pid (@child_pids) {
 print "Waiting for $pid\n";
 waitpid($pid, 0);
 print "Got it\n";
}
exit(0);


sub run_child {
 my $job_type = shift;
 my $dsn;
 if($job_type eq 'job1') {
   $dsn = $database_1_dsn;
 } else {
   $dsn = $database_2_dsn;
 }
 my $dbh = DBI->connect(@$dsn);
 defined($dbh) || die("Failed to get connection to database");

 for(my $i =0; $i < 400; $i++) {

   $dbh->begin_work();
   eval {

     if($job_type eq 'job1') {
	$dbh->{Warn} = 0;
	$dbh->do("create temp table c_products (id INTEGER NOT NULL, product_name_stemmed text, average_price numeric(12,2), cset_bitmap bit(437), gender text) WITHOUT OIDS ON COMMIT DROP");
	$dbh->do("create temp table c_products_oids (c_products_id INTEGER NOT NULL, oid INTEGER NOT NULL UNIQUE, price numeric(12,2) not null, product_name_stemmed text not null) WITHOUT OIDS ON COMMIT DROP");
	$dbh->{Warn} = 1;
	
	$dbh->do("analyze c_products");
	$dbh->do("analyze c_products_oids");
     } else {
	$dbh->do("truncate table foo_dest");
	$dbh->do("insert into foo_dest (id, name, size) select id, name, size from foobar_source");
     }
   };
   if($@) {
     print "Got error in job $job_type: $@\n";
     $dbh->rollback();
     $dbh->disconnect();
     exit(0);
   }

   $dbh->commit();
 }
 $dbh->disconnect();
 print "Child finished\n";
 return;
}








