Pg_upgrade faster, again!

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Pg_upgrade faster, again!
Date: 2012-12-22 23:13:20
Message-ID: 20121222231320.GA30566@momjian.us
Lists: pgsql-hackers

I promised to research allowing parallel execution of schema
dump/restore, so I have developed the attached patch, with dramatic
results:

tables       git     patch
  1000     22.29     18.30
  2000     30.75     19.67
  4000     46.33     22.31
  8000     81.09     29.27
 16000    145.43     40.12
 32000    309.39     64.85
 64000    754.62    108.76

These performance results are best-case because the test was run with
all databases the same size, and with the number of databases equal to
the number of server cores.  (Test script attached.)

This uses fork/processes on Unix, and threads on Windows.  I need
someone to check my use of waitpid() on Unix, and I need someone to
compile, run, and test the code on Windows.
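
For reviewers who want something concrete to compare the Unix side
against, here is a minimal sketch of the fork()/waitpid() pattern
described above. It is not taken from the attached patch: the names
task_fn, demo_task, reap_one_child, and run_tasks_parallel are
hypothetical, and the task is a stand-in for one schema dump or restore.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical task callback: returns 0 on success, nonzero on failure. */
typedef int (*task_fn) (int task_index);

/* Illustrative task only: pretend that task 3 fails. */
int
demo_task(int task_index)
{
	return (task_index == 3) ? 1 : 0;
}

/*
 * Wait for any one child to finish, retrying if a signal interrupts the
 * wait.  Returns 0 if the child exited with status 0, otherwise 1
 * (nonzero exit, death by signal, or waitpid() error).
 */
static int
reap_one_child(void)
{
	int			status;
	pid_t		pid;

	do
	{
		pid = waitpid(-1, &status, 0);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return 1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}

/*
 * Run ntasks tasks, keeping at most max_jobs children alive at once.
 * Returns the number of failed tasks, or -1 if fork() itself failed.
 */
int
run_tasks_parallel(int ntasks, int max_jobs, task_fn task)
{
	int			running = 0;
	int			failed = 0;

	assert(task != NULL);
	if (max_jobs < 1)
		max_jobs = 1;

	for (int i = 0; i < ntasks; i++)
	{
		pid_t		pid;

		/* All slots busy: block until one child exits. */
		if (running >= max_jobs)
		{
			failed += reap_one_child();
			running--;
		}

		fflush(NULL);			/* don't duplicate buffered output in child */
		pid = fork();
		if (pid < 0)
		{
			/* Clean up the children already started, then give up. */
			while (running-- > 0)
				reap_one_child();
			return -1;
		}
		if (pid == 0)
			_exit(task(i) == 0 ? 0 : 1);	/* child: do the work, exit */
		running++;
	}

	/* Collect the children that are still running. */
	while (running > 0)
	{
		failed += reap_one_child();
		running--;
	}
	return failed;
}
```

Calling run_tasks_parallel(8, 4, demo_task) starts eight tasks with no
more than four running at a time. The points worth checking in any such
loop are the EINTR retry and whether both exit status and death by
signal are treated as failure.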

It basically adds a --jobs option, like the one pg_restore has, to run
multiple schema dumps/restores in parallel.  I patterned this after the
--jobs code in pg_restore's pg_backup_archiver.c.  However, I found the
pg_restore Windows code awkward because it puts everything in one struct
array that has gaps for dead children. Because WaitForMultipleObjects()
requires an array of thread handles with no gaps, the pg_restore code
must make a temporary array for every call to WaitForMultipleObjects().

Instead, I created an array just for thread handles (rather than putting
it in the same struct), and swapped entries into dead child slots to
avoid gaps --- this allows the thread handle array to be passed directly
to WaitForMultipleObjects().
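
To make the slot-swapping idea concrete, here is a small portable sketch.
It is not the code from the patch: job_handle is a stand-in for the
Windows HANDLE type, and job_table, job_arg, job_table_add, and
job_table_remove are hypothetical names. On Windows the handles array
and count would be what gets passed to WaitForMultipleObjects(), and the
signaled slot is the return value minus WAIT_OBJECT_0.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_JOBS 8

typedef void *job_handle;		/* stand-in for a Windows HANDLE */

/* Hypothetical per-job data, kept apart from the handles. */
typedef struct
{
	int			db_index;
} job_arg;

typedef struct
{
	job_handle	handles[MAX_JOBS];	/* live handles, always gap-free */
	job_arg		args[MAX_JOBS];		/* args[i] belongs to handles[i] */
	int			count;				/* number of live jobs */
} job_table;

/* Append a job; returns its slot, or -1 if the table is full. */
int
job_table_add(job_table *t, job_handle h, int db_index)
{
	if (t->count >= MAX_JOBS)
		return -1;
	t->handles[t->count] = h;
	t->args[t->count].db_index = db_index;
	return t->count++;
}

/*
 * Remove a finished job by moving the last live entry into its slot,
 * so handles[0 .. count-1] never contains a dead entry.
 */
void
job_table_remove(job_table *t, int slot)
{
	int			last = t->count - 1;

	assert(slot >= 0 && slot <= last);
	if (slot != last)
	{
		t->handles[slot] = t->handles[last];
		t->args[slot] = t->args[last];
	}
	t->count = last;
}
```

The cost is that slot order no longer matches start order, which is fine
as long as nothing else remembers a job by its slot number.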

Do people like this approach?  Should we do the same in pg_restore?  I
expect us to be doing more parallelism in other areas so I would like to
have a consistent approach.

The only other optimization I can think of is to do parallel file copy
per tablespace (in non-link mode).

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
jobs.diff text/x-diff 17.2 KB
