RE: Recommendation to run vacuum FULL in parallel

From: Kevin Brannen <KBrannen(at)efji(dot)com>
To: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: RE: Recommendation to run vacuum FULL in parallel
Date: 2019-04-04 19:53:27
Message-ID: MN2PR19MB2575026251A1DF43BEDC5066A4500@MN2PR19MB2575.namprd19.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

From: Perumal Raj <perucinci(at)gmail(dot)com>

So conclude the requirement here , The only way to parallelism is multiple script. And no need to do REINDEX exclusively.
Question : Do we need to consider Table dependencies while preparing script in order to avoid table locks during vacuum full ?

We have a small bash script (see below) that get the list of tables and their sizes, sorted smallest to largest, and do “vacuum full” one at a time because (as someone else pointed out) this is very I/O intensive. That order also helps to ensure we finish because some of our installs are at the edge of running out of space (an issue we’re dealing with). I probably wouldn’t have a problem doing 2 at a time, but we do this in the middle of the night when activity is lowest and it only takes 1-2 hours, so we’re good with it. It sounds like you have a lot more data though.

You might also consider putting the data into different tablespaces which are spread over multiple disks to help I/O. If you can, use SSD drives, they help with speed quite a bit. 😊

Don’t worry about table dependencies. This is a physical operation, not a data operation.

HTH,
Kevin

$PGPATH/psql -t -c "
WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME, pg_total_relation_size(c.oid) AS total_bytes
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE relkind = 'r' AND nspname NOT IN ( 'pg_catalog', 'information_schema' )
ORDER BY 2 )
SELECT table_name FROM s
" |
while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t $t ; done
###
This e-mail transmission, and any documents, files or previous e-mail messages attached to it, may contain confidential information. If you are not the intended recipient, or a person responsible for delivering it to the intended recipient, you are hereby notified that any disclosure, distribution, review, copy or use of any of the information contained in or attached to this message is STRICTLY PROHIBITED. If you have received this transmission in error, please immediately notify us by reply e-mail, and destroy the original transmission and its attachments without reading them or saving them to disk. Thank you.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andres Freund 2019-04-04 20:36:48 Re: CVE-2019-9193 about COPY FROM/TO PROGRAM
Previous Message Magnus Hagander 2019-04-04 19:50:41 Re: CVE-2019-9193 about COPY FROM/TO PROGRAM