Re: Shaky coding for vacuuming partitioned relations

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Shaky coding for vacuuming partitioned relations
Date: 2017-09-25 09:37:20
Message-ID: CAB7nPqSHAqoEGuci_8UzFUzrfpSZzjq97nga4DrZSZUbTj-BOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 25, 2017 at 4:42 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2017/09/25 12:10, Michael Paquier wrote:
> Hmm, I'm not sure if we need to lock the partitions, too. Locks taken by
> find_all_inheritors() will be gone once the already-running transaction is
> ended by the caller (vacuum()). get_rel_oids() should just lock the table
> mentioned in the command (the parent table), so that find_all_inheritors()
> returns the correct partition list, as Tom perhaps alluded to when he said
> "it would also make the find_all_inheritors call a lot more meaningful.",
> but...

It is important to hold the lock for child tables until the end of the
transaction actually to fill in correctly the RangeVar associated to
each partition when scanning them.

> When vacuum() iterates that list and calls vacuum_rel() for each relation
> in the list, it might be the case that a relation in that list is no
> longer a partition of the parent. So, one might say that it would get
> vacuumed unnecessarily. Perhaps that's fine?

I am personally fine with that. Any checks would prove to complicate
the code for not much additional value.

>> I am not
>> saying that this needs to be fixed in REL_10_STABLE at this stage
>> though as this would require some refactoring similar to what the
>> patch adding support for VACUUM with multiple relations does.
>
> I think it would be better to fix that independently somehow. VACUUM on
> partitioned tables is handled by calling vacuum_rel() on individual
> partitions. It would be nice if vacuum_rel() didn't have to refer to the
> RangeVar. Could we not use get_rel_name(relid) in place of
> relation->relname? I haven't seen the other patch yet though, so maybe
> I'm missing something.

The RangeVar is used for error reporting, and once you reach
vacuum_rel, the relation and its OID may be gone. So you need to save
the information of the RangeVar beforehand when scanning the
relations. That's one reason behind having the VacuumRelation stuff
discussed on the thread for multiple table VACUUM, this way you store
for each item to be vacuumed:
- A RangeVar.
- A list of columns.
- An OID saved by vacuum deduced from the RangeVar.
That's quite some refactoring, so my opinion is that this ship has
sailed already for REL_10_STABLE, and that we had better live with
that in 10.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-09-25 10:04:36 Re: path toward faster partition pruning
Previous Message Rafia Sabih 2017-09-25 09:13:00 Re: Effect of changing the value for PARALLEL_TUPLE_QUEUE_SIZE