Re: Vacuum: allow usage of more than 1GB of work mem

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum: allow usage of more than 1GB of work mem
Date: 2017-01-25 20:11:13
Message-ID: CAGTBQpa6=JZuQj1VySRwnrsvKx1yrrLB66fu5BpHtxHZvx_XBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 25, 2017 at 1:54 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> Thank you for updating the patch!
>
> + /*
> + * Quickly rule out by lower bound (should happen a lot) Upper bound was
> + * already checked by segment search
> + */
> + if (vac_cmp_itemptr((void *) itemptr, (void *) rseg->dead_tuples) < 0)
> + return false;
>
> I think that if the above result is 0, we can return true as itemptr
> matched lower bound item pointer in rseg->dead_tuples.

That's right. Possibly not a great speedup but... why not?

>
> +typedef struct DeadTuplesSegment
> +{
> + int num_dead_tuples; /* # of
> entries in the segment */
> + int max_dead_tuples; /* # of
> entries allocated in the segment */
> + ItemPointerData last_dead_tuple; /* Copy of the last
> dead tuple (unset
> +
> * until the segment is fully
> +
> * populated) */
> + unsigned short padding;
> + ItemPointer dead_tuples; /* Array of dead tuples */
> +} DeadTuplesSegment;
> +
> +typedef struct DeadTuplesMultiArray
> +{
> + int num_entries; /* current # of entries */
> + int max_entries; /* total # of slots
> that can be allocated in
> + * array */
> + int num_segs; /* number of
> dead tuple segments allocated */
> + int last_seg; /* last dead
> tuple segment with data (or 0) */
> + DeadTuplesSegment *dead_tuples; /* array of num_segs segments */
> +} DeadTuplesMultiArray;
>
> It's a matter of personal preference but some same dead_tuples
> variables having different meaning confused me.
> If we want to access first dead tuple location of first segment, we
> need to do 'vacrelstats->dead_tuples.dead_tuples.dead_tuples'. For
> example, 'vacrelstats->dead_tuples.dt_segment.dt_array' is better to
> me.

Yes, I can see how that could be confusing.

I went for vacrelstats->dead_tuples.dt_segments[i].dt_tids[j]

> + nseg->num_dead_tuples = 0;
> + nseg->max_dead_tuples = 0;
> + nseg->dead_tuples = NULL;
> + vacrelstats->dead_tuples.num_segs++;
> + }
> + seg = DeadTuplesCurrentSegment(vacrelstats);
> + }
> + vacrelstats->dead_tuples.last_seg++;
> + seg = DeadTuplesCurrentSegment(vacrelstats);
>
> Because seg is always set later I think the first line starting with
> "seg = ..." is not necessary. Thought?

That's correct.

Attached a v6 with those changes (and rebased).

Make check still passes.

Attachment Content-Type Size
0002-Vacuum-allow-using-more-than-1GB-work-mem-v6.patch text/x-patch 20.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2017-01-25 20:14:58 Re: Checksums by default?
Previous Message Jim Nasby 2017-01-25 20:06:30 Re: PoC plpgsql - possibility to force custom or generic plan