Re: patch review : Add ability to constrain backend temporary file space

From: Mark Kirkwood <mark(dot)kirkwood(at)catalyst(dot)net(dot)nz>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: patch review : Add ability to constrain backend temporary file space
Date: 2011-06-01 23:35:55
Message-ID: 4DE6CCDB.9090608@catalyst.net.nz
Lists: pgsql-hackers

On 01/06/11 09:24, Cédric Villemain wrote:
>
> Submission review
> ================
>
> * The patch is not in context diff format.
> * The patch applies, but contains some extra whitespace.
> * Documentation is there, but not explicit about 'temp tables';
> maybe worth adding that this won't limit temporary table size?
> * There is no test provided; one would be expected, to check that
> the feature works.
>

I've created a new patch (attached) with 'git diff -c', so hopefully the
format is OK now. I've added a paragraph noting that temporary *table*
storage is not constrained, only temp work files, and made the point that
the *query* that creates a temp table will still have its work files
constrained. I've also added a test.

The major visible change is that the GUC has been renamed to
'work_disk', to fit better with the idea that it is the disk spillover
for 'work_mem'.

> Code review
> =========
>
> * in fd.c, I think that "temporary_files_size -=
> (double)vfdP->fileSize;" should be done later in the function once we
> have successfully unlinked the file, not before.
>

Agreed, I've moved it.
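
Roughly, the reordering looks like this (a simplified sketch of the fd.c
cleanup path, not the exact hunk in the attached patch; fileSize is the
field this patch adds):

/*
 * Sketch: decrement the per-backend accounting only after the temp
 * file is really gone, so a failed unlink() cannot leave
 * temporary_files_size understated.
 */
if (vfdP->fdstate & FD_TEMPORARY)
{
    if (unlink(vfdP->fileName) != 0)
        elog(LOG, "could not unlink file \"%s\": %m", vfdP->fileName);
    else
        temporary_files_size -= (double) vfdP->fileSize;    /* moved after unlink */
}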

> * I am not sure whether it is better to add a fileSize like you did, or
> to use relationgetnumberofblock() when the file is about to be truncated
> or unlinked; that way the seekPos should be enough to increase the
> global counter.
>

The temp files are not relations, so I'd have to call stat() I guess. Now
truncate/unlink can happen quite a lot (e.g. a hash with many files), and I
wanted to avoid adding too many library calls to this code for
performance reasons, so on balance I think it is going to be more
efficient to remember the size in the Vfd.
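
To sketch the idea (simplified write path, not the code as committed;
fileSize and temporary_files_size are the patch's names):

/*
 * Sketch: after a successful write, grow the size remembered in the
 * Vfd instead of stat()ing the file, and charge only the extension
 * against the per-backend total.
 */
returnCode = write(vfdP->fd, buffer, amount);
if (returnCode > 0)
{
    off_t   newPos = vfdP->seekPos + returnCode;

    if (newPos > vfdP->fileSize)
    {
        temporary_files_size += (double) (newPos - vfdP->fileSize);
        vfdP->fileSize = newPos;
    }
}

That keeps the accounting O(1) per write, and truncate/unlink only need
the remembered size, with no extra syscalls.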

> * temporary_files_size: I think it is better to have a number of pages,
> à la PostgreSQL, than a kilobyte size.
>

The temp file related code works in bytes or kbytes in fd.c and other
places (e.g. log_temp_files), so it seems more natural to stay in kbytes.
Also, work_disk is really just the spillover from work_mem (in kbytes),
so it seems logical to match its units.

> * max_temp_files_size: I'd prefer an approach like the shared_buffers
> GUC: you can use pages, or KB, MB, ...
>

We can use KB, MB, GB - just not pages, again like work_mem. These files
are not relations so I'm not sure pages is an entirely appropriate unit
for them.
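
For illustration, the guc.c entry is along these lines (modeled on
work_mem's entry; the description, default and bounds here are
placeholders, not necessarily what is in the attached patch):

/*
 * GUC_UNIT_KB is what lets users write e.g. SET work_disk = '100MB';
 * page-based units stay reserved for block-sized settings like
 * shared_buffers.
 */
{
    {"work_disk", PGC_USERSET, RESOURCES_MEM,
        gettext_noop("Sets the maximum temporary file space a query may use."),
        NULL,
        GUC_UNIT_KB
    },
    &work_disk,
    -1, -1, MAX_KILOBYTES,
    NULL, NULL, NULL
},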

> Simple Feature test
> ==============
>
> Either explain (buffers) is wrong or the patch is wrong:
> cedric=# explain (analyze,buffers) select * from foo order by 1 desc ;
> QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------
> Sort (cost=10260.02..10495.82 rows=94320 width=4) (actual
> time=364.373..518.940 rows=100000 loops=1)
> Sort Key: generate_series
> Sort Method: external merge Disk: 1352kB
> Buffers: local hit=393, temp read=249 written=249
> -> Seq Scan on foo (cost=0.00..1336.20 rows=94320 width=4)
> (actual time=0.025..138.754 rows=100000 loops=1)
> Buffers: local hit=393
> Total runtime: 642.874 ms
> (7 rows)
>
> cedric=# set max_temp_files_size to 1900;
> SET
> cedric=# explain (analyze,buffers) select * from foo order by 1 desc ;
> ERROR: aborting due to exceeding max temp files size
> STATEMENT: explain (analyze,buffers) select * from foo order by 1 desc ;
> ERROR: aborting due to exceeding max temp files size
>
> Do you have a testing method I can apply to track that, without
> explain (analyze, buffers), before going to low-level monitoring?
>

We're looking at this...

> Architecture review
> ==============
>
> max_temp_files_size is used for the global space used per backend.
> Based on how work_mem works, I would expect something like "work_disk" to
> limit per file, and maybe a backend_work_disk (and yes, maybe a
> backend_work_mem?!) per backend.
> So I propose renaming the current GUC to something like backend_work_disk.
>

Done - 'work_disk' it is to match 'work_mem'.

> The patch is not large and is easy to read.
> I like the idea and it sounds useful.
>

Great! Cheers

Mark

Attachment Content-Type Size
temp-files-v3.patch.gz application/x-gzip 3.1 KB
