Quick Links

Re: sum of all values

From:	Richard Huxton <dev(at)archonet(dot)com>
To:	Madison Kelly <linux(at)alteeve(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: sum of all values
Date:	2005-01-14 18:34:14
Message-ID:	41E810A6.7040101@archonet.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Madison Kelly wrote:
> Richard Huxton wrote:
>
>> Madison Kelly wrote:
>>
>>> Hi all,
>>>
>>> Is there a fast(er) way to get the sum of all integer values for a
>>> certain condition over many thousands of rows? What I am currently
>>> doing is this (which takes ~5-10sec.):
>>
>> OK, I'm assuming you've configured PG to your satisfaction and this is
>> the only query giving you problems.
>
> This is a program for general consumption (hopefully... eventually...)
> so I want to leave the psql config alone. Once I am happier with the
> program I will try different tuning options and write a faq though I
> expect 9 out of 10 users won't read it.

PostgreSQL is not FireFox, and you can't expect it to work efficiently
without doing at least some configuration. The settings to support 100
simultaneous connections on a dual-Opteron with 8GB RAM are not the same
as on a single-user laptop.
Take half an hour to read through the performance-tuning guide here:
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

>>> SELECT SUM (a.file_size) FROM file_info_1 a, file_set_1 b WHERE
>>> a.file_name=b.fs_name AND a.file_parent_dir=b.fs_parent_dir AND
>>> a.file_type=b.fs_type AND b.fs_backup='t';
>>
>> You'll want to run EXPLAIN ANALYSE SELECT SUM... and post the output
>> of that, although the query looks straightforward enough.
>
> tle-bu=> EXPLAIN ANALYZE SELECT SUM (a.file_size) FROM file_info_1 a,
> file_set_1 b WHERE a.file_name=b.fs_name AND
> a.file_parent_dir=b.fs_parent_dir AND a.file_type=b.fs_type AND
> b.fs_backup='t';
>
> QUERY PLAN
> ----------------------------------------------------------------
> Aggregate (cost=2202.54..2202.54 rows=1 width=8) (actual
> time=5078.744..5078.748 rows=1 loops=1)
> -> Merge Join (cost=724.94..2202.51 rows=11 width=8) (actual
> time=3281.677..4969.719 rows=12828 loops=1)
> Merge Cond: (("outer".file_parent_dir = "inner".fs_parent_dir)
> AND ("outer".file_name = "inner".fs_name) AND ("outer".file_type =
> "inner".fs_type))
> -> Index Scan using file_info_1_search_idx on file_info_1 a
> (cost=0.00..1317.11 rows=12828 width=104) (actual time=0.042..116.825
> rows=12828 loops=1)
> -> Sort (cost=724.94..740.97 rows=6414 width=96) (actual
> time=3281.516..3350.640 rows=12828 loops=1)
> Sort Key: b.fs_parent_dir, b.fs_name, b.fs_type
> -> Seq Scan on file_set_1 b (cost=0.00..319.35
> rows=6414 width=96) (actual time=0.029..129.129 rows=12828 loops=1)
> Filter: (fs_backup = true)
> Total runtime: 5080.729 ms

Well, it's slow, but that's probably your settings. Run VACUUM ANALYSE
on the tables though, it looks like you've got default statistics (It's
expecting exactly 1/2 the fs_backup values to be true - 6414 out of 12828).

>>> Here are the schemas, in case they help:
>>>
>>> tle-bu=> \d file_info_1 Table "public.file_info_1"
>>> Column | Type | Modifiers
>>> -----------------+---------+----------------------------
>>> file_acc_time | bigint | not null
>>> file_group_name | text | not null
>>> file_group_uid | integer | not null
>>> file_mod_time | bigint | not null
>>> file_name | text | not null
>>> file_parent_dir | text | not null
>>> file_perm | text | not null
>>> file_size | bigint | not null
>>> file_type | text | not null default 'f'::text
>>> file_user_name | text | not null
>>> file_user_uid | integer | not null
>>> Indexes:
>>> "file_info_1_display_idx" btree (file_parent_dir, file_name,
>>> file_type)
>>> "file_info_1_search_idx" btree (file_parent_dir, file_name,
>>> file_type)
>>>
>>> tle-bu=> \d file_set_1 Table "public.file_set_1"
>>> Column | Type | Modifiers
>>> ---------------+---------+----------------------------
>>> fs_backup | boolean | not null default true
>>> fs_display | boolean | not null default false
>>> fs_name | text | not null
>>> fs_parent_dir | text | not null
>>> fs_restore | boolean | not null default false
>>> fs_type | text | not null default 'f'::text
>>> Indexes:
>>> "file_set_1_sync_idx" btree (fs_parent_dir, fs_name, fs_type)
>>
>>
>>
>> 1. WHERE ARE YOUR PRIMARY KEYS???
>> 2. Why do you have two identical indexes on file_info_1
>> 3. WHERE ARE YOUR PRIMARY KEYS???
>> 4. Am I right in thinking that always, file_name==fs_name (i.e. they
>> represent the same piece of information) and if so, why are you
>> storing it twice? Same for _parent_dir too
>> 5. file_type/fs_type are being held as unbounded text? Not an index
>> into some lookup table or a varchar(N)?
>>
>> Can you explain what you're trying to do here - it might be you want
>> to alter your database design.
>> --
>> Richard Huxton
>> Archonet Ltd
>
> This is where I have to admit my novice level of knowledge. Until now
> I have been more concerned with "making it work". It is only now that I
> have finished (more or less) the program that I have started going back
> and trying to find ways to speed it up. I have not used postgres (or
> perl or anything) before this program. I hope my questions aren't too
> basic. ^.^;

There's a rule of thumb about throwing the first version of anything
away. This could well be the time to apply that. I'd recommend getting
book, "An Introduction to Database Systems" by "C.J.Date". It's not an
SQL or "Learn X in 24 hours" but there's plenty of those about and
you've managed to pick up SQL/Perl already. It will explain relational
theory and why it's useful to you.

> I keep hearing about Primary Keys but I can't say that I know what
> they are or how they are used. If I do understand, it is a way to
> reference another table's entry (using a foreign key)? The two matching
> indexes is a typo in my program that I hadn't noticed, I'll fix that asap.

OK - here are a few rules-of-thumb you might find useful until you've
read the book.

1. Every piece of information should be represented explicitly.
If there is an order for your data, it should be based on values already
present, or introduce an explicit "sort_order" column.
2. Every piece of information (row) should be uniquely identifiable. The
column value(s) that uniquely identify a row are known as a key. If
there are several keys pick one - that is your "primary key".
3. Every non-key column in a table should depend on the key and nothing
but the key.
4. Avoid repeating information - you can do this by following points 2,3.
5. Avoid inconsistencies - again 2,3 will help here.

Looking at file_info_1, you have no primary key. This means you can have
two rows with the same (file_parent_dir, file_name) - probably not what
you want. Since these uniquely identify a file on a partition (afaik),
you could make them your primary key.

Also, you have two columns user_uid, user_name. If user_uid is the
file's owner and user_name is their name then user_name doesn't depend
on the primary key, but on user_uid. If one file has uid=123 and
name="Fred" then *all* files with uid=123 will have an owner with name
"Fred".

So - this goes into a separate table:
CREATE TABLE user_details (
uid int4 NOT NULL UNIQUE,
name text,
PRIMARY KEY (uid)
);

Then, in file_info_1 you remove user_name, and make user_uid reference
user_details.uid so that you can't enter an invalid user number. If you
need the name, just join the two tables on user_uid=uid. The term
"foreign key" is used because you're referencing the key of a "foreign"
table.

> Here is what the database is used for:
>
> This is a backup program and I use the DB to store extended
> information on all selected files and directories on a partition. Each
> partition has it's own 'file_info_#' and 'file_set_#' tables where '#'
> matches the ID stored for that partition in the DB in another table.
>
> The 'file_info_#' table stored the data that can change such as file
> size, last modified/accessed, owing user and group and so forth. The
> 'file_set_#' table stores the flags that say to include or exclude it
> from a backup/restore job and whether it has been selected for display
> in the file browser.

I don't see how you'd flag a whole directory for backup with what you've
got, but maybe I'm missing something.
I'd separate the information into three tables:
file_core (id, path, name)
file_details (id, size, last_mod, etc)
file_backup (id, backup_flag, display_flag, etc)
Define file_core.id as a SERIAL (auto-generated number) and make it the
primary key. Define a unique constraint on file_core.(path,name). This
lets you have a simple number referencing file_core from the other two
tables.
Now, if I file gets updated you only alter file_details, and if the user
decides to flag more/less files then you only change file_backup.

> In the first iteration I -used- to have the data in a single table and
> I identified the partition with a column called 'file_in_id' (or
> something similar). As I looked at each file on the system I would do a
> db call to see if the entry existed and if so, update it and if not,
> insert it. This was horribly slow though so I decided to break out into
> the schema above.

Probably the wrong choice. Keep your design clean and simple for as long
as you can, only mangle it once you know you've hit the limitations of
the database server. It might be you hit that, but since you haven't
done any tuning, probably not.

> With the schema above what I do now is just drop the 'file_info_#'
> table, recreate the table and matching indexes and then do a mass 'COPY'
> of all the file info on the partition. After this is done I read in the
> new data from the reloaded 'file_info_#' table and sync the data in
> 'file_set_#' which removes entries no longer in 'file_info_#', adds new
> ones matching the parent's values and leaves the existing entries alone.
>
> I found droping the table and re-creating it a lot faster than a
> 'DELETE FROM' call and it also seems to have made 'VACUUM FULL' a lot
> faster.

The VACUUM FULL is faster because it's not doing anything - the new data
is in a brand new table. Make sure you ANALYSE the new table though.

> Thank you very much for your feedback! I hope I haven't done something
> -too- foolish. :p If I have, I will change it.

No foolishness, just inexperience. Go forth and get some books that
cover relational theory. A day spent on the principles will save you a
week of work later.

Good Luck!
--
Richard Huxton
Archonet Ltd

In response to

Re: sum of all values at 2005-01-14 15:37:38 from Madison Kelly

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Greg Stark	2005-01-14 18:47:34	Re: which dual-CPU hardware/OS is fastest for PostgreSQL?
Previous Message	Adrian Holovaty	2005-01-14 18:32:12	Index on a function and SELECT DISTINCT