Re: Idle git question: how come so many "objects"?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idle git question: how come so many "objects"?
Date: 2010-12-01 20:22:17
Message-ID: AANLkTinB4YVqKKt99O5HvjkHmH0OcjOt6U=Q7UYr8NML@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 1, 2010 at 2:08 AM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> On Wed, Dec 01, 2010 at 01:03:26AM -0500, Tom Lane wrote:
>> So I just made a commit that touched four files in all six active
>> branches, and I see:
>>
>> $ git push
>> Counting objects: 172, done.
>> Compressing objects: 100% (89/89), done.
>> Writing objects: 100% (89/89), 17.07 KiB, done.
>> Total 89 (delta 80), reused 0 (delta 0)
>> To ssh://git(at)gitmaster(dot)postgresql(dot)org/postgresql.git
>>    35a3def..8a6eb2e  REL8_1_STABLE -> REL8_1_STABLE
>>    cfb6ac6..b0e2092  REL8_2_STABLE -> REL8_2_STABLE
>>    301a822..0d45e8c  REL8_3_STABLE -> REL8_3_STABLE
>>    61f8618..6bd3753  REL8_4_STABLE -> REL8_4_STABLE
>>    09425f8..0a85bb2  REL9_0_STABLE -> REL9_0_STABLE
>>    c0b5fac..225f0aa  master -> master
>>
>> Now I realize that in addition to the four files there's a "tree" object
>> and a "commit" object, but that still only adds up to 36 objects that
>> should be created in this transaction.  How does it get to 172?  And
>> then where do the 89 and 80 numbers come from?
>
> IIRC, each directory also counts as an object. So if you change a file
> in a/b/c/d you get 5 commit objects, one for the file and four for the
> directories.

No, not 5 commit objects: 5 trees (for the four directories plus the
root directory), 1 blob (for the file), and 1 commit. From git(1):

The object database contains objects of three main types: blobs, which
hold file data; trees, which point to blobs and other trees to build up
directory hierarchies; and commits, which each reference a single tree
and some number of parent commits.

So in Tom's case I think we can account for 10 trees (the root
directory, src, src/backend, src/backend/executor,
src/backend/optimizer, src/backend/optimizer/util, src/test,
src/test/regress, src/test/regress/expected, src/test/regress/sql),
4 blobs (the files actually updated), and 1 commit: 15 objects per
branch * 6 branches = 90
objects. I'm not sure why the actual number is 89, unless perhaps two
of the post-commit regression test files were byte-for-byte identical
and got collapsed into a single object. I believe that "delta" refers
to the number of those objects that are stored as deltas against an
existing object (in essence, diffs) rather than as completely new
copies.
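
For anyone who wants to check the arithmetic, here is a rough sketch of
the counting rule described above: one new tree for every directory on
the path from each changed file up to the root, one blob per changed
file, and one commit. The file names are made up (the thread only names
the directories), and the function name is my own. It gives an upper
bound, since git stores byte-for-byte identical content only once.

```python
from pathlib import PurePosixPath


def count_new_objects(changed_files):
    """Estimate the objects one commit creates, per the rule above.

    This is an upper bound: identical blobs or trees are stored once,
    so the real number can be lower.
    """
    trees = set()
    for path in changed_files:
        # Every ancestor directory, ending with "." for the root,
        # gets a rewritten tree object.
        for parent in PurePosixPath(path).parents:
            trees.add(str(parent))
    return {
        "trees": len(trees),
        "blobs": len(set(changed_files)),
        "commits": 1,
    }


# Hypothetical file names; only the directories come from the thread.
TOM_FILES = [
    "src/backend/executor/file1.c",
    "src/backend/optimizer/util/file2.c",
    "src/test/regress/expected/file3.out",
    "src/test/regress/sql/file3.sql",
]

per_branch = count_new_objects(TOM_FILES)
total = sum(per_branch.values()) * 6  # six active branches
print(per_branch, total)
```

The same function applied to a single file under a/b/c/d gives 5 trees,
1 blob, and 1 commit, which matches the correction above.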

I have no idea where the 172 number comes from.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
