Re: Storing number '001' ?

From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: Charles Hauser <chauser(at)acpub(dot)duke(dot)edu>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Storing number '001' ?
Date: 2001-12-14 19:13:41
Message-ID: web-529002@davinci.ethosmedia.com
Lists: pgsql-novice

Chuck,

> >1. Are "Fasta" and "Sequence" the same thing? (further questions assume
> >this to be the case).
>
> For our purposes, yes. Strictly speaking however, fasta denotes a
> particular format the sequence is in:
>
> >894001A01.x1
> GATCGATCGCTACGTCAGAC
>
> is fasta formatted sequence, whereas:
>
> GATCGATCGCTACGTCAGAC
>
> is sequence.
>
> In TABLE clone_fasta.seq I store the latter, ie
> 'GATCGATCGCTACGTCAGAC'. If it helps, I can name TABLE clone_fasta
> TABLE clone_sequence?

No, don't rename your tables for my convenience. I was just confused
because you were using the word "fasta" in some places and "sequence" in
others.
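
To make sure I have the distinction right, here is a minimal Python sketch of how I understand it: a fasta record is a ">" header line carrying the clone id, followed by the bare sequence. The function name parse_fasta_record is my own invention for illustration, and I am assuming the id is the first whitespace-delimited word of the header.

```python
def parse_fasta_record(text):
    """Split one fasta-formatted record into (identifier, sequence)."""
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a fasta record must start with a '>' header line")
    header = lines[0][1:].split()
    if not header:
        raise ValueError("the header line carries no identifier")
    # Assumption: the identifier is the first word after '>'.
    identifier = header[0]
    # The sequence may be wrapped over several lines; join them back together.
    sequence = "".join(lines[1:])
    return identifier, sequence


clone_id, seq = parse_fasta_record(">894001A01.x1\nGATCGATCGCTACGTCAGAC\n")
print(clone_id, seq)
```

If that is right, then only the second value is what lives in clone_fasta.seq, and the first is just the key back to the clone.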

> >2. We established in the last e-mail that there is potentially more than
> >one clone related to each clone_fasta and to each clone_qual record, and
> >that no clone has more than one fasta or qual record. Is that still
> >true?
>
> I believe not. Let me answer this with an example, and go from
> there.
> I will simplify the clone id and not break it down into 6 fields.
>
>
> TABLE clone                     TABLE clone_fasta           TABLE clone_qual
>           clone                 seq                         qual
> record 1: 894001A01.x1 <------> GATCGATATATA..... <------> {9 9 9 23 34 45 ...}
> record 2: 894001A01.y1 <------> TTTTTTGATGAT..... <------> {3 4 6 9 14 34 21 ...}
> record 3: 894001A02.x1 <------> GTTTCACTAGCT..... <------> {8 5 15 31 24 7 ...}
>
>
> From the above example, which is universally true, I would state
> that:
>
> 1. one and only one clone relates to each clone_fasta and to each
> clone_qual.
> 2. no clone has more than one fasta or qual record.
>
> Stated another way, each clone has one and only one fasta(sequence),
> and one and only one qual.

So, three more questions:
1. Does more than one clone potentially relate to each fasta? Do we
care? (i.e. will we ever query the database for "Which clones relate to
sequence x?" Or do we receive data like "Clones 1999, 2001, and 2173
have sequence x"?)
2. Does clone_qual properly relate to clones, or to clone_fasta? From
your description, I can see things going either way.
3. The contigs: is a contig assembled out of clones, sequences (fasta)
or quals? I'm not clear on this. From your description, it seems like
a contig might actually represent 1-2 sequences(fastas or qual?), as
opposed to 1-2 clones.
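
For reference, here is a rough sketch of the strictest reading of your description: every clone has at most one fasta row and at most one qual row, and both hang off the clone rather than off each other. It is only one possible answer to the questions above, not a settled design. I wrote it in Python with an in-memory SQLite database purely so it runs standalone; the column names clone_id, seq and qual, and storing qual as plain text, are my assumptions, and the sample values are truncated from your example.

```python
import sqlite3

# SQLite (in memory) stands in for PostgreSQL so the sketch runs standalone.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

conn.executescript("""
CREATE TABLE clone (
    clone_id TEXT PRIMARY KEY            -- e.g. '894001A01.x1', kept as one field
);
CREATE TABLE clone_fasta (
    -- Using the clone id as the primary key allows one sequence per clone.
    clone_id TEXT PRIMARY KEY REFERENCES clone (clone_id),
    seq      TEXT NOT NULL
);
CREATE TABLE clone_qual (
    -- Same idea: one qual row per clone, tied to the clone, not to the fasta.
    clone_id TEXT PRIMARY KEY REFERENCES clone (clone_id),
    qual     TEXT NOT NULL               -- space-separated scores, for simplicity
);
""")

conn.execute("INSERT INTO clone VALUES ('894001A01.x1')")
conn.execute("INSERT INTO clone_fasta VALUES ('894001A01.x1', 'GATCGATATATA')")
conn.execute("INSERT INTO clone_qual VALUES ('894001A01.x1', '9 9 9 23 34 45')")


def second_fasta_rejected(connection):
    """Try to give the same clone a second sequence; report whether it failed."""
    try:
        connection.execute(
            "INSERT INTO clone_fasta VALUES ('894001A01.x1', 'TTTTTTGATGAT')"
        )
    except sqlite3.IntegrityError:
        return True
    return False


print(second_fasta_rejected(conn))
```

If the answer to question 1 turns out to be "yes, several clones can share a sequence", the primary key on clone_fasta would have to change, which is why I keep asking.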

> >3. In what order does the data arrive for your tables? I.e., is this an
> >accurate order of events:
> >(1) Clone data
> >(2) Sequence (Fasta?) data
> >(3) Qual data
> >(4) Contig data
> >(5) Library and Genebank data.
> >Is this accurate?
>
>
> More accurate to order them as:
>
> (5a) Library : is updated with each new project
> (1),(2),(3) : arrive simultaneously, but would be entered in the
> order you listed.
> (5b) Genbank : data submitted to Genbank after (1,2,3) are in hand
> (4)
> (6) blast

We'll hash this out eventually! I think we're still struggling with you
speaking biologist and me speaking DBA ...

-Josh

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh(at)agliodbs(dot)com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco
