GENOME
In the early 1990s, scientists defined the 'genome' as the complete set of genes of an organism. Later, with the discovery of non-coding, repetitive and transposable elements, the word began to lose its original meaning. The content of the genome in a eukaryotic cell is now described under three terms, viz. 'genome', 'transcriptome' and 'proteome'. Although the original definition of 'genome' still holds for prokaryotic cells, for eukaryotic cells the term is used in a more general sense, as 'the complete sequence of DNA of an organism'.
One cannot determine the full content of the genome from either the total mRNA content of the cell or the polypeptide chains of its proteins. There are three reasons. First, not all genes are expressed uniformly at all times in all cell types of the body. Second, not all RNAs are used for protein synthesis: in the mouse, for example, with a genome size of 2.5 billion base pairs, about 60% of the genome is used for RNA synthesis, and of that less than 5% is used in protein synthesis. Third, during co- and post-translational processing, the intact protein is trimmed or cleaved into smaller fragments. For these reasons, the term 'genome' is used in the more general sense of 'the complete sequence of DNA of an individual'.
The term 'transcriptome' describes 'the complete set of genes/RNAs expressed at one particular time in one particular cell system'. It includes the sum of all types of RNA, such as mRNA, tRNA, rRNA and the RNAs of snurps (snRNPs, small nuclear ribonucleoprotein particles) and scyrps (scRNPs, small cytoplasmic ribonucleoprotein particles), found at one particular time in one cell system, and the term can also be extended to the level of the whole organism. The 'proteome', on the other hand, describes the complete set of polypeptides found in a cell. Although the proteome is always smaller than the transcriptome, there should be a close correlation between the two under any particular condition.
Studies of DNA reassociation kinetics reveal that the eukaryotic genome is made up of three types of sequences:
1. Highly repetitive
2. Moderately repetitive
3. Non-repetitive
Most sequences of categories 1 and 2 do not code for protein, and these repetitive classes are essentially absent from prokaryotic genomes. Hence, eukaryotic genomes are said to be more complex than prokaryotic genomes.
DNA reassociation kinetics: When denatured DNA is allowed to renature, the reaction proceeds at three different paces: highly repetitive sequences reassociate very fast (fast component), followed by moderately repetitive sequences (intermediate component) and finally the unique sequences (slow component), as shown in the figure. With C the concentration of single-stranded DNA remaining at time t, C0 the initial concentration and k the reassociation rate constant, the course of the reaction is described by the formula
C/C0 = 1/(1 + k C0 t)    (equation 1)
At half completion, when t = t1/2, equation 1 becomes
C/C0 = 1/2 = 1/(1 + k C0 t1/2)
so that 1 + k C0 t1/2 = 2, and therefore
C0t1/2 = 1/k
Since C0t1/2 is the product of the initial concentration and the time required for the reaction to proceed halfway, a greater C0t1/2 implies a slower reaction. C0t is expressed in nucleotide-moles x seconds/litre.
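The relationship between equation 1 and C0t1/2 can be checked numerically. The following is a minimal Python sketch and not part of the original material; the function names and the value of the rate constant are illustrative assumptions.

```python
def fraction_single_stranded(c0t, k):
    """Fraction of DNA still single-stranded: C/C0 = 1 / (1 + k * C0t) (equation 1)."""
    return 1.0 / (1.0 + k * c0t)


def cot_half(k):
    """C0t value at which half of the DNA has reassociated: C0t1/2 = 1 / k."""
    return 1.0 / k


# Hypothetical rate constant, in litre / (nucleotide-mole x second).
k_example = 0.25

half_point = cot_half(k_example)
remaining = fraction_single_stranded(half_point, k_example)

# At C0t = C0t1/2 the remaining single-stranded fraction should be one half.
print("C0t1/2 =", half_point)
print("C/C0 at C0t1/2 =", remaining)
```

A smaller k gives a larger C0t1/2, which is the numerical form of the statement that a greater C0t1/2 means a slower reaction.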
If a sequence is present in fewer copies, its reassociation is slower and its C0t1/2 value is higher; a genome made up mostly of such sequences is said to be more complex, and vice versa.
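The effect of copy number can be illustrated by treating the genome as a mixture of independently reassociating components, each following equation 1 with its own C0t1/2. This is a simplified Python sketch under that assumption; the function name, the component fractions and the C0t1/2 values are invented for illustration and are not taken from the figure or from any real genome.

```python
def mixed_fraction_single_stranded(c0t, components):
    """Remaining single-stranded fraction for a mixture of components.

    components is a list of (fraction_of_genome, cot_half) pairs whose
    fractions add up to 1. Each component follows equation 1 with
    k = 1 / cot_half, so its contribution is fraction / (1 + c0t / cot_half).
    """
    return sum(frac / (1.0 + c0t / half) for frac, half in components)


# Hypothetical genome: all numbers below are made up for illustration.
example_components = [
    (0.25, 0.001),   # highly repetitive sequences (fast component)
    (0.25, 1.0),     # moderately repetitive sequences (intermediate component)
    (0.50, 1000.0),  # non-repetitive sequences (slow component)
]

# Tabulate the curve over several orders of magnitude of C0t.
for c0t in (0.0001, 0.001, 1.0, 1000.0, 1000000.0):
    remaining = mixed_fraction_single_stranded(c0t, example_components)
    print(c0t, round(remaining, 3))
```

Plotting the remaining fraction against the logarithm of C0t for such a mixture would give a curve with three steps, one for each component, with the sequences present in the fewest copies reassociating last.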
Genome complexity can be easily calculated by using the formula:
C0t1/2 (DNA of any genome) / C0t1/2 (E. coli genome) = (complexity of any genome) / 4,200,000 bp
where the E. coli genome, with a size of 4,200,000 bp and a C0t1/2 value of 9, is used as the standard because it consists of 100% unique sequences.
Therefore,
Complexity of genome x (in bp) = (C0t1/2 of x x 4,200,000) / 9
One can also calculate the complexity of other genomes by using the C0t1/2 values from the table above.
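The calculation can be written as a short Python sketch. The E. coli genome size and C0t1/2 value are the ones stated in the text; the function name and the example C0t1/2 of 900 are hypothetical and chosen only to show the arithmetic.

```python
E_COLI_GENOME_BP = 4_200_000  # E. coli genome size used as the standard in the text
E_COLI_COT_HALF = 9.0         # C0t1/2 of E. coli DNA as given in the text


def genome_complexity_bp(cot_half):
    """Complexity in bp, scaled from the E. coli standard.

    complexity = C0t1/2 of the genome x 4,200,000 / 9
    """
    return cot_half * E_COLI_GENOME_BP / E_COLI_COT_HALF


# Hypothetical C0t1/2 value, not a measurement for any real genome.
example_cot_half = 900.0
print(genome_complexity_bp(example_cot_half), "bp")
```

Because the relationship is a simple proportion, a C0t1/2 one hundred times that of E. coli corresponds to a complexity one hundred times 4,200,000 bp.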
References:
1. Benjamin Lewin, Genes V and Genes X
2. http://mol-biol4masters.org/DNA_C_Value_Paradox2-Hybridization_Kinetics_files/image007.jpg
3. http://www.ucl.ac.uk/~ucbhjow/b241/images/cot_curves.jpg