I don't know much about genetics. For an art project I'm looking for a human genome.
I read that: "Only about 0.1% of the genome is different among individuals, which equates to about 3 million variants (aka mutations) in the average human genome. This means we can make a “diff file” of just the places where any given individual differs from the normal “reference” genome. In practice, this is usually done in a .VCF file format, which in its simplest format looks something like so:
chr20 14370 rs6054257 G A 29 PASS 0|0
Where each line uses ~45 bytes, and you times this by the ~3 million variants in a given genome, and you get a .VCF file size of about 135,000,000 bytes or ~125 megabytes."
Is that correct?
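As a rough sanity check on the arithmetic in the quote, here is a minimal Python sketch. The names (`SAMPLE_LINE`, `estimate_vcf_size`) are made up for illustration, and the inputs are just the figures from the quote, not measurements from a real file.

```python
# Sample record copied from the quote (a simplified VCF-style line).
SAMPLE_LINE = "chr20 14370 rs6054257 G A 29 PASS 0|0"

BYTES_PER_LINE = 45      # per-line figure used in the quote (assumption)
VARIANTS = 3_000_000     # ~3 million variants, as stated in the quote


def estimate_vcf_size(bytes_per_line: int, variants: int) -> int:
    """Return the estimated uncompressed file size in bytes."""
    return bytes_per_line * variants


size_bytes = estimate_vcf_size(BYTES_PER_LINE, VARIANTS)
size_mb = size_bytes / 1_000_000        # decimal megabytes
size_mib = size_bytes / (1024 * 1024)   # binary mebibytes

# +1 accounts for the newline that ends each line in a file.
print(f"sample line: {len(SAMPLE_LINE) + 1} bytes including newline")
print(f"{size_bytes:,} bytes = {size_mb:.0f} MB = {size_mib:.1f} MiB")
# sample line: 38 bytes including newline
# 135,000,000 bytes = 135 MB = 128.7 MiB
```

So the multiplication in the quote holds, but 135,000,000 bytes is closer to 129 MiB than 125, and the sample line itself is shorter than 45 bytes. The quoted numbers are best read as a ballpark.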
Do you know of a database where I can download a VCF file or something similar?
It would be amazing if you could provide a link to a file like the one mentioned above!
Hi. I hope a bioinformatician will give you a full answer. What I have read is similar to what you read: about 0.1% difference between two individuals. This variation is called polymorphism, and it can be used to identify predisposition to diseases. On average, about 1,000 bases are identical before one base differs. This is only an average, since there are hypervariable DNA regions that are used, for instance, in forensic biology to identify people. Nevertheless, I have also heard that (with the exception of identical twins?) no two individuals have the same DNA sequence, and there never will be. I believe that between chimpanzees and humans about 100 bases are identical before one base differs, which is also surprising! I have always been puzzled by the fact that DNA sequences are so similar between two humans and yet there are so many differences between people, particularly at the brain level. It probably means that genetics is only a part of what we are and that what we acquire is also very important (of course many bright minds have said and written this before me!), and that the differences may be due to a restricted number of genes.