I’m performing a computational analysis of mutations and am currently trying to build a data set of specific mutations that I found in papers.
The first group of mutations is defined by: (1) gene name (HGNC symbols, such as "LDLR" or "BRCA1") and (2) the cDNA-level (c.) nomenclature (such as c.2389G>T or c.313+6T>C).
The second group is defined by: (1) gene name (again HGNC symbols), (2) the base change (such as G>A), (3) intron number and (4) position relative to the intron boundary (such as +6 or -2).
I wish to find the genomic coordinate of each mutation (in hg19 coordinates).
In both cases I run into an ambiguity, because many genes have several transcripts (i.e. more than one cDNA per gene), and neither paper supplies a transcript ID, only the gene name. 😐
Is there a common convention for interpreting such mutations (for example, looking only at the longest transcript)? In addition, assuming I have only a single transcript, which tables do you suggest using to fetch the exon/intron positions within the transcript/cDNA?
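To make the second group concrete, here is a minimal sketch of the mapping I have in mind once a single transcript has been chosen. It assumes exon coordinates in the style of UCSC gene-prediction tables (0-based exon starts, 1-based exon ends, exons listed in ascending genomic order whatever the strand). The function name `intron_offset_to_genomic` and the toy coordinates are hypothetical, and introns are numbered in transcript order, which may not match the numbering used in a given paper.

```python
def intron_offset_to_genomic(exon_starts, exon_ends, strand, intron_number, offset):
    """Return the 1-based genomic position of an intronic site.

    exon_starts   -- 0-based exon start coordinates, ascending genomic order
    exon_ends     -- 1-based exon end coordinates, same order
    strand        -- "+" or "-"
    intron_number -- 1-based intron index in transcript order
    offset        -- +k counts from the preceding exon (donor side),
                     -k counts back from the following exon (acceptor side)
    """
    n = len(exon_starts)
    if len(exon_ends) != n:
        raise ValueError("exon_starts and exon_ends must have the same length")
    if not 1 <= intron_number < n:
        raise ValueError("a transcript with n exons has introns 1..n-1")
    if offset == 0:
        raise ValueError("offset must be non-zero (+1 / -1 are the first intronic bases)")

    if strand == "+":
        if offset > 0:
            # Last base of exon i (1-based) plus k.
            return exon_ends[intron_number - 1] + offset
        # First base of exon i+1 (1-based) minus k.
        return exon_starts[intron_number] + 1 + offset

    if strand == "-":
        # Transcript exon i is genomic exon n - i (0-based index) on the minus strand.
        if offset > 0:
            # The donor side of transcript exon i is its lowest genomic base.
            return exon_starts[n - intron_number] + 1 - offset
        # The acceptor side of transcript exon i+1 is its highest genomic base.
        return exon_ends[n - intron_number - 1] - offset

    raise ValueError("strand must be '+' or '-'")


# Toy transcript with three exons (hypothetical coordinates, not a real gene).
starts = [100, 300, 500]
ends = [200, 400, 600]
plus_donor = intron_offset_to_genomic(starts, ends, "+", 1, 6)
minus_acceptor = intron_offset_to_genomic(starts, ends, "-", 1, -2)
```

Note that for a minus-strand gene the bases reported in the paper (e.g. G>A) are given on the transcript strand, so they would need to be complemented to be expressed on the genomic forward strand.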