Pairwise alignment results: I need help understanding them, please

Viewing 8 reply threads
  • Author
    • #15541

      Hello Everybody,

      I hope everyone is doing well!

      I had to do a pairwise alignment of my tomato UBC gene with an Arabidopsis UBC gene. The website I used to do the alignment is … otide.html.
      The FASTA sequences for the two genes are as follows:
      Tomato UBC

      ctcttcttcc atttctttca aaattaaagt attgttactc tgctattggc tcaaaacctc
      tgcaatctcc gtctccttca atttcaactc aagcaaatcc acctctttca ctagtttcat
      cactttcaga tcagggtttg gagttgaagg tacggggggc taattgatgg cgtcgaagag
      gatattgaag gagctcaagg atctgcagaa ggatcccccc acatcatgca gtgctggtcc
      agtggcagag gatatgttcc attggcaagc aacaatcatg gggcctaccg atagccctta
      tgctggaggt gtatttttgg tttcaattca tttccctcca gattatcctt ttaagcctcc
      aaaggttgcc ttcagaacta aggttttcca tcccaacatc aacagcaatg gaagtatttg
      tctggatatt cttaaggagc agtggagtcc agcattaacc atatccaagg tcctgctgtc
      catctgctct ctgttgacag acccaaaccc agatgatcct cttgtacctg aaattgctca
      catgtacaag actgacaggg ccaaatacga aaccactgct cgtagctgga ctcagaaata
      tgcaatggga tgatgcgcaa aatgtctcca ggcatgtctg ggactttgta acagcaatgt
      cttatgtgct tggggtgaat gaataaattc cgtgaaagaa cttagttact tcttaatctc
      ccttcatgag ggttgttaag ggaacagctg ttttcaattt gtgaatattt atttgatgac
      tagtaaggga gaaactgcaa tgtaattcta ctttgtttgc cagtt

      Arabidopsis UBC


      The tomato UBC sequence is the complete CDS, which I got from NCBI GenBank (825 bp), and the Arabidopsis UBC sequence is the full-length cDNA, which I got from the TAIR website (621 bp).

      I tried attaching my results as a text file, but the website does not accept files with a .txt extension. I tried pasting my results into Word, but that messes up the formatting, so it would be easier if you could run the two sequences through the website's pairwise alignment tool that I mentioned above.

      In my case, my professor asked me to align my tomato gene with all the Arabidopsis UBC genes at the DNA level and at the phylogeny level to figure out which ones are most closely related to it. Once we have identified them, we have to look at the motifs and domains associated with those genes, and at any literature pertaining to them, to help us find ways to study our gene better. My problem is that I don't understand the results when I look at them; I cannot make sense of them. How should the alignment be read? What is meant by similarity and identity? And what can I deduce from this result? Please help me so that I know how to handle the rest of my genes and make sense of my work.

      Thanks to all of you,

    • #106908

      1) If you want to compare several genes and build a phylogeny, it's better to make a multiple alignment, because phylogeny programs won't accept 10 separate pairwise alignments 😉
      2) Is a "similarity" value really reported for a nucleotide alignment?
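To make "identity" concrete, here is a minimal sketch of how percent identity can be counted from two aligned rows. It assumes gaps are written as "-" and uses the full alignment length as the denominator; alignment tools may use a different denominator, so check the report header. The function name and the toy rows are made up for illustration and are not the tomato/Arabidopsis data. "Similarity" additionally counts mismatched pairs that score positively in the substitution matrix, which matters mostly for protein alignments.

```python
def identity_stats(aligned_a, aligned_b, gap="-"):
    """Return (identical, gaps, length, percent_identity) for two aligned rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    gaps = 0
    for x, y in zip(aligned_a.lower(), aligned_b.lower()):
        if x == gap or y == gap:
            gaps += 1          # column with a gap in either row
        elif x == y:
            identical += 1     # column with the same base in both rows
    length = len(aligned_a)
    percent = 100.0 * identical / length if length else 0.0
    return identical, gaps, length, percent


# Toy aligned rows (hypothetical, for illustration only).
row_a = "atggcgtc-gaag"
row_b = "atggcttcagaag"

identical, gaps, length, percent = identity_stats(row_a, row_b)
print(f"Identity: {identical}/{length} ({percent:.1f}%), gaps: {gaps}")
```

With real output, paste the two aligned rows (including the "-" characters) from the alignment report into `row_a` and `row_b`.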

    • #108891

      You can do the alignment (pairwise or multiple) using Clustal W. It will also give you a phylogenetic tree (rooted or unrooted) built with different algorithms (neighbour joining and UPGMA). Phylogenetic trees can also be constructed with PHYLIP, which uses a .phy file that can be exported from Clustal W.
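Distance-based tree methods such as neighbour joining and UPGMA start from a matrix of pairwise distances computed from the multiple alignment. The sketch below shows one simple way to get such a matrix, using the p-distance (the proportion of differing sites, skipping gapped columns). This is only an illustration of the idea, not what Clustal W or PHYLIP do internally; the sequence names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a, b, gap="-"):
    """Proportion of differing sites, ignoring columns with a gap in either row."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.lower(), b.lower()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


def distance_matrix(alignment):
    """Build a symmetric dict-of-dicts distance matrix from {name: aligned_row}."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for n1, n2 in combinations(names, 2):
        d = p_distance(alignment[n1], alignment[n2])
        dist[n1][n2] = d
        dist[n2][n1] = d
    return dist


# Toy multiple alignment (hypothetical names and sequences).
toy = {
    "tomato_UBC": "atggcgtcgaag",
    "ath_UBC_a": "atggcttcgaag",
    "ath_UBC_b": "atggcttcaaag",
}

dist = distance_matrix(toy)
for name, row in dist.items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

The sequence with the smallest distance to the tomato gene is the closest relative by this measure; a tree-building program then clusters all sequences from the full matrix.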

    • #109132

      Does anyone know how to export an alignment to Microsoft Word exactly as it is shown in BioEdit? I mean the whole alignment, with the "dots" where the amino acids are identical.



    • #109133

      Use File/Graphic View and then File/Export as Rich text.
      This way you can set the number of residues per line, the colors, etc.

    • #109410

      I tried using blastn to align the two sequences you gave, but the result was very poor, so I don't think you can get what you want.

      I also have a question: does anybody know what the '-dust' option means in ncbi-blast-2.2.25+/bin/blastn?

    • #109412

      I would expect something related to this

      Something more from BLAST book

      quote BLAST® Help:

      For nucleotide sequence data in FASTA files or BLAST database format, we can generate the mask information files using windowmasker or dustmasker. Windowmasker masks the over-represented sequence data and it can also mask the low complexity sequence data using the built-in dust algorithm (through the -dust option). To mask low-complexity sequences only, we will need to use dustmasker.

      more at part 5
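To give a feel for what "low complexity" means here, the sketch below scores a stretch of DNA by how often its overlapping triplets repeat. It is loosely modelled on the triplet-counting idea described for DUST, but it is a simplified illustration and not the dustmasker or blastn implementation. The function name, the scoring formula as written here, and the example windows are assumptions.

```python
from collections import Counter


def triplet_score(window):
    """Sum of c*(c-1)/2 over triplet counts c, divided by (number of triplets - 1).

    Higher values mean the window is dominated by a few repeated triplets,
    i.e. it has low sequence complexity.
    """
    window = window.lower()
    triplets = [window[i:i + 3] for i in range(len(window) - 2)]
    if len(triplets) < 2:
        return 0.0
    counts = Counter(triplets)
    repeats = sum(c * (c - 1) / 2 for c in counts.values())
    return repeats / (len(triplets) - 1)


low = "aaaaaaaaaaaaaaaaaaaa"    # homopolymer run: every triplet is "aaa"
mixed = "ctcttcttccatttctttca"  # first 20 bases of the tomato sequence in this thread

print("homopolymer:", triplet_score(low))
print("tomato start:", round(triplet_score(mixed), 3))
```

A masking tool would slide a window along the sequence and mask stretches whose score exceeds some threshold, so that they do not seed spurious hits.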

    • #109413

      I found a URL; maybe it can help you.

    • #109414

      These are human genes; yasamino is looking for plant genes 😉
