Prevalence of gene
- This topic has 8 replies, 6 voices, and was last updated 9 years, 3 months ago by merv.
September 23, 2010 at 10:21 am #13808
marifly81
Participant
I want to find out how common my gene is within Bacteria. I have started BLASTing my gene against all sequenced bacterial genomes, but of course this is time-consuming, as I check every species manually. Are there other ways besides BLAST?
Help would be greatly appreciated! 😀
-
September 27, 2010 at 11:18 am #101510
bioinfo
Participant
You can use whole-genome alignment tools such as:
MUMmer (Maximal Unique Match; http://www.tigr.org/tigr-scripts/CMR2/webmum/mumplot)
BLASTZ (http://bio.cse.psu.edu/)
LAGAN (Limited Area Global Alignment of Nucleotides; http://lagan.stanford.edu/)
PipMaker (http://bio.cse.psu.edu/cgi-bin/pipmaker?basic)
MAVID (http://baboon.math.berkeley.edu/mavid/)
GenomeVista (http://pipeline.lbl.gov/cgi-bin/GenomeVista)
If you want more information, refer to the book Essential Bioinformatics by Jin Xiong (Texas A&M University).
-
March 11, 2011 at 9:38 am #103859
xav2121
Participant
Hi marifly81!
I’m also interested in determining the prevalence of a given gene (or protein) in sequenced bacterial genomes.
I have tried the links provided by bioinfo with no success. Have you found a way to do this?
-
March 21, 2011 at 11:05 am #104084
JackBean
Participant
Why do you need to know exactly whether it is in any bacterial genome?
I think you should be able to sort your results by species at PubMed, or to view some kind of tree.
-
March 25, 2011 at 10:26 am #104162
xav2121
Participant
I still haven’t found a way to get the info I want. I basically want to know whether a given gene is present or absent in a species. The output I would like to see would be "this gene is found in 90% of the members of this genus, or taxon, or family". I’ve tried a few things, like running a BLAST search against the sequenced genomes in NCBI and getting a taxonomy report. The output tells you how many hits were found for a given species, but only if there is a hit. So I still don’t know in which species the gene is absent…
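Editor's sketch of the missing step described here: a taxonomy report lists only the species with hits, so absence has to be worked out against a separate list of every genome that was searched. The Python below is a minimal illustration under that assumption. The function names, genome IDs and genus labels are all made up, and it presumes you already have a list of (genome, genus) pairs and the set of genomes with at least one hit you accept.

```python
from collections import defaultdict


def prevalence_by_genus(all_genomes, genomes_with_hit):
    """Return {genus: (n_with_hit, n_total, fraction_with_hit)}.

    all_genomes: list of (genome_id, genus) for EVERY genome searched.
    genomes_with_hit: set of genome_ids with at least one accepted hit.
    """
    totals = defaultdict(int)
    present = defaultdict(int)
    for genome_id, genus in all_genomes:
        totals[genus] += 1
        if genome_id in genomes_with_hit:
            present[genus] += 1
    return {g: (present[g], totals[g], present[g] / totals[g]) for g in totals}


def absent_genomes(all_genomes, genomes_with_hit):
    """Genomes that were searched but produced no accepted hit."""
    return sorted(gid for gid, _ in all_genomes if gid not in genomes_with_hit)


# Hypothetical data, for illustration only.
genomes = [
    ("g1", "Escherichia"),
    ("g2", "Escherichia"),
    ("g3", "Bacillus"),
    ("g4", "Bacillus"),
]
hits = {"g1", "g2", "g3"}

summary = prevalence_by_genus(genomes, hits)
for genus, (n_hit, n_total, frac) in sorted(summary.items()):
    print(f"{genus}: found in {n_hit}/{n_total} genomes ({frac:.0%})")
print("No hit in:", absent_genomes(genomes, hits))
```

The same grouping works for family or any other rank if the second element of each pair is swapped for that label. As noted later in the thread, "no hit" only means absent among the genomes that were actually sequenced and searched.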
-
March 28, 2011 at 7:26 am #104186
JackBean
Participant
The question is whether the other taxa have not been sequenced yet or whether there is really no homologue. I guess you need to narrow your search to only a few taxa and try them manually, or you can always download all the sequences from NCBI and run BLAST at home, if you have a spare computer 😀
-
June 16, 2011 at 7:23 pm #105299
nfellaby
Participant
My method for BLASTing large numbers of organisms was to construct a database from the SEED Network: download all the genomes of interest, save them as a single FASTA file, and then BLAST the gene of interest against this file. This was all done on BioLinux and runs in Perl. I managed to get large numbers of hits with probability indicators. I still have all the syntax to hand, so get in touch if I can help.
Nick
-
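Editor's sketch of nfellaby's approach above (one combined FASTA file, then BLAST against it). This is not Nick's Perl code, which is not shown in the thread. It is a rough Python equivalent that assumes the NCBI BLAST+ command-line tools (makeblastdb, blastn) are installed, that the genomes are nucleotide FASTA files, and that the query is a nucleotide sequence. All file names are hypothetical. The script only builds the command lines, so it can be read and run without BLAST present.

```python
import tempfile
from pathlib import Path


def combine_fasta(fasta_paths, combined_path):
    """Concatenate FASTA files into one, prefixing each header with the
    source file name so a hit can be traced back to its genome."""
    n_records = 0
    with open(combined_path, "w") as out:
        for path in fasta_paths:
            path = Path(path)
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        out.write(f">{path.stem}|{line[1:].rstrip()}\n")
                        n_records += 1
                    elif line.strip():
                        out.write(line.rstrip() + "\n")
    return n_records


def blast_commands(combined_path, query_path, db_name, out_path, evalue=1e-5):
    """Build (but do not run) the BLAST+ commands for a local search."""
    make_db = ["makeblastdb", "-in", str(combined_path),
               "-dbtype", "nucl", "-out", str(db_name)]
    search = ["blastn", "-query", str(query_path), "-db", str(db_name),
              "-evalue", str(evalue), "-outfmt", "6", "-out", str(out_path)]
    return make_db, search


# Tiny made-up genomes, written to a temporary directory for the demo.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "genomeA.fasta").write_text(">contig1 example\nACGTACGT\n")
    (tmp / "genomeB.fasta").write_text(">contig1 example\nTTGACCAA\n")
    combined = tmp / "all_genomes.fasta"
    n_records = combine_fasta(sorted(tmp.glob("genome*.fasta")), combined)
    combined_text = combined.read_text()
    make_db, search = blast_commands(
        combined, tmp / "my_gene.fasta", tmp / "all_genomes_db", tmp / "hits.tsv"
    )

print(n_records, "records combined")
print(" ".join(make_db))
print(" ".join(search))
```

To actually run the search, each command list could be passed to subprocess.run(cmd, check=True). For a protein query against nucleotide genomes, tblastn would replace blastn. The E-value of 1e-5 is an arbitrary starting point, not a recommendation from the thread.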
June 23, 2011 at 7:06 am #105386
JackBean
Participant
You might be interested in this article:
http://www.plosone.org/article/info%3Ad … ne.0020892
-
October 2, 2011 at 9:01 pm #106560
merv
Participant
As far as I understand it:
What you are looking at is evolutionarily related genes. The problem then depends entirely on where you draw your lines: if you play with the BLAST scores, you will get different results. Many of the tools on the net, such as BLAST at NCBI, will give you the result if you perform a BLAST (I prefer PSI-BLAST) and then set your level.
Type your gene sequence into the NCBI home page; this gives you a huge number of references. Click on UniGene (note the number of links that have already been assigned).
This takes you to a page with: SELECTED PROTEIN SIMILARITIES
Comparison of cluster transcripts with RefSeq proteins. The alignments can suggest function of the cluster.
Click on the top link that matches your protein, and in the submenu I suggest protein/protein matches, which has done the BLAST for you.
In my example of CD40, I get a list of matches with the organism each sequence came from. As far as "BLink" goes, it is very simple to then write all of the species down. I get as far as Cricetulus griseus, the Chinese hamster, and its Tumor necrosis factor receptor superfamily member 22. As I know that CD40 is in this family, I can trust it. If I get a bacterial sequence, I know I can't. Such a thing occurs when you don't want to use BLink but try to find previously unidentified connections; then you enter a grey area occupied by people who don’t like being asked to nail jelly to plates but admit that some people are better at it than others.
If you are not happy with the BLink data and want to challenge it, you can do your own BLAST: click the same link in UniGene, go to the protein, select BLAST, and then choose (in this example) the PSI-BLAST option (the default BLAST is also good), leave everything else the same, and run the alignment. Once you are familiar with that, you need to consider how many matches you want to ask for; how much of the server's time you use is worth bearing in mind. You can adjust the choice of matrix (I am not sure of the differences, but they will of course give you different answers) and you can adjust the stringency using the gap-penalties section. You can generate a lifetime's work on one gene alone with the different options; it helps to have a guide if you do this work. PSI-BLAST and PHI-BLAST allow greater analysis through repeated iterations, each based on the previous analysis (including the new data each time, as I understand it).
The EXPECT threshold is the cut-off for reporting matches; raising it relaxes the search if you aren't getting any results. Now that we are in 2011, this is rarely required, as we have so much sequence data and most genes have homologs and have, in effect, now been sequenced in most organisms. In my example, I have just found a fish match to the CD40 gene, most interesting! From the data I have discussed so far this would not be expected, but we know that CD40 exists in fish. So the question is: are you happy with what BLink gives you (humans and Chinese hamsters are related), or would you have wanted to get as far as fish in the analysis? I expect you have a good understanding of bacteria; I don't, hence my discussion of a mammalian gene, but the same approach still applies. If you want to limit the size of your study, just lower the EXPECT threshold.
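Editor's sketch of the "where you draw your lines" point: the same set of hits gives a different answer depending on the EXPECT (E-value) cut-off. The Python below assumes BLAST tabular output with the default twelve columns (as produced by -outfmt 6 in BLAST+). The hit lines are invented for illustration and are not real results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_tabular(handle):
    """Read tab-separated BLAST hits into a list of dicts."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        hits.append(hit)
    return hits


def subjects_passing(hits, max_evalue):
    """Subject sequences with at least one hit at or below the cut-off."""
    return {h["sseqid"] for h in hits if h["evalue"] <= max_evalue}


# Made-up hits: one strong, one good, one marginal, one probably spurious.
example = (
    "myGene\tgenomeA\t98.5\t900\t13\t0\t1\t900\t1200\t2099\t0.0\t1600\n"
    "myGene\tgenomeB\t71.2\t850\t240\t5\t20\t870\t300\t1149\t2e-20\t105\n"
    "myGene\tgenomeC\t45.0\t120\t60\t6\t400\t520\t10\t129\t0.003\t41.5\n"
    "myGene\tgenomeD\t38.0\t50\t29\t2\t10\t60\t77\t126\t5.1\t28.2\n"
)
hits = parse_tabular(io.StringIO(example))

for cutoff in (1e-10, 0.01, 10):
    found = sorted(subjects_passing(hits, cutoff))
    print(f"E-value <= {cutoff:g}: {len(found)} genomes -> {found}")
```

A lower cut-off keeps fewer, more convincing matches, and a higher one lets in marginal hits, which is why a prevalence figure should always be reported together with the threshold used.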
-