Blog Archives

Simple DNA saturation plots in R

4/1/2015

I'm currently working with a Sanger-generated 10-gene dataset, which includes a few fast-evolving genes. During the initial data exploration phase I used saturation plots to check for potential decay of phylogenetic signal caused by multiple substitutions (saturated nucleotide variation).

There are a lot of ways to visualize saturation. One method, which I believe was originally used by Philippe et al. (1994), is to plot the raw (uncorrected) pairwise genetic distances in an alignment against model-corrected genetic distances. If the relationship is approximately linear, the gene shows little evidence of saturation; if the points curve away from a straight line or plateau, there is evidence of saturation, because the raw distances have stopped increasing while the corrected distances continue to grow.
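To see why the curve bends, it can help to look at the simplest correction, Jukes-Cantor (JC69). This is only an illustrative stand-in and not the model used in this post; the variable names are my own. Under JC69 the corrected distance is d = -(3/4) ln(1 - 4p/3), where p is the raw proportion of differing sites, so p can never exceed 0.75 no matter how large d gets. A minimal sketch in base R:

```r
# Illustration only: JC69 stands in for "a model-corrected distance".
# p = raw (uncorrected) proportion of differing sites between two sequences
p <- seq(0, 0.74, by = 0.01)

# JC69 correction: estimated substitutions per site implied by raw distance p
d_jc69 <- -0.75 * log(1 - 4 * p / 3)

# Same axes as a saturation plot: corrected distance on x, raw distance on y
plot(d_jc69, p, type = "l", lwd = 2,
     xlab = "JC69 distance", ylab = "Uncorrected genetic distance",
     main = "Expected saturation curve under JC69")
abline(0, 1, lty = 2)  # 1:1 line: what we would see with no multiple substitutions
```

The curve tracks the dashed 1:1 line at small distances and then flattens toward 0.75, which is the shape to watch for in real data.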

Here is an example of an unsaturated gene:

And an example of a saturated gene:

This is a really rough method, which should probably be used only as a preliminary exploration of your data. As far as I know, there is no established slope threshold that says definitively, "yes, this gene is saturated." However, I do think it's a useful thing to look at, and it's really easy to do in R. You may want to look at the documentation for ape's dist.dna function for all of the available substitution models. Here is the R code I used to make these simple plots:

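library(ape)

### Input data: a PHYLIP-format (sequential) alignment file, read in as a 'DNAbin' object ###
dat <- read.dna(file = "myData.phy", format = "sequential")

### Convert to pairwise genetic distances (as plain numeric vectors) ###
### Named dist.raw rather than dist to avoid masking base R's dist() ###
dist.raw <- as.vector(dist.dna(dat, model = "raw"))
dist.corrected <- as.vector(dist.dna(dat, model = "TN93"))

### Make plot ###
plot(dist.raw ~ dist.corrected, pch = 20, col = "red", xlab = "TrN model distance", ylab = "Uncorrected genetic distance", main = "Saturation Plot")
abline(0, 1, lty = 2)  # dashed 1:1 line: expectation with no multiple substitutions

### Fit the regression once, draw it, and label the plot with its slope ###
fit <- lm(dist.raw ~ dist.corrected)
abline(fit, lwd = 3)
lm_coef <- coef(fit)
### Label position (0.1, 0.05) is in data units; adjust it to your axis ranges ###
text(0.1, 0.05, bquote(y == .(round(unname(lm_coef[2]), 3)) * x))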
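
If you want to try this without an alignment file of your own, here is a minimal sketch of the same steps run on woodmouse, the example alignment that ships with ape. The variable names (pairs, fit, slope) are my own choices, and dropping non-finite distances is a precaution I have added here, not something the code above does.

```r
library(ape)

# Example alignment bundled with ape (already a 'DNAbin' object)
data(woodmouse)

# Raw and TN93-corrected pairwise distances, side by side
pairs <- data.frame(
  raw  = as.vector(dist.dna(woodmouse, model = "raw")),
  tn93 = as.vector(dist.dna(woodmouse, model = "TN93"))
)

# Precaution (my addition): keep only pairs where both distances are finite
pairs <- pairs[is.finite(pairs$raw) & is.finite(pairs$tn93), ]

# Regress raw distance on corrected distance and pull out the slope
fit <- lm(raw ~ tn93, data = pairs)
slope <- unname(coef(fit)[2])

plot(raw ~ tn93, data = pairs, pch = 20, col = "red",
     xlab = "TrN model distance", ylab = "Uncorrected genetic distance",
     main = "Saturation Plot (woodmouse)")
abline(0, 1, lty = 2)  # 1:1 reference line
abline(fit, lwd = 3)   # fitted regression line
print(round(slope, 3))
```

Remember that the slope is only a rough guide; the shape of the point cloud relative to the dashed line is what you are really looking at.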

My NSF Graduate Research Fellowship Proposal

4/1/2015

1 Comment

Another round of NSF Graduate Research Fellowships was announced yesterday, and the handful of grad students who received awards (~12% of applicants) are probably still on cloud nine. If that's you, congratulations!

I applied for the Fellowship in 2013 and 2014. I received an Honorable Mention my first year, made some changes to my proposal, then reapplied and was funded the following year. I thought I would share my judges' comments and successful research proposal here for anyone thinking of applying or reapplying for the 2016 competition. Good luck!

2013 Judges' Comments (Not Successful)
2014 Judges' Comments (Successful)
2014 Research Proposal (Successful)
