Monday, August 18, 2014

Journal Club

Date                Name
04-Aug-14 Kerry
11-Aug-14 Masethabela
18-Aug-14 Connor
25-Aug-14 Bronwynne
01-Sep-14 Samantha
08-Sep-14 Michael
15-Sep-14 Arrie
22-Sep-14 Catherine
29-Sep-14 Sarita
06-Oct-14 Stephen
13-Oct-14 Paulette
20-Oct-14 Anton
27-Oct-14 Tessa
03-Nov-14 Carel
10-Nov-14 Ilkser
17-Nov-14 Kerry
24-Nov-14 Masethabela
01-Dec-14 Connor
08-Dec-14 Bronwynne

Wednesday, June 11, 2014

Publish your valuable scientific data

"Scientific Data" is a new journal from Nature Publishing Group that publishes descriptions of datasets, including genetic data, that are valuable to the scientific community. Because these descriptions are peer-reviewed, data published there should earn more scientific credit and reach a broader community. It should also become easier to reuse previously published data.

From the website of the journal (http://www.nature.com/sdata/): "Scientific Data is an open-access, peer-reviewed publication for descriptions of scientifically valuable datasets. Our primary article-type, the Data Descriptor, is designed to make your data more discoverable, interpretable and reusable."

Friday, May 16, 2014

A Road Map for Population Genomics


In Monday's discussion of Andrew et al. (2013) Road Map for Molecular Ecology we criticized this vision of the future. In particular, the sections on phylogeography, hybridization and speciation don't grapple with the essential problem of how genome data are changing our questions and analyses.

Here are some recent papers that do just that.

Ellegren (2012) "Genome sequencing and population genomics in non-model organisms" reviews methods for inferring historical demography, notably PSMC. This is a genome-wide analogue of skyline plots, based on the varying time to coalescence of different genome segments. It was developed for a single diploid genome (you need some heterozygosity to estimate coalescence time) and has now been extended to multiple individuals. It should be better resolved than one-to-few-gene skyline plots, especially in detecting the signal of multiple expansions and contractions, but not necessarily for recent events (which depend on sample size).
Sousa & Hey (2013) "Understanding the origin of species with genome-scale data: modelling gene flow" go straight to the heart of the matter. Much of the potential power of population genomics comes from recombination. The coalescent genealogical sampling methods that we currently use (BEAST, IMa...) are computationally intensive and can't handle recombination. Recent population genomic studies use summary statistics like Fst and D (single-SNP 4-taxon gene trees). These throw out most of the information in the data and can't distinguish among the processes that may have caused the observed patterns. Current methods are simply unable to consider the implications of many gene genealogies with varying degrees of linkage simultaneously, but there are some promising approaches that involve computational shortcuts (ABC, PAC) and Hidden Markov Model (HMM) methods that allow genealogies to change along the genome.
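To make the "single-SNP 4-taxon gene trees" remark concrete, here is a minimal sketch of how the D statistic (the ABBA-BABA test) is tallied, assuming a tree (((P1, P2), P3), Outgroup) and one sampled allele per taxon per biallelic site. The function names and the toy sites are made up for illustration and are not taken from either review. Each SNP is scored on its own, which is exactly why linkage information is thrown away.

```python
from typing import Iterable, Tuple

# One site = the allele sampled from (P1, P2, P3, Outgroup).
Site = Tuple[str, str, str, str]


def abba_baba_counts(sites: Iterable[Site]) -> Tuple[int, int]:
    """Count ABBA and BABA site patterns, treating the outgroup allele as ancestral (A)."""
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        if p3 == out:
            continue  # P3 carries the ancestral allele: site is uninformative
        if p1 == out and p2 == p3:
            abba += 1  # derived allele shared by P2 and P3
        elif p2 == out and p1 == p3:
            baba += 1  # derived allele shared by P1 and P3
    return abba, baba


def patterson_d(sites: Iterable[Site]) -> float:
    """D = (ABBA - BABA) / (ABBA + BABA); NaN when no informative sites exist."""
    abba, baba = abba_baba_counts(sites)
    total = abba + baba
    if total == 0:
        return float("nan")
    return (abba - baba) / total


# Hypothetical toy data: two ABBA sites, one BABA site, one uninformative site.
example_sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("C", "T", "T", "C"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "G", "A"),  # derived allele private to P3: ignored
]

print(abba_baba_counts(example_sites))
print(patterson_d(example_sites))
```

With no gene flow, incomplete lineage sorting should produce ABBA and BABA sites equally often, so D is expected to be near zero. A real analysis would also need a significance test (usually a block jackknife over the genome), which this sketch leaves out.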

Both of these reviews have something to say about detecting selection and the genes underlying phenotypic differences in wild populations - including GWAS, which was not mentioned in Andrew et al. (2013).

Apart from the emerging methods reviewed above, I think our road map needs to consider genome-wide inference of admixture. This year there have been some great new resources and approaches to studying individual origins and population histories using admixture. One of these is the Genetic Atlas of Human Admixture History (Hellenthal et al. 2014), with its wonderful web playground. Just a few weeks ago Elhaik et al. (2014, aka The Genographic Consortium) published the "Geographic Population Structure" algorithm (GPS... haha), which uses admixture analysis of Ancestry Informative Markers (AIMs) to infer the biogeographic origins of populations. Using 100 000 SNPs they were able to assign human individuals to geographic regions with astounding accuracy: to within 50 km, and often to the right village, for some European populations (such as Sardinia, which is old and structured). This is far better than what previous methods such as PCA achieved.

At the other end of the population structure spectrum, we need to know about the new Bayesian assignment methods for species delimitation (BPP), which may be particularly applicable to barcoding studies. As a parting shot I might add that DNA-based trophic ecology (i.e. CSI - POO) is at best a scenic byway off the freeway of ecological metagenomics, with its as yet unfulfilled promise of detecting rare species, measuring abundance and analysing whole species communities from non-invasive environmental samples. These curious absences in Andrew et al. (2013) are partly because the review aims at concepts, not methods. Given that our (potential) ability to generate data is way ahead of theory, we need more specific consideration of analyses.

The Harrison et al. (2014) review of evolutionary potential, which we didn't discuss, is an interesting essay on how we might use experimental analyses of fitness in model organisms to derive measures of adaptive and non-adaptive variation in the wild (giving evolutionary significance to ESUs). Unfortunately the only two organisms cited in their consideration of wild populations are Arabidopsis and Drosophila - I guess we are not quite there yet.

PSMC - Pairwise Sequentially Markovian Coalescent
HMM - Hidden Markov Models
Admixture
AIMs - ancestry informative markers
GPS - Geographic Population Structure
BPP - species assignment
Ecological Metagenomics
GWAS on non-model species

That's a long list of emerging techniques to leave off your road map of the future.

Tuesday, March 25, 2014

Lineage identification - Comparison of methods



Dear Meepers,

Find below the tree that we can use to delimit the lineages using the different methods available.

The phylogenetic tree represents different lineages of the mouse Micaelamys namaquensis. You can find the full article at http://www.biomedcentral.com/1471-2148/10/307


Friday, February 28, 2014

Date Name
17-Feb-14 Connor
24-Feb-14 Kerry
03-Mar-14 Anton
10-Mar-14 Tessa
17-Mar-14 Carel
24-Mar-14 Michael
31-Mar-14 Samantha
07-Apr-14 Ilkser
14-Apr-14 Public holiday
21-Apr-14 Public holiday
28-Apr-14 Paulette
05-May-14 Arrie
12-May-14 Catherine
19-May-14 Masethabela
26-May-14 Sarita
02-Jun-14 Stephen
09-Jun-14 Carel
16-Jun-14 Public holiday
23-Jun-14 Connor
30-Jun-14 Anton
07-Jul-14 Amanda
14-Jul-14 Tessa
21-Jul-14 Masethabela
28-Jul-14 Ilkser