2. Formatting BEAST Input Files

Created by Brian Moore

Introduction

BEAST reads input files written in xml (the Extensible Markup Language), a format similar to the more familiar html (HyperText Markup Language) used in web applications. This may seem an odd choice: xml tends to be quite verbose and can be a bit intimidating for users who are not familiar with it. On the other hand, the format affords great flexibility for specifying almost arbitrarily complex analyses. In any case, if we want to use BEAST, we need to gain some familiarity with this file format, which we hope to achieve through the following exercise.
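To make the format a little less intimidating, here is a minimal sketch in Python that parses a toy xml fragment with the standard library. The fragment is illustrative only: the element names (beast, taxa, taxon) follow the general shape of BEAST 1.x files, but the taxon ids are made up, and a real file generated by BEAUti contains many more blocks.

```python
import xml.etree.ElementTree as ET

# Toy fragment, NOT a usable BEAST file: the taxon ids are hypothetical.
TOY_XML = """
<beast>
    <taxa id="taxa">
        <taxon id="species_A"/>
        <taxon id="species_B"/>
        <taxon id="species_C"/>
    </taxa>
</beast>
"""


def list_taxa(xml_text):
    """Return the id attribute of every <taxon> element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [taxon.get("id") for taxon in root.iter("taxon")]


if __name__ == "__main__":
    print(list_taxa(TOY_XML))
```

The point is only that xml is a tree of nested, named elements carrying attributes, which is what makes it verbose but also flexible.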

There are two main steps to formatting an input file for divergence time estimation using BEAST: generating a base file with BEAUti, and then modifying it with a text editor. The second step is described in Part 3 of the tutorial.

Open the program BEAUti. This is a ‘helper’ application for BEAST that reads files in the more standard NEXUS format and generates (almost usable) xml files for basic analyses with BEAST.

Step 1: Getting a NEXUS file into BEAUti

From the File menu, select the ‘Import NEXUS’ option, and navigate to the directory that contains the file ‘Platanus_DTE.nex’.
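If the import fails, it can help to confirm that the file really is NEXUS before blaming BEAUti. The sketch below is a rough sanity check under simple assumptions: it only looks for the '#NEXUS' header and for 'BEGIN name;' block openers, and the toy alignment is made up. It is not a full NEXUS parser.

```python
import re


def looks_like_nexus(text):
    """True if the text starts with the '#NEXUS' header (case-insensitive)."""
    return text.lstrip().upper().startswith("#NEXUS")


def nexus_blocks(text):
    """Return the names of all 'BEGIN <name>;' blocks, lower-cased."""
    pattern = r"^\s*begin\s+(\w+)\s*;"
    names = re.findall(pattern, text, flags=re.IGNORECASE | re.MULTILINE)
    return [name.lower() for name in names]


# Toy alignment with hypothetical taxon names, for illustration only.
TOY_NEXUS = """#NEXUS
BEGIN DATA;
    DIMENSIONS NTAX=2 NCHAR=4;
    FORMAT DATATYPE=DNA MISSING=? GAP=-;
    MATRIX
    species_A ACGT
    species_B ACGA
    ;
END;
"""

if __name__ == "__main__":
    print(looks_like_nexus(TOY_NEXUS), nexus_blocks(TOY_NEXUS))
```

To check the tutorial file, read the contents of ‘Platanus_DTE.nex’ into a string and pass it to the same two functions.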

Step 2: The data window

The alignment will be displayed in the Data pane of the BEAUti window. We are estimating divergence times for a species phylogeny, so ensure that dates are specified as ‘years since some time in the past’.

Step 3: Defining groups of taxa

Move to the Taxa pane. This provides options for specifying one or more subsets of species in your data that you may wish simply to name and/or constrain to be monophyletic. We will first describe how to designate a set of taxa, which we will call 'clade_1', and how to enforce the monophyly of this clade using BEAUti.

Step 4: Model specification

Move to the Model pane. This provides options for specifying a model of nucleotide substitution, a model for accommodating among-site substitution-rate variation, and a model for accommodating variation in substitution rate across branches. Let’s imagine that we have previously performed a model selection analysis (e.g., by means of the AIC as implemented in ModelTest), and that this procedure has identified the general time reversible (GTR) model with Gamma-distributed rate variation across sites. Furthermore, let’s imagine that we have detected significant substitution-rate variation across branches in this data set (e.g., by means of the hLRT implemented in PAUP*), which suggests that these data do not conform to the molecular clock hypothesis. Accordingly, we want our model specification to reflect these findings: the GTR substitution model, Gamma-distributed rate variation among sites, and a relaxed (non-clock) model of rate variation across branches.
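As an aside on the model selection step imagined above, the AIC comparison itself is simple arithmetic: AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood, and the model with the lowest AIC is preferred. The sketch below is not ModelTest; the log-likelihood values are hypothetical, and the parameter counts cover the substitution model only (branch lengths are shared by all candidates and omitted).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical maximized log-likelihoods, paired with the number of free
# substitution-model parameters:
#   JC    : 0
#   HKY+G : kappa + 3 base frequencies + Gamma shape = 5
#   GTR+G : 5 relative rates + 3 base frequencies + Gamma shape = 9
CANDIDATES = {
    "JC": (-5250.0, 0),
    "HKY+G": (-5100.0, 5),
    "GTR+G": (-5080.0, 9),
}


def best_model(candidates):
    """Return the name of the candidate with the lowest AIC."""
    return min(candidates, key=lambda name: aic(*candidates[name]))


if __name__ == "__main__":
    for name, (lnl, k) in CANDIDATES.items():
        print(name, aic(lnl, k))
    print("preferred:", best_model(CANDIDATES))
```

With these made-up numbers GTR+G has the lowest AIC, which matches the outcome assumed in this step.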

Step 5: Prior specification

Move to the Priors pane. This provides options for specifying prior probability distributions for the tree topology and all of the other parameters in the nucleotide substitution model and relaxed clock model.

Step 6: Proposal mechanisms

Move to the Operators pane. This provides options for controlling aspects of the proposal mechanisms used to update parameter values during the MCMC sampling, including the magnitude of proposed changes to each of the parameters (the ‘tuning’ values) and the frequency with which attempts will be made to update each of the parameters (the ‘weight’ values). The default tuning and weight values should work fine for our data set. However, ensure that the ‘Auto Optimize’ box is checked, so that the tuning values will be adjusted automatically during the MCMC to improve the efficiency of parameter mixing.
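To get a feel for what ‘tuning’ and ‘weight’ mean, here is a toy sketch in Python. It is not BEAST's implementation: it runs a plain random-walk Metropolis sampler on a standard normal target, where the hypothetical `window` argument stands in for a tuning value, and it picks among hypothetical operator names in proportion to their weights.

```python
import math
import random


def acceptance_rate(window, n_steps=20000, seed=1):
    """Random-walk Metropolis on a standard normal target.

    `window` plays the role of a tuning value: each proposal is the current
    value plus a uniform draw from [-window/2, +window/2].
    """
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.uniform(-window / 2, window / 2)
        # Log density ratio of the target; the proposal is symmetric,
        # so no Hastings correction is needed.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
            accepted += 1
    return accepted / n_steps


def operator_counts(weights, n_draws=10000, seed=1):
    """Count how often each operator is chosen when drawn by weight."""
    rng = random.Random(seed)
    names = list(weights)
    draws = rng.choices(names, weights=[weights[n] for n in names], k=n_draws)
    return {name: draws.count(name) for name in names}


if __name__ == "__main__":
    for window in (0.5, 2.5, 10.0):
        print("window", window, "acceptance", acceptance_rate(window))
    # Hypothetical operator names and weights, for illustration only.
    print(operator_counts({"rate_move": 1, "tree_move": 3}))
```

Small windows are accepted often but move slowly, while large windows are rejected often; automatic optimization adjusts the tuning value during the run to find a workable middle ground.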

Step 7: MCMC

Move to the MCMC pane. This provides options for controlling aspects of the MCMC sampling used to approximate the joint posterior probability density of model parameters, phylogeny and divergence times.

Step 8: Generate the xml file!

Click the ‘Generate BEAST file’ button in the lower right-hand corner of the BEAUti window, and save the generated file as ‘Platanus_DTE.xml’. Congratulations, you are halfway there! Now we need to make some manual modifications to our newly generated 'base' xml file, which we will describe in the next part of the tutorial.
