Run Constant Age or Size Bootstrapping — curv

This function bootstraps the curvature-age or curvature-size analyses while holding the clade size or clade age constant, respectively. It requires the packages 'ape', 'ggplot2', and 'wesanderson'. The packages required by the 'caribmacro' functions com_matrix(), SR_geo(), and sr_LM() are also required.

curv_boot(
  occurrences,
  real.dat,
  node.mat,
  tree.gen,
  lineage,
  clade,
  species_names,
  con_age = FALSE,
  con_size = FALSE,
  sp.num = NULL,
  age = NULL,
  runs = 1000,
  keep.gen.sp = TRUE,
  make_plot = FALSE,
  just.clades = FALSE,
  bank.data = NULL,
  status = NULL,
  Area = NULL,
  geo_group = NULL,
  stat_levels = NULL
)

Arguments

occurrences: A data frame that holds the species occurrence data, with at least 4 factor columns that hold the geographic groups, taxonomic groups, species statuses, and species binomial names.
real.dat: A data frame that has the observed curvatures for each phylogenetic clade. The columns should be 'Clade' for the clade names, 'Age' for the clade ages, 'SR' for the clade sizes, 'ABC.t' for the curvature of the total assemblage, and 'ABC.n' for the curvature of the native assemblage.
node.mat: A data frame that has the presence/absence of each phylogenetic clade. The first column of this data frame should hold the clade names and be named 'Clade'.
tree.gen: A phylogenetic tree with genera as the tips.
lineage: A vector of characters that equal the clade names of the evolutionary lineage of interest.
clade: A character equal to the name of the smallest clade in the lineage.
species_names: A character equal to the column name in 'occurrences' in which the species binomial names are stored.
con_age: Logical. If TRUE, constant age bootstrapping is performed. Default is con_age = FALSE.
con_size: Logical. If TRUE, constant size bootstrapping is performed. Default is con_size = FALSE.
sp.num: Numeric. If con_size = TRUE, 'sp.num' should be a single numeric equal to the clade size to be held constant for the constant size bootstrapping.
If con_age = TRUE, 'sp.num' should be a vector of varying clade sizes (e.g., seq(<size of smallest clade>, <size of clade at age>, (<size of clade at age> - <size of smallest clade>)/5)).
Default is sp.num = NULL.
age: Numeric. The age of a clade along the lineage that will be held constant for the constant age bootstrapping. Required if con_age = TRUE. Default is age = NULL.
runs: Numeric. The number of random clades to be made for each age or clade size value. Default is runs = 1000.
keep.gen.sp: Logical. If TRUE, species records that are identified only to genus are retained in the species-area curve analysis. Default is keep.gen.sp = TRUE.
make_plot: Logical. If TRUE, a list is returned with the ggplot object saved in the third element, named 'Plot'. Default is make_plot = FALSE.
just.clades: Logical. If TRUE, only a data frame is returned, holding the presence/absence of species in each of the randomly made clades. Default is just.clades = FALSE.
bank.data: A data frame that holds the explanatory variables for each geographic feature of interest, specifically their area. Default is bank.data = NULL.
status: A character equal to the column name in 'occurrences' in which the species' statuses are stored. Default is status = NULL.
Area: A character equal to the column name in 'bank.data' in which the area of the geographic feature is stored. Default is Area = NULL.
geo_group: A character equal to the column name in 'bank.data' _AND_ 'occurrences' in which the geographic feature names are stored. Default is geo_group = NULL.
stat_levels: A character or vector equal to the levels of interest in the species' status column of 'occurrences'. If stat_levels = NULL (the default), all of the levels in the species' status column are used.
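
As a rough illustration of the 'sp.num' pattern described above for con_age = TRUE, the sketch below builds six evenly spaced clade sizes. The two size values are hypothetical placeholders, not values taken from the package data; substitute the sizes of your own clades.

```r
# Hypothetical clade sizes; replace with values from your own data.
smallest_size <- 240  # size of the smallest clade in the lineage (assumed)
size_at_age <- 600    # size of the clade at the age held constant (assumed)

# Six evenly spaced clade sizes, from the smallest clade up to the clade at 'age'.
sp.num <- seq(smallest_size, size_at_age, by = (size_at_age - smallest_size) / 5)
print(sp.num)
```

Dividing the size range by 5 gives five equal steps and therefore six values; seq(smallest_size, size_at_age, length.out = 6) produces the same vector.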

Value

If just.clades = TRUE, a data frame with the presence/absence of species in each of the randomly made clades. If just.clades = FALSE, a list of length 2 if make_plot = FALSE or of length 3 if make_plot = TRUE. The elements of this list are 'Matrix' (the presence/absence of species in each clade), 'Curvature' (a data frame with the curvature of the species-area curves for each clade and the clade ages and sizes), and, if make_plot = TRUE, 'Plot' (a ggplot object of the results).
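
Because the shape of the returned list depends on make_plot, downstream code may want to check for the 'Plot' element before using it. The sketch below uses a hand-built stand-in list rather than a real curv_boot() result; the element names follow the description above, but the columns and values inside each data frame are illustrative assumptions only.

```r
# Stand-in for a curv_boot() result with make_plot = FALSE.
# The columns and values inside each element are assumptions for illustration.
res <- list(
  Matrix = data.frame(Species = c('sp1', 'sp2'), Clade_1 = c(1, 0)),
  Curvature = data.frame(Clade = 'Clade_1', Age = 201, SR = 240)
)

# 'Plot' is present only when make_plot = TRUE, so check before using it.
has_plot <- 'Plot' %in% names(res)
curv <- res$Curvature
if (has_plot) print(res$Plot)
```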

Examples

if (FALSE) {
if (FALSE) {

# Load the observed curvature data, the clade presence/absence matrix,
# the genus-level tree, and the occurrence records.
node.dat <- read.csv(file.path(here::here(), 'data_out', 'results', 'sar_lin', 'supp_info', 'Node_ABC-Age_Data.csv'), header = TRUE)
nod.mat <- read.csv(file.path(here::here(), 'data_out', 'supp_info', 'IBT_Node_Clade_Data.csv'), header = TRUE)
tree.gen <- ape::read.tree(file.path(here::here(), 'data_raw', 'trees', 'Tetrapoda_genus.nwk'))
herp <- read.csv(file.path(here::here(), 'data_raw', 'IBT_Herp_Records_final.csv'), header = TRUE)

# Recode the bank-level statuses into three classes: 'E', 'N', and 'X' (everything else).
herp$stat_new <- 'X'
herp$stat_new[herp$bnk_status %in% c('E', 'U')] <- 'E'
herp$stat_new[herp$bnk_status %in% c('FE', 'N', 'PX')] <- 'N'

# Subset to the lineage of interest ('Anolis', matching the 'clade' argument below).
anol <- node.dat[which(node.dat$Lineage == 'Anolis'), ]
lin <- anol$Clade[order(anol$Clade, decreasing = TRUE)]

res <- curv_boot(occurrences = herp,
                 real.dat = node.dat,
                 node.mat = nod.mat,
                 tree.gen = tree.gen,
                 lineage = lin,
                 clade = 'Anolis',
                 species_names = 'binomial',
                 con_age = TRUE,
                 sp.num = seq(240, 600, 40),
                 age = 201,
                 runs = 1000,
                 make_plot = TRUE,
                 bank.data = bank_dat,  # 'bank_dat' is not created in this example; load it beforehand
                 status = 'stat_new',
                 Area = 'Area',
                 geo_group = 'bank',
                 stat_levels = c("N", "E"))

}
}