Import

Data is imported with the phyloseq package. MicrobiomeR was developed and tested with data from Nephele’s 16S Qiime pipeline, which influences the workflow in these vignettes. While any phyloseq object can be used with MicrobiomeR, we suggest using the function MicrobiomeR::create_phyloseq(). Here we demonstrate using it to import the Nephele data, which includes a biom file, a phylogenetic tree file, and a metadata file (the Nephele mapping file).

(Note: Test data can be found in a private environment variable accessed with MicrobiomeR:::pkg.private.)

# Get the data files from package
input_files <- pkg.data$input_files
biom_file <- input_files$biom_files$silva # Path to silva biom file
tree_file <- input_files$tree_files$silva # Path to silva tree file
metadata_file <- input_files$metadata$two_groups # Path to Nephele metadata
parse_func <- parse_taxonomy_silva_128 # A custom phyloseq parsing function for silva annotations

# Get the phyloseq object
phyloseq_object <- create_phyloseq(
  biom_file = biom_file,
  tree_file = tree_file,
  metadata_file = metadata_file,
  parse_func = parse_func
)
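
Since any phyloseq object can be used, the same three files could also be imported with phyloseq's own functions. The following is a minimal sketch under that assumption, not a description of what create_phyloseq() does internally. The wrapper name import_with_phyloseq is hypothetical; import_biom(), import_qiime_sample_data(), and merge_phyloseq() come from phyloseq.

```r
# Hypothetical wrapper (not part of MicrobiomeR): build a phyloseq object
# directly with phyloseq's importers.
import_with_phyloseq <- function(biom_file, tree_file, metadata_file, parse_func) {
  if (!requireNamespace("phyloseq", quietly = TRUE)) {
    stop("The phyloseq package is required for this sketch.")
  }
  # Fail early with a readable message if any input path is wrong
  for (f in c(biom_file, tree_file, metadata_file)) {
    if (!file.exists(f)) stop("File not found: ", f)
  }
  # Read the OTU table, taxonomy, and tree in one call
  biom_data <- phyloseq::import_biom(
    BIOMfilename = biom_file,
    treefilename = tree_file,
    parseFunction = parse_func
  )
  # Read the QIIME-style mapping file and attach it as sample data
  sample_metadata <- phyloseq::import_qiime_sample_data(metadata_file)
  phyloseq::merge_phyloseq(biom_data, sample_metadata)
}
```

Called with the same biom_file, tree_file, metadata_file, and parse_func values defined above, this should yield a phyloseq object that can be passed to as_MicrobiomeR_format().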

Formatting

After importing amplicon data with phyloseq, you will need to convert the phyloseq object to a taxa::taxmap object for use with the metacoder package. MicrobiomeR adds its own “formatting” to the taxmap object in the form of specific table names. For more information, please consult the Formatting help page in your R console via ?MicrobiomeR::MicrobiomeR_Formats.

p_obj <- phyloseq_object
# Create the various formats
phy_format <- as_MicrobiomeR_format(obj = p_obj, format = "phyloseq_format")
raw_format <- as_MicrobiomeR_format(obj = p_obj, format = "raw_format")
basic_format <- as_MicrobiomeR_format(obj = p_obj, format = "basic_format")
analyzed_format <- as_MicrobiomeR_format(obj = p_obj, format = "analyzed_format")

The various formats that have been defined include the “phyloseq”, “raw”, “basic”, and “analyzed” formats. These formats define the level or stage that the taxmap object is in. The only difference between MicrobiomeR taxmap objects and regular taxmap objects is the naming convention used for the observation data (e.g. names(MicrobiomeR_obj$data)).

# Show the difference in MicrobiomeR formats
print(names(phy_format$data))
#> [1] "sample_data" "phy_tree"    "otu_table"   "tax_data"

print(names(raw_format$data))
#> [1] "otu_abundance"   "otu_annotations" "sample_data"     "phy_tree"

print(names(basic_format$data))
#> [1] "otu_abundance"    "otu_annotations"  "otu_proportions" 
#> [4] "sample_data"      "phy_tree"         "taxa_abundance"  
#> [7] "taxa_proportions"

print(names(analyzed_format$data))
#> [1] "otu_abundance"    "otu_annotations"  "otu_proportions" 
#> [4] "sample_data"      "phy_tree"         "taxa_abundance"  
#> [7] "taxa_proportions" "statistical_data" "stats_tax_data"
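
Because the formats differ only in which tables are present, a format can in principle be recognized from the table names alone. The sketch below illustrates that idea in base R using the names printed above. The helper guess_format and the format_tables list are hypothetical and are not how which_format() is necessarily implemented.

```r
# Table names taken from the printed output above
raw_tables <- c("otu_abundance", "otu_annotations", "sample_data", "phy_tree")
basic_tables <- c(raw_tables, "otu_proportions", "taxa_abundance", "taxa_proportions")
analyzed_tables <- c(basic_tables, "statistical_data", "stats_tax_data")

# Formats listed from the lowest stage to the highest
format_tables <- list(
  phyloseq_format = c("sample_data", "phy_tree", "otu_table", "tax_data"),
  raw_format = raw_tables,
  basic_format = basic_tables,
  analyzed_format = analyzed_tables
)

# Hypothetical helper: return the highest format whose tables are all present
guess_format <- function(table_names) {
  matches <- vapply(
    format_tables,
    function(required) all(required %in% table_names),
    logical(1)
  )
  if (!any(matches)) {
    return(NA_character_)
  }
  names(format_tables)[max(which(matches))]
}

guess_format(raw_tables)
guess_format(analyzed_tables)
```

With a real object, the input would be names(obj$data).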

MicrobiomeR provides several helper functions for formatting taxmap objects, including:

which_format() for returning the format of the taxmap object.
is_*_format() for testing if an object is in a specific format.
as_*_format() for converting an object to a specific format.
order_metacoder_data() for ordering the observation data for analysis.
validate_MicrobiomeR_format() for validating that the taxmap object is in a specific format.

# Determine format
which_format(analyzed_format)
#> [1] "analyzed_format"

is_analyzed_format(analyzed_format)
#> [1] TRUE

# Forcing validation
low_level_format <- validate_MicrobiomeR_format(
  obj = raw_format,
  valid_formats = c("analyzed_format", "basic_format"),
  force_format = TRUE,
  min_or_max = min
)
which_format(low_level_format)
#> [1] "basic_format"

high_level_format <- validate_MicrobiomeR_format(
  obj = raw_format,
  valid_formats = c("analyzed_format", "basic_format"),
  force_format = TRUE,
  min_or_max = max
)
which_format(high_level_format)
#> [1] "analyzed_format"
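
The effect of min_or_max in the forced validation above can be pictured as choosing the lowest or highest stage among the valid formats. The following base R sketch illustrates that selection only. The numeric levels and the helper pick_target_format are assumptions made for illustration and are not taken from the MicrobiomeR source.

```r
# Assumed ordering of the formats, from earliest stage to latest
format_levels <- c(
  phyloseq_format = 0,
  raw_format = 1,
  basic_format = 2,
  analyzed_format = 3
)

# Hypothetical helper: choose the target format among the valid ones
pick_target_format <- function(valid_formats, min_or_max = max) {
  unknown <- setdiff(valid_formats, names(format_levels))
  if (length(unknown) > 0) {
    stop("Unknown format(s): ", paste(unknown, collapse = ", "))
  }
  ranks <- format_levels[valid_formats]
  names(ranks)[ranks == min_or_max(ranks)]
}

pick_target_format(c("analyzed_format", "basic_format"), min_or_max = min)
pick_target_format(c("analyzed_format", "basic_format"), min_or_max = max)
```

Under these assumptions the two calls select "basic_format" and "analyzed_format" respectively, matching the which_format() results shown above.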

However, most of the time you will simply use as_MicrobiomeR_format(format=...) to get an object into the proper format.

(Please note that MicrobiomeR expects the data to be imported with phyloseq and then converted to a taxmap object.)

Data Wrangling

Robert Gilmore

2019-04-01
