vignettes/align_batch.Rmd
Sometimes you have two or more batches of peak tables that were processed
at different times, so they need to be aligned with each other before any
further analysis. In masscleaner, the align_batch() function does this.
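To make the idea of alignment concrete, the following is a minimal base R sketch of tolerance-based feature matching between two batches. It is not masscleaner's implementation: the toy peak tables, the tolerance names (mz_tol_ppm, rt_tol) and their values are all assumptions chosen for illustration.

```r
# Toy peak tables for two batches (hypothetical values, for illustration only)
batch1 <- data.frame(
  variable_id = c("A1", "A2", "A3"),
  mz = c(100.0757, 150.1000, 200.2000),
  rt = c(151.0, 300.0, 450.0)
)
batch2 <- data.frame(
  variable_id = c("B1", "B2", "B3"),
  mz = c(100.0758, 150.2000, 200.2001),
  rt = c(151.6, 300.0, 520.0)
)

# Hypothetical tolerances: 10 ppm on m/z and 30 seconds on retention time
mz_tol_ppm <- 10
rt_tol <- 30

# Compare every feature in batch1 with every feature in batch2
pairs <- expand.grid(i = seq_len(nrow(batch1)), j = seq_len(nrow(batch2)))
pairs$mz_error <- abs(batch1$mz[pairs$i] - batch2$mz[pairs$j]) /
  batch2$mz[pairs$j] * 1e6
pairs$rt_error <- abs(batch1$rt[pairs$i] - batch2$rt[pairs$j])

# Keep only the pairs that fall within both tolerances
matched <- pairs[pairs$mz_error <= mz_tol_ppm & pairs$rt_error <= rt_tol, ]
matched$variable_id1 <- batch1$variable_id[matched$i]
matched$variable_id2 <- batch2$variable_id[matched$j]
matched[, c("variable_id1", "variable_id2", "mz_error", "rt_error")]
```

In this toy example only A1 and B1 are paired: A2 and B2 differ too much in m/z, and A3 and B3 differ too much in retention time. A real aligner also has to resolve cases where one feature has several candidates, which this sketch does not attempt.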
See the massdataset package for how to organize your metabolomics data into two mass_dataset objects.
Here we use the demo data from the demodata package, so please install it first.
if (!require(devtools)) {
  install.packages("devtools")
}
devtools::install_github("tidymass/demodata")
library(masscleaner)
library(demodata)
library(tidyverse)
align_batch() function
data(object1, package = "demodata")
data(object2, package = "demodata")
object1
#> --------------------
#> massdataset version: 0.01
#> --------------------
#> 1.expression_data:[ 500 x 217 data.frame]
#> 2.sample_info:[ 217 x 5 data.frame]
#> 3.variable_info:[ 500 x 3 data.frame]
#> 4.sample_info_note:[ 5 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> --------------------
#> Processing information (extract_process_info())
#> Creation ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2021-12-23 00:21:08
object2
#> --------------------
#> massdataset version: 0.01
#> --------------------
#> 1.expression_data:[ 500 x 217 data.frame]
#> 2.sample_info:[ 217 x 5 data.frame]
#> 3.variable_info:[ 500 x 3 data.frame]
#> 4.sample_info_note:[ 5 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> --------------------
#> Processing information (extract_process_info())
#> Creation ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2021-12-23 00:21:09
x = object1
y = object2
match_result =
align_batch(x = object1, y = object2, return_index = TRUE)
head(match_result)
#> variable_id1 variable_id2 Index1 Index2 mz1 mz2 mz.error
#> 1 M72T38 M72T57 3 2 72.08070 72.08080 1.2828676
#> 2 M86T95 M86T93 6 4 86.09642 86.09640 0.2352014
#> 3 M86T75 M86T75 7 5 86.09649 86.09654 0.5543780
#> 4 M90T649_1 M90T660 8 7 89.50705 89.50695 1.0972320
#> 5 M100T151 M100T152 10 8 100.07569 100.07576 0.6575023
#> 6 M104T31 M104T31 12 9 104.10723 104.10739 1.4754018
#> rt1 rt2 rt.error int1 int2 int.error
#> 1 37.7015 56.5210 18.8195 6.458591 6.411066 0.047524752
#> 2 94.9910 92.5885 2.4025 6.881627 6.873169 0.008457544
#> 3 74.5230 74.6850 0.1620 6.461618 6.417154 0.044464658
#> 4 648.8800 659.5660 10.6860 6.191982 6.216757 0.024775041
#> 5 151.0180 151.6195 0.6015 6.890064 6.744205 0.145858938
#> 6 30.8090 30.9210 0.1120 7.105963 6.977168 0.128795191
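The error columns in this table can be checked against the paired values. The sketch below copies the six printed rows into a plain data frame and recomputes the errors in base R. Judging from these rows only, rt.error matches the absolute retention time difference and mz.error is consistent with a ppm difference relative to mz2 (up to rounding of the displayed m/z values); this is an inference from the output, not a statement about how masscleaner computes them. The rt_cutoff value is a hypothetical threshold for illustration.

```r
# First six matches, copied from the head(match_result) output above
m <- data.frame(
  mz1 = c(72.08070, 86.09642, 86.09649, 89.50705, 100.07569, 104.10723),
  mz2 = c(72.08080, 86.09640, 86.09654, 89.50695, 100.07576, 104.10739),
  mz.error = c(1.2828676, 0.2352014, 0.5543780, 1.0972320, 0.6575023, 1.4754018),
  rt1 = c(37.7015, 94.9910, 74.5230, 648.8800, 151.0180, 30.8090),
  rt2 = c(56.5210, 92.5885, 74.6850, 659.5660, 151.6195, 30.9210),
  rt.error = c(18.8195, 2.4025, 0.1620, 10.6860, 0.6015, 0.1120)
)

# Recompute the errors from the paired values
rt_check <- abs(m$rt1 - m$rt2)
mz_check_ppm <- abs(m$mz1 - m$mz2) / m$mz2 * 1e6

# Largest disagreement between the recomputed and the printed errors
max(abs(rt_check - m$rt.error))
max(abs(mz_check_ppm - m$mz.error))

# Hypothetical cutoff: keep only matches whose retention times agree closely
rt_cutoff <- 5
close_matches <- m[m$rt.error <= rt_cutoff, ]
nrow(close_matches)
```

Inspecting the table this way before merging is a cheap sanity check: rows with a large rt.error, such as the first one here, are the matches most worth reviewing by hand.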
new_object =
align_batch(x = object1, y = object2, return_index = FALSE)
new_object
#> --------------------
#> massdataset version: 0.99.1
#> --------------------
#> 1.expression_data:[ 354 x 434 data.frame]
#> 2.sample_info:[ 434 x 5 data.frame]
#> 3.variable_info:[ 354 x 3 data.frame]
#> 4.sample_info_note:[ 5 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> --------------------
#> Processing information (extract_process_info())
#> Creation ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2021-12-23 00:21:08
#> subset ----------
#> Package Function.used Time
#> 1 massdataset [ 2022-01-14 10:10:28
#> Creation ----------
#> Package Function.used Time
#> 1 massdataset create_mass_dataset() 2021-12-23 00:21:09
#> subset ----------
#> Package Function.used Time
#> 1 massdataset [ 2022-01-14 10:10:28
sessionInfo()
#> R Under development (unstable) (2022-01-11 r81473)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur/Monterey 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
#> [5] readr_2.1.1 tidyr_1.1.4 tibble_3.1.6 ggplot2_3.3.5
#> [9] tidyverse_1.3.1 demodata_0.0.1 magrittr_2.0.1 tinytools_0.9.1
#> [13] massdataset_0.99.1 masscleaner_0.9.2
#>
#> loaded via a namespace (and not attached):
#> [1] colorspace_2.0-2 rjson_0.2.21 ellipsis_0.3.2
#> [4] class_7.3-20 leaflet_2.0.4.1 rprojroot_2.0.2
#> [7] circlize_0.4.14 GlobalOptions_0.1.2 fs_1.5.2
#> [10] clue_0.3-60 rstudioapi_0.13 proxy_0.4-26
#> [13] ggrepel_0.9.1 lubridate_1.8.0 fansi_1.0.0
#> [16] mvtnorm_1.1-2 xml2_1.3.3 codetools_0.2-18
#> [19] doParallel_1.0.16 cachem_1.0.6 impute_1.69.0
#> [22] robustbase_0.93-8 knitr_1.37 itertools_0.1-3
#> [25] jsonlite_1.7.2 broom_0.7.11 dbplyr_2.1.1
#> [28] cluster_2.1.2 png_0.1-7 missForest_1.4
#> [31] rrcov_1.6-0 compiler_4.2.0 httr_1.4.2
#> [34] backports_1.4.1 assertthat_0.2.1 fastmap_1.1.0
#> [37] lazyeval_0.2.2 cli_3.1.0 htmltools_0.5.2
#> [40] tools_4.2.0 gtable_0.3.0 glue_1.6.0
#> [43] Rcpp_1.0.7 Biobase_2.55.0 cellranger_1.1.0
#> [46] jquerylib_0.1.4 pkgdown_2.0.1 vctrs_0.3.8
#> [49] iterators_1.0.13 crosstalk_1.2.0 xfun_0.29
#> [52] rvest_1.0.2 openxlsx_4.2.5 lifecycle_1.0.1
#> [55] DEoptimR_1.0-10 MASS_7.3-55 scales_1.1.1
#> [58] ragg_1.2.1 pcaMethods_1.87.0 clisymbols_1.2.0
#> [61] hms_1.1.1 parallel_4.2.0 RColorBrewer_1.1-2
#> [64] ComplexHeatmap_2.11.0 yaml_2.2.1 memoise_2.0.1
#> [67] pbapply_1.5-0 yulab.utils_0.0.4 sass_0.4.0
#> [70] stringi_1.7.6 S4Vectors_0.33.10 desc_1.4.0
#> [73] pcaPP_1.9-74 foreach_1.5.1 randomForest_4.6-14
#> [76] e1071_1.7-9 BiocGenerics_0.41.2 zip_2.2.0
#> [79] BiocParallel_1.29.10 shape_1.4.6 rlang_0.4.12
#> [82] pkgconfig_2.0.3 systemfonts_1.0.3 matrixStats_0.61.0
#> [85] evaluate_0.14 lattice_0.20-45 patchwork_1.1.1
#> [88] htmlwidgets_1.5.4 tidyselect_1.1.1 robust_0.6-1
#> [91] ggsci_2.9 plyr_1.8.6 R6_2.5.1
#> [94] IRanges_2.29.1 snow_0.4-4 generics_0.1.1
#> [97] fit.models_0.64 DBI_1.1.2 withr_2.4.3
#> [100] haven_2.4.3 pillar_1.6.4 modelr_0.1.8
#> [103] crayon_1.4.2 utf8_1.2.2 plotly_4.10.0
#> [106] tzdb_0.2.0 rmarkdown_2.11 GetoptLong_1.0.5
#> [109] grid_4.2.0 readxl_1.3.1 data.table_1.14.2
#> [112] reprex_2.0.1 digest_0.6.29 gridGraphics_0.5-1
#> [115] textshaping_0.3.6 stats4_4.2.0 munsell_0.5.0
#> [118] viridisLite_0.4.0 ggplotify_0.1.0 bslib_0.3.1