vignettes/missing_value_imputation.Rmd
We can use masscleaner for missing value (MV) imputation.
First, we need to prepare the samples for masscleaner.
Load the data saved in the previous step.
library(massdataset)
library(masscleaner)
load("peak_tables/POS/object")
get_mv_number(object)
#> [1] 0
head(massdataset::get_mv_number(object, by = "sample"))
#> sample_QC_01 sample_01 sample_02 sample_06 sample_07 sample_11
#> 0 0 0 0 0 0
head(massdataset::get_mv_number(object, by = "variable"))
#> M70T53_POS M70T527_POS M71T775_POS M71T669_POS M71T715_POS M71T54_POS
#> 0 0 0 0 0 0
head(massdataset::get_mv_number(object, by = "sample", show_by = "percentage"))
#> sample_QC_01 sample_01 sample_02 sample_06 sample_07 sample_11
#> 0 0 0 0 0 0
head(massdataset::get_mv_number(object, by = "variable", show_by = "percentage"))
#> M70T53_POS M70T527_POS M71T775_POS M71T669_POS M71T715_POS M71T54_POS
#> 0 0 0 0 0 0
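Because this dataset has no missing values, every count above is zero. As a rough illustration of what such counts mean, the sketch below tallies NA entries in a small made-up matrix using base R only. It assumes variables are in rows and samples in columns; the matrix `x` and all of its values are hypothetical, and this is not the implementation of get_mv_number(). The last line computes the missing share as a fraction; whether show_by = "percentage" reports a fraction or a 0 to 100 value is not confirmed here.

```r
# Toy intensity matrix (made up): rows are variables, columns are samples.
x <- matrix(
  c(10, NA, 12,
    NA, NA,  7,
     3,  4,  5),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("feature_1", "feature_2", "feature_3"),
    c("sample_01", "sample_02", "sample_03")
  )
)

# Total number of missing values in the whole table.
total_mv <- sum(is.na(x))

# Missing values per sample (column) and per variable (row).
mv_per_sample <- colSums(is.na(x))
mv_per_variable <- rowSums(is.na(x))

# Share of missing values per sample, as a fraction between 0 and 1.
mv_fraction_per_sample <- colMeans(is.na(x))

print(total_mv)
print(mv_per_sample)
print(mv_per_variable)
```

Counting per sample helps spot failed injections, while counting per variable helps spot features that are rarely detected.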
# Replace every missing value with zero
object_zero <- impute_mv(object = object, method = "zero")
get_mv_number(object_zero)
#> [1] 0
# Impute missing values with k-nearest neighbors (knn); this overwrites object
object <- impute_mv(object = object, method = "knn")
get_mv_number(object)
#> [1] 0
More methods are listed in the documentation; see ?impute_mv.
If the dataset contains blank samples, impute their missing values separately from those of the other samples.
For blank samples, use zero.
For non-blank samples, use knn or another method.
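The split between blank and non-blank samples can be sketched in base R on a plain matrix. This is a minimal illustration under stated assumptions, not the masscleaner workflow: the matrix `expression_data`, the vector `sample_class`, and the helper `impute_by_class()` are all hypothetical, and the non-blank step fills gaps with each feature's minimum observed value as a simple stand-in for knn.

```r
# Toy intensity matrix (made up): rows are variables, columns are samples.
expression_data <- matrix(
  c(NA, 120, 100, 110,
     5,  NA, 210, 190,
    NA, 300,  NA, 310),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("feature_1", "feature_2", "feature_3"),
    c("blank_01", "sample_01", "sample_02", "sample_03")
  )
)

# Hypothetical class label for each column, in column order.
sample_class <- c("Blank", "Subject", "Subject", "Subject")

# Hypothetical helper: zero for blanks, feature minimum for everything else.
impute_by_class <- function(x, sample_class) {
  is_blank <- sample_class == "Blank"

  # Blank samples: a missing peak is treated as absent, so fill with zero.
  blank_part <- x[, is_blank, drop = FALSE]
  blank_part[is.na(blank_part)] <- 0

  # Non-blank samples: fill each feature's gaps with its minimum observed
  # value among non-blank samples (a stand-in for knn, not knn itself).
  non_blank_part <- x[, !is_blank, drop = FALSE]
  for (i in seq_len(nrow(non_blank_part))) {
    row_values <- non_blank_part[i, ]
    if (anyNA(row_values) && !all(is.na(row_values))) {
      row_values[is.na(row_values)] <- min(row_values, na.rm = TRUE)
      non_blank_part[i, ] <- row_values
    }
  }

  # Put both parts back in the original column order.
  out <- x
  out[, is_blank] <- blank_part
  out[, !is_blank] <- non_blank_part
  out
}

imputed <- impute_by_class(expression_data, sample_class)
print(imputed)
```

Imputing blanks separately keeps their zeros from pulling down the values estimated for real samples, which matters for any method that borrows information across samples.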
Save the data for the next analysis step.
save(object, file = "peak_tables/POS/object")
sessionInfo()
#> R Under development (unstable) (2022-01-11 r81473)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur/Monterey 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
#> [5] readr_2.1.1 tidyr_1.1.4 tibble_3.1.6 ggplot2_3.3.5
#> [9] tidyverse_1.3.1 magrittr_2.0.1 tinytools_0.9.1 massdataset_0.99.1
#> [13] masscleaner_0.9.2
#>
#> loaded via a namespace (and not attached):
#> [1] colorspace_2.0-2 rjson_0.2.21 ellipsis_0.3.2
#> [4] class_7.3-20 leaflet_2.0.4.1 rprojroot_2.0.2
#> [7] circlize_0.4.14 GlobalOptions_0.1.2 fs_1.5.2
#> [10] clue_0.3-60 rstudioapi_0.13 proxy_0.4-26
#> [13] ggrepel_0.9.1 lubridate_1.8.0 fansi_1.0.0
#> [16] mvtnorm_1.1-2 xml2_1.3.3 codetools_0.2-18
#> [19] doParallel_1.0.16 cachem_1.0.6 impute_1.69.0
#> [22] robustbase_0.93-8 knitr_1.37 itertools_0.1-3
#> [25] jsonlite_1.7.2 broom_0.7.11 dbplyr_2.1.1
#> [28] cluster_2.1.2 png_0.1-7 missForest_1.4
#> [31] rrcov_1.6-0 compiler_4.2.0 httr_1.4.2
#> [34] backports_1.4.1 assertthat_0.2.1 fastmap_1.1.0
#> [37] lazyeval_0.2.2 cli_3.1.0 htmltools_0.5.2
#> [40] tools_4.2.0 gtable_0.3.0 glue_1.6.0
#> [43] Rcpp_1.0.7 Biobase_2.55.0 cellranger_1.1.0
#> [46] jquerylib_0.1.4 pkgdown_2.0.1 vctrs_0.3.8
#> [49] iterators_1.0.13 crosstalk_1.2.0 xfun_0.29
#> [52] rvest_1.0.2 openxlsx_4.2.5 lifecycle_1.0.1
#> [55] DEoptimR_1.0-10 MASS_7.3-55 scales_1.1.1
#> [58] ragg_1.2.1 pcaMethods_1.87.0 clisymbols_1.2.0
#> [61] hms_1.1.1 parallel_4.2.0 RColorBrewer_1.1-2
#> [64] ComplexHeatmap_2.11.0 yaml_2.2.1 memoise_2.0.1
#> [67] pbapply_1.5-0 yulab.utils_0.0.4 sass_0.4.0
#> [70] stringi_1.7.6 S4Vectors_0.33.10 desc_1.4.0
#> [73] pcaPP_1.9-74 foreach_1.5.1 randomForest_4.6-14
#> [76] e1071_1.7-9 BiocGenerics_0.41.2 zip_2.2.0
#> [79] BiocParallel_1.29.10 shape_1.4.6 rlang_0.4.12
#> [82] pkgconfig_2.0.3 systemfonts_1.0.3 matrixStats_0.61.0
#> [85] evaluate_0.14 lattice_0.20-45 patchwork_1.1.1
#> [88] htmlwidgets_1.5.4 tidyselect_1.1.1 robust_0.6-1
#> [91] ggsci_2.9 plyr_1.8.6 R6_2.5.1
#> [94] IRanges_2.29.1 snow_0.4-4 generics_0.1.1
#> [97] fit.models_0.64 DBI_1.1.2 withr_2.4.3
#> [100] haven_2.4.3 pillar_1.6.4 modelr_0.1.8
#> [103] crayon_1.4.2 utf8_1.2.2 plotly_4.10.0
#> [106] tzdb_0.2.0 rmarkdown_2.11 GetoptLong_1.0.5
#> [109] grid_4.2.0 readxl_1.3.1 data.table_1.14.2
#> [112] reprex_2.0.1 digest_0.6.29 gridGraphics_0.5-1
#> [115] textshaping_0.3.6 stats4_4.2.0 munsell_0.5.0
#> [118] viridisLite_0.4.0 ggplotify_0.1.0 bslib_0.3.1