Introduction

We can use masscleaner for missing value (MV) imputation.

First, we need to prepare the data for masscleaner.

Data preparation

Load the data saved in the previous step.

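# Attach the packages that provide get_mv_number() and impute_mv()
library(massdataset)
library(masscleaner)
load("peak_tables/POS/object")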
get_mv_number(object)
#> [1] 0
head(massdataset::get_mv_number(object, by = "sample"))
#> sample_QC_01    sample_01    sample_02    sample_06    sample_07    sample_11 
#>            0            0            0            0            0            0
head(massdataset::get_mv_number(object, by = "variable"))
#>  M70T53_POS M70T527_POS M71T775_POS M71T669_POS M71T715_POS  M71T54_POS 
#>           0           0           0           0           0           0

head(massdataset::get_mv_number(object, by = "sample", show_by = "percentage"))
#> sample_QC_01    sample_01    sample_02    sample_06    sample_07    sample_11 
#>            0            0            0            0            0            0
head(massdataset::get_mv_number(object, by = "variable", show_by = "percentage"))
#>  M70T53_POS M70T527_POS M71T775_POS M71T669_POS M71T715_POS  M71T54_POS 
#>           0           0           0           0           0           0
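
Because the example object contains no missing values, every count above is 0. To make the different counts easier to read, here is a minimal base R sketch of what counting MVs by sample and by variable means, using a small made-up matrix. The matrix, its names, and the 0 to 100 percentage scale are assumptions for illustration; this is not the massdataset implementation.

```r
# Toy expression matrix: rows are variables (features), columns are samples.
# All names and values are made up for illustration.
expression_data <- matrix(
  c(10, NA, 30,
    NA, NA, 60,
    70, 80, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("var_1", "var_2", "var_3"),
    c("sample_01", "sample_02", "sample_03")
  )
)

is_missing <- is.na(expression_data)

mv_total <- sum(is_missing)            # total number of MVs: 3
mv_by_sample <- colSums(is_missing)    # MVs in each sample (column)
mv_by_variable <- rowSums(is_missing)  # MVs in each variable (row)

# Share of missing entries within each sample / variable,
# expressed on a 0-100 scale (an assumption of this sketch)
mv_pct_by_sample <- colMeans(is_missing) * 100
mv_pct_by_variable <- rowMeans(is_missing) * 100
```

Here `sample_02` has two missing entries out of three, so its percentage is about 66.7 on the scale used in this sketch.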

Impute missing values

zero

object_zero = 
  impute_mv(object = object, method = "zero")
get_mv_number(object_zero)
#> [1] 0
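
As an illustration of the idea behind zero imputation, the base R sketch below replaces every missing entry of a toy matrix with 0. The matrix and its names are hypothetical, and the sketch only mirrors the concept; it does not call masscleaner.

```r
# Toy matrix with two missing entries (values are made up)
toy_data <- matrix(
  c(5, NA,
    NA, 8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("var_1", "var_2"), c("sample_01", "sample_02"))
)

# Zero imputation: every NA becomes 0, observed values are untouched
toy_zero <- toy_data
toy_zero[is.na(toy_zero)] <- 0

sum(is.na(toy_zero))
```

Zero imputation is simple, but it treats a missing signal as a true absence, which is why the note later in this article reserves it for blank samples.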

KNN

object = 
  impute_mv(object = object, method = "knn")
get_mv_number(object)
#> [1] 0
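
To give an intuition for k-nearest-neighbour imputation, here is a simplified base R sketch. It is not the algorithm used by `impute_mv()`; the function name `knn_impute_sketch`, the choice of variables (rows) as neighbours, and the root-mean-square distance over shared columns are all assumptions made for illustration.

```r
# Simplified KNN imputation (illustrative only, hypothetical helper).
# Rows are variables, columns are samples. Each missing entry is replaced
# by the mean of the same column in the k most similar rows.
knn_impute_sketch <- function(x, k = 2) {
  filled <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    # Distance from row i to every other row, using only the columns
    # that are observed in both rows
    dists <- vapply(seq_len(nrow(x)), function(j) {
      if (j == i) return(Inf)
      shared <- !is.na(x[i, ]) & !is.na(x[j, ])
      if (!any(shared)) return(Inf)
      sqrt(mean((x[i, shared] - x[j, shared])^2))
    }, numeric(1))

    # Keep at most k neighbours with a finite distance
    neighbours <- order(dists)[seq_len(min(k, sum(is.finite(dists))))]

    # Fill each missing column of row i with the neighbours' mean
    for (col in which(is.na(x[i, ]))) {
      filled[i, col] <- mean(x[neighbours, col], na.rm = TRUE)
    }
  }
  filled
}

# Toy data: f1 is missing its last value; f2 and f3 resemble it closely
toy_data <- rbind(
  f1 = c(1, 2, 3, NA),
  f2 = c(1, 2, 3, 4),
  f3 = c(1.1, 2.1, 3.1, 4.2),
  f4 = c(10, 20, 30, 40)
)

toy_knn <- knn_impute_sketch(toy_data, k = 2)
```

With `k = 2`, the nearest rows to `f1` are `f2` and `f3`, so the missing value is filled with the mean of 4 and 4.2. The sketch returns `NaN` when no neighbour has an observed value in the missing column, a case a production implementation would need to handle.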

More methods are listed in the help page: ?impute_mv.

Note

If the dataset contains blank samples, impute missing values separately for blank and non-blank samples.

For blank samples, use the zero method.

For non-blank samples, use knn or another method.
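
The rule for blank and non-blank samples can be sketched in base R on a toy matrix. Everything here is hypothetical: the sample names, the `class` column with the value `"Blank"`, and the per-variable mean, which is only a simple stand-in for knn or another method. It shows the split-then-impute logic, not a masscleaner workflow.

```r
# Toy data: two blank samples and three subject samples (made-up values)
expression_data <- matrix(
  c(NA,  0.5, 100, NA, 120,
    0.2, NA,   50, 55,  NA),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("var_1", "var_2"),
    c("blank_01", "blank_02", "sample_01", "sample_02", "sample_03")
  )
)

# Hypothetical sample annotation; the class labels are assumptions
sample_info <- data.frame(
  sample_id = colnames(expression_data),
  class = c("Blank", "Blank", "Subject", "Subject", "Subject")
)
is_blank <- sample_info$class == "Blank"

imputed <- expression_data

# Blank samples: replace missing values with zero
blank_part <- imputed[, is_blank, drop = FALSE]
blank_part[is.na(blank_part)] <- 0
imputed[, is_blank] <- blank_part

# Non-blank samples: per-variable mean of the observed non-blank values
# (a stand-in for knn or another method)
non_blank_part <- imputed[, !is_blank, drop = FALSE]
for (i in seq_len(nrow(non_blank_part))) {
  missing <- is.na(non_blank_part[i, ])
  non_blank_part[i, missing] <- mean(non_blank_part[i, !missing])
}
imputed[, !is_blank] <- non_blank_part
```

Keeping the two groups separate prevents the low or absent signals in blanks from pulling down the values imputed for real samples.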

Save the data for the next analysis step.

save(object, file = "peak_tables/POS/object")

Session information

sessionInfo()
#> R Under development (unstable) (2022-01-11 r81473)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur/Monterey 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] forcats_0.5.1      stringr_1.4.0      dplyr_1.0.7        purrr_0.3.4       
#>  [5] readr_2.1.1        tidyr_1.1.4        tibble_3.1.6       ggplot2_3.3.5     
#>  [9] tidyverse_1.3.1    magrittr_2.0.1     tinytools_0.9.1    massdataset_0.99.1
#> [13] masscleaner_0.9.2 
#> 
#> loaded via a namespace (and not attached):
#>   [1] colorspace_2.0-2      rjson_0.2.21          ellipsis_0.3.2       
#>   [4] class_7.3-20          leaflet_2.0.4.1       rprojroot_2.0.2      
#>   [7] circlize_0.4.14       GlobalOptions_0.1.2   fs_1.5.2             
#>  [10] clue_0.3-60           rstudioapi_0.13       proxy_0.4-26         
#>  [13] ggrepel_0.9.1         lubridate_1.8.0       fansi_1.0.0          
#>  [16] mvtnorm_1.1-2         xml2_1.3.3            codetools_0.2-18     
#>  [19] doParallel_1.0.16     cachem_1.0.6          impute_1.69.0        
#>  [22] robustbase_0.93-8     knitr_1.37            itertools_0.1-3      
#>  [25] jsonlite_1.7.2        broom_0.7.11          dbplyr_2.1.1         
#>  [28] cluster_2.1.2         png_0.1-7             missForest_1.4       
#>  [31] rrcov_1.6-0           compiler_4.2.0        httr_1.4.2           
#>  [34] backports_1.4.1       assertthat_0.2.1      fastmap_1.1.0        
#>  [37] lazyeval_0.2.2        cli_3.1.0             htmltools_0.5.2      
#>  [40] tools_4.2.0           gtable_0.3.0          glue_1.6.0           
#>  [43] Rcpp_1.0.7            Biobase_2.55.0        cellranger_1.1.0     
#>  [46] jquerylib_0.1.4       pkgdown_2.0.1         vctrs_0.3.8          
#>  [49] iterators_1.0.13      crosstalk_1.2.0       xfun_0.29            
#>  [52] rvest_1.0.2           openxlsx_4.2.5        lifecycle_1.0.1      
#>  [55] DEoptimR_1.0-10       MASS_7.3-55           scales_1.1.1         
#>  [58] ragg_1.2.1            pcaMethods_1.87.0     clisymbols_1.2.0     
#>  [61] hms_1.1.1             parallel_4.2.0        RColorBrewer_1.1-2   
#>  [64] ComplexHeatmap_2.11.0 yaml_2.2.1            memoise_2.0.1        
#>  [67] pbapply_1.5-0         yulab.utils_0.0.4     sass_0.4.0           
#>  [70] stringi_1.7.6         S4Vectors_0.33.10     desc_1.4.0           
#>  [73] pcaPP_1.9-74          foreach_1.5.1         randomForest_4.6-14  
#>  [76] e1071_1.7-9           BiocGenerics_0.41.2   zip_2.2.0            
#>  [79] BiocParallel_1.29.10  shape_1.4.6           rlang_0.4.12         
#>  [82] pkgconfig_2.0.3       systemfonts_1.0.3     matrixStats_0.61.0   
#>  [85] evaluate_0.14         lattice_0.20-45       patchwork_1.1.1      
#>  [88] htmlwidgets_1.5.4     tidyselect_1.1.1      robust_0.6-1         
#>  [91] ggsci_2.9             plyr_1.8.6            R6_2.5.1             
#>  [94] IRanges_2.29.1        snow_0.4-4            generics_0.1.1       
#>  [97] fit.models_0.64       DBI_1.1.2             withr_2.4.3          
#> [100] haven_2.4.3           pillar_1.6.4          modelr_0.1.8         
#> [103] crayon_1.4.2          utf8_1.2.2            plotly_4.10.0        
#> [106] tzdb_0.2.0            rmarkdown_2.11        GetoptLong_1.0.5     
#> [109] grid_4.2.0            readxl_1.3.1          data.table_1.14.2    
#> [112] reprex_2.0.1          digest_0.6.29         gridGraphics_0.5-1   
#> [115] textshaping_0.3.6     stats4_4.2.0          munsell_0.5.0        
#> [118] viridisLite_0.4.0     ggplotify_0.1.0       bslib_0.3.1