HYPOTHESIS TESTING

Multivariate Independence and k-sample Testing

Panda S.
Master's Thesis · Johns Hopkins University · BME · 2020
In one sentence — ai-generated summary

Two years of work on multivariate hypothesis testing, covering distance correlation, MGC, and nonparametric MANOVA, pulled into one framework, with the open-source hyppo package as its software contribution.

Abstract

With the increase in the amount of data in many fields, a method to consistently and efficiently decipher relationships within high-dimensional data sets is important. Because many modern data sets are multivariate, univariate tests are not applicable. While many multivariate independence tests are available as R packages, their interfaces are inconsistent and most are not available in Python. We introduce hyppo, which includes many state-of-the-art multivariate testing procedures. This thesis details the implementation of each test within hyppo and provides extensive power and run-time benchmarks on a suite of high-dimensional simulations previously used in other publications. The documentation and all releases for hyppo are available at https://hyppo.neurodata.io.
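
To make the idea of a multivariate independence test concrete, the following is a minimal NumPy sketch of a distance correlation permutation test, assuming the biased (V-statistic) estimator and Euclidean distances. It is not taken from hyppo and does not reproduce its API or its optimizations; the function names `centered_distances`, `dcorr_stat`, and `dcorr_permutation_test`, and the parameters `reps` and `seed`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def centered_distances(z):
    """Double-centered Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]  # treat a 1-D input as n samples of dimension 1
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    row_means = d.mean(axis=1, keepdims=True)
    col_means = d.mean(axis=0, keepdims=True)
    return d - row_means - col_means + d.mean()


def dcorr_stat(a, b):
    """Biased squared distance correlation from two centered matrices."""
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(dcov2 / denom)


def dcorr_permutation_test(x, y, reps=200, seed=0):
    """Return (statistic, p-value) for the null that x and y are independent."""
    rng = np.random.default_rng(seed)
    a, b = centered_distances(x), centered_distances(y)
    stat = dcorr_stat(a, b)
    null = np.empty(reps)
    for i in range(reps):
        # Permuting the samples of y permutes rows and columns of b together.
        perm = rng.permutation(b.shape[0])
        null[i] = dcorr_stat(a, b[np.ix_(perm, perm)])
    # Add-one correction keeps the p-value strictly positive.
    pvalue = (1 + np.sum(null >= stat)) / (1 + reps)
    return stat, float(pvalue)


# Usage: a 3-dimensional x with a dependent and an independent 2-dimensional y.
rng = np.random.default_rng(1)
x = rng.normal(size=(100, 3))
y_dep = x @ np.ones((3, 2)) + 0.1 * rng.normal(size=(100, 2))
y_ind = rng.normal(size=(100, 2))

stat_dep, p_dep = dcorr_permutation_test(x, y_dep)
stat_ind, p_ind = dcorr_permutation_test(x, y_ind)
print(f"dependent:   stat={stat_dep:.3f}, p={p_dep:.3f}")
print(f"independent: stat={stat_ind:.3f}, p={p_ind:.3f}")
```

The permutation step only reindexes the already-centered matrix, so the distance computation runs once per input rather than once per replicate. This sketch costs O(n^2) memory and O(reps * n^2) time, which is fine for a few hundred samples but is only a didactic baseline.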

Advisor: Joshua T. Vogelstein

BibTeX
@mastersthesis{panda2020multivariate,
  title = {Multivariate {{Independence}} and K-Sample {{Testing}}},
  author = {Panda, Sambit},
  year = 2020,
  month = may,
  langid = {american},
  school = {Johns Hopkins University},
}