BBB Research: Analysis of High-Dimensional Data with Applications

Branch investigators develop methods for analyzing and modeling high-dimensional data in a variety of contexts, such as actigraphy, chemical exposure studies, quantitative high-throughput screening (qHTS) assays, genetics, epigenetics, other -omics fields, and modern image analysis. These methods span both frequentist and Bayesian approaches. Although often motivated by data generated by investigators in other DIPHR branches, they are general and may be broadly applicable to other data.

In addition to developing methods, our investigators in some cases also develop user-friendly software and code that are publicly available for free download. Examples include our Analysis of Compositions of Microbiomes (ANCOM) software, which compares the abundance of individual taxa in two populations using log-ratios of abundance, and our Order-Restricted Inference for Ordered Gene ExpressioN (ORIOGEN) software, which analyzes multi-group, multi-feature gene expression data.
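To make the log-ratio idea concrete, the following is a minimal Python sketch loosely inspired by that approach. It is not the ANCOM software and does not reproduce its actual test or multiplicity adjustment. For each taxon, it counts how many other taxa yield a between-group difference in the log-ratio of abundances. The function names (`welch_t`, `log_ratio_counts`), the Welch t-statistic with a fixed cutoff, the pseudocount, and the toy counts are all assumptions made purely for illustration.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t statistic (illustrative stand-in for a formal test)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0:
        # Degenerate case: no within-group spread at all.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def log_ratio_counts(group1, group2, t_cutoff=3.0, pseudo=0.5):
    """For each taxon i, count the taxa j for which log(x_i / x_j)
    differs between the two groups (|t| > t_cutoff).

    group1, group2: lists of samples; each sample is a list of taxon counts.
    pseudo: hypothetical pseudocount added so that zero counts can be logged.
    """
    n_taxa = len(group1[0])
    counts = []
    for i in range(n_taxa):
        w = 0
        for j in range(n_taxa):
            if i == j:
                continue
            lr1 = [math.log((s[i] + pseudo) / (s[j] + pseudo)) for s in group1]
            lr2 = [math.log((s[i] + pseudo) / (s[j] + pseudo)) for s in group2]
            if abs(welch_t(lr1, lr2)) > t_cutoff:
                w += 1
        counts.append(w)
    return counts


# Invented toy data: 4 samples per group, 4 taxa; only taxon 0 shifts in group 2.
group1 = [[10, 20, 30, 40], [12, 22, 28, 38], [9, 19, 31, 42], [11, 21, 29, 41]]
group2 = [[100, 20, 31, 40], [110, 21, 29, 39], [95, 19, 30, 41], [105, 22, 28, 42]]

counts = log_ratio_counts(group1, group2)
print(counts)
```

In this toy example, a taxon whose count is high relative to the others (here, taxon 0) is the one whose abundance appears to differ between groups; the remaining taxa differ only in their ratio to it. Working with ratios rather than raw counts is what makes the comparison insensitive to differences in total sequencing depth between samples.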

In response to the applications arising in our division, some of our investigators are currently expanding their research interests and expertise into machine learning and other areas of artificial intelligence.

Principal Investigators

Zhen Chen, Ph.D.; Aiyi Liu, Ph.D.; and Rajeshwari Sundaram, M.Stat., Ph.D.

Selected Publications

Hwang, B. S., Chen, Z., Buck Louis, G. M., & Albert, P. S. (2019). A Bayesian multi-dimensional couple-based latent risk model with an application to infertility. Biometrics, 75(1), 315-325. PMID: 30267541

Zhang, W., Chen, Z., Liu, A., & Buck Louis, G. M. (2019). A weighted kernel machine regression approach to environmental pollutants and infertility. Statistics in Medicine, 38(5), 809-827. PMID: 30328128

Lum, K. J., Sundaram, R., Barr, D. B., Louis, T. A., & Buck Louis, G. M. (2017). Perfluoroalkyl chemicals, menstrual cycle length, and fecundity: findings from a prospective pregnancy study. Epidemiology, 28(1), 90-98. PMID: 27541842. PMCID: PMC5131715

Reese, S. E., Zhao, S., Wu, M. C., Joubert, B. R., Parr, C. L., Håberg, S. E., Ueland, P. M., Nilsen, R. M., Midttun, O., Vollset, S. E., Peddada, S. D., Nystad, W., & London, S. J. (2017). DNA methylation score as a biomarker in newborns for sustained maternal smoking during pregnancy. Environmental Health Perspectives, 125(4), 760-766. PMID: 27323799

Joubert, B. R., den Dekker, H. T., Felix, J. F., Bohlin, J., Ligthart, S., Beckett, E., Tiemeier, H., … Peddada, S. D., Jaddoe, V. W. V., Nystad, W., Duijts, L., & London, S. J. (2016). Maternal plasma folate impacts differential DNA methylation in an epigenome-wide meta-analysis of newborns. Nature Communications, 7, 10577. PMID: 26861414

Lim, C., Sen, P. K., & Peddada, S. D. (2013). Robust analysis of high throughput screening assays. Technometrics, 55, 150-160.

Fernandez, M., Rueda, C., & Peddada, S. D. (2012). Identification of a core set of signature cell-cycle genes whose relative order of time to peak expression is conserved across species. Nucleic Acids Research, 40(7), 2823-2832. PMID: 22135306

Peddada, S. D., Lobenhofer, E. K., Li, L., Afshari, C. A., Weinberg, C. R., & Umbach, D. M. (2003). Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference. Bioinformatics (Oxford, England), 19(7), 834-841. PMID: 12724293