Paper title: A Phylogeny-Regularized Sparse Regression Model for Predictive Modeling of Microbial Community Data
Scholar citations: 1
Pages: 12
Publication date: 2018.12
Journal: Frontiers in Microbiology
Authors: Jian Xiao, Li Chen, ..., and Jun Chen
Abstract:
Fueled by technological advancement, there has been a surge of human microbiome studies surveying the microbial communities associated with the human body and their links with health and disease. As a complement to the human genome, the human microbiome holds great potential for precision medicine. Efficient predictive models based on microbiome data could be potentially used in various clinical applications such as disease diagnosis, patient stratification and drug response prediction. One important characteristic of the microbial community data is the phylogenetic tree that relates all the microbial taxa based on their evolutionary history. The phylogenetic tree is an informative prior for more efficient prediction since the microbial community changes are usually not randomly distributed on the tree but tend to occur in clades at varying phylogenetic depths (clustered signal). Although community-wide changes are possible for some conditions, it is also likely that the community changes are only associated with a small subset of “marker” taxa (sparse signal). Unfortunately, predictive models of microbial community data taking into account both the sparsity and the tree structure remain under-developed. In this paper, we propose a predictive framework to exploit sparse and clustered microbiome signals using a phylogeny-regularized sparse regression model. Our approach is motivated by evolutionary theory, where a natural correlation structure among microbial taxa exists according to the phylogenetic relationship. A novel phylogeny-based smoothness penalty is proposed to smooth the coefficients of the microbial taxa with respect to the phylogenetic tree. Using simulated and real datasets, we show that our method achieves better prediction performance than competing sparse regression methods for sparse and clustered microbiome signals.
Paper outline:
1. Introduction
2. Methods
2.1 A Phylogeny-Induced Correlation Structure Among OTUs
2.2 Phylogeny-Regularized Sparse Generalized Linear Model
2.3 Connection With Existing Methods
2.4 Some Theoretical Properties
2.5 Model Estimation and Computational Complexity
3. Simulation Studies
3.1 Simulation Strategy
3.1.1 Simulating OTU Abundance Data
3.1.2 Selecting Outcome-Associated OTUs
3.1.3 Generating the Outcome Based on the Outcome-Associated OTUs
3.2 Competing Methods, Model Selection and Evaluation
3.2.1 Competing Methods
3.2.2 Model Selection and Evaluation
3.3 Simulation Results
3.3.1 Results for Continuous-Outcome Data
3.3.2 Results for Binary-Outcome Data
3.3.3 Comparison to SLS With Different Sparsity Levels in the Laplacian Matrix
4. Real Data Applications
4.1 Caffeine Intake Data
4.2 Smoking Data
5. Discussion
Excerpts from the main text:
1. Biological Problem: What biological problems have been solved in this paper?
2. Main discoveries: What are the main discoveries in this paper?
3. ML (Machine Learning) Methods: What are the ML methods applied in this paper?
Caffeine Intake Data: 98 samples and 499 OTUs
Smoking Data: 32 non-smokers and 28 smokers with 174 OTUs
4. ML Advantages: Why are these ML methods better than traditional methods for these biological problems?
5. Biological Significance: What is the biological significance of these ML methods’ results?
6. Prospect: What are the potential applications of these machine learning methods in biological science?
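The abstract describes a sparse regression model whose coefficients are smoothed with respect to the phylogenetic tree. These notes do not reproduce the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' estimator. It assumes a correlation matrix that decays exponentially with tree distance, a quadratic penalty built from the inverse of that matrix, a lasso term for sparsity, and a plain proximal-gradient solver. The function names, the `rho`, `lam1`, and `lam2` parameters, and the toy two-clade data are all hypothetical.

```python
import numpy as np


def phylo_correlation(dist, rho=1.0):
    """Assumed form: correlation between taxa decays with tree distance."""
    return np.exp(-2.0 * rho * dist)


def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def fit_phylo_sparse(X, y, omega, lam1=0.05, lam2=0.05, n_iter=2000):
    """Minimise (1/2n)||y - X b||^2 + (lam2/2) b' omega b + lam1 ||b||_1.

    Solved with proximal gradient descent (ISTA); an illustrative choice,
    not the algorithm used in the paper.
    """
    n, p = X.shape
    # The step size is the inverse Lipschitz constant of the smooth part.
    hessian = X.T @ X / n + lam2 * omega
    step = 1.0 / np.linalg.eigvalsh(hessian).max()
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n + lam2 * (omega @ beta)
        beta = soft_threshold(beta - step * grad, step * lam1)
    return beta


# Toy tree: six taxa in two clades, close within a clade, far between clades.
clade = np.array([0, 0, 0, 1, 1, 1])
dist = np.where(clade[:, None] == clade[None, :], 0.2, 2.0)
np.fill_diagonal(dist, 0.0)

# Penalty matrix: inverse of the assumed correlation matrix, symmetrised
# to remove floating-point asymmetry.
omega = np.linalg.inv(phylo_correlation(dist))
omega = (omega + omega.T) / 2.0

# Simulated "clustered" signal: only the first clade affects the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 6))
beta_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=80)

beta_hat = fit_phylo_sparse(X, y, omega)
print(np.round(beta_hat, 3))
```

In this sketch the quadratic term penalises differences between coefficients of closely related taxa more heavily than differences between distant ones, while the L1 term pushes coefficients of unrelated taxa toward zero. The real datasets listed above (98 samples with 499 OTUs, and 60 samples with 174 OTUs) would additionally need tuning of the penalty weights, for example by cross-validation.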
Posted: 2024-02-04 07:14:38