Material Detail
Boosting statistical network inference by incorporating prior knowledge from multiple sources
This video was recorded at the 6th International Workshop on Machine Learning in Systems Biology (MLSB), Basel 2012. Statistical learning methods, such as Bayesian networks, have become highly popular for inferring cellular networks from high-throughput experiments. However, the inherent noise in experimental data, together with typically small sample sizes, limits their performance and leads to high false positive and false negative rates. Incorporating prior knowledge into the learning process has therefore been identified as a way to address this problem, and in principle a mechanism for doing so has been devised (Mukherjee & Speed, 2008). So far, however, little attention has been paid to the fact that prior knowledge is typically distributed across multiple, heterogeneous knowledge sources (e.g. GO, KEGG, HPRD). Here we propose two methods for constructing an informative network prior from multiple knowledge sources: our first model is a latent factor model using Bayesian inference; our second is a Noisy-OR model, which treats the overall prior as a non-deterministic effect of the participating information sources. Both models are compared to a naïve method that assumes independence of the knowledge sources. Extensive simulation studies on artificially created networks as well as full KEGG pathways reveal a significant improvement of both proposed methods over the naïve model. The performance of the latent factor model increases with network size, whereas for smaller networks the Noisy-OR model appears superior.
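The abstract describes the Noisy-OR model only verbally. The snippet below is a minimal sketch, assuming the standard Noisy-OR parameterization, of how binary edge reports from several knowledge sources (e.g. GO, KEGG, HPRD) might be combined into a single prior edge probability. The function name, reliability values, and leak parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical sketch (not the authors' code): each knowledge source k either
# reports a candidate edge (s_k = 1) or not (s_k = 0) and has an assumed
# reliability q_k. Under the standard Noisy-OR parameterization, the combined
# prior probability of the edge is
#     P(edge) = 1 - (1 - leak) * prod_k (1 - q_k)^{s_k}.

def noisy_or_edge_prior(reports, reliabilities, leak=0.0):
    """Combine per-source edge reports into one prior edge probability.

    reports       -- binary vector, 1 if source k contains the edge
    reliabilities -- per-source reliability q_k in [0, 1]
    leak          -- probability of the edge when no source supports it
    """
    reports = np.asarray(reports, dtype=float)
    reliabilities = np.asarray(reliabilities, dtype=float)
    # Probability that neither the leak nor any reporting source "causes" the edge.
    fail_all = (1.0 - leak) * np.prod((1.0 - reliabilities) ** reports)
    return 1.0 - fail_all


if __name__ == "__main__":
    # Illustrative numbers only: three sources with assumed reliabilities;
    # the candidate edge appears in the first two sources.
    print(noisy_or_edge_prior(reports=[1, 1, 0],
                              reliabilities=[0.6, 0.7, 0.5],
                              leak=0.01))
```

In contrast, a naïve independence model of the kind the abstract mentions as a baseline would simply multiply the per-source probabilities, which double-counts sources that share underlying evidence; the Noisy-OR combination saturates instead as more sources agree.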