Algorithms for Analysis of Complicated Phenomena
DECISION SCIENCES

ANTONIO LIJOI AND COLLEAGUES PROPOSE A MODEL TO STUDY HETEROGENEOUS DATA

Non-parametric Bayesian inference is a flexible and effective approach to analyzing complex phenomena. It has proved successful in several applied fields, ranging from genomics to functional data analysis and from clinical trials to topic modeling, to mention just a few. Some interesting recent developments concern the analysis of DNA sequencing data, where non-parametric Bayesian models yield simple and intuitive tools for two tasks: predicting, from only a fraction of a genomic library, the number of new genes that an additional sample would reveal, and estimating the so-called sample coverage, that is, the probability that a further draw is a gene already observed.
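
To make these two quantities concrete, here is a minimal sketch in Python. It assumes the simplest non-parametric Bayesian prior, a Dirichlet process with concentration parameter theta; this choice, the function names and the parameter values are illustrative assumptions, not the models used in the research described here. Under that prior, after n observations the next draw is a new species (gene) with probability theta / (theta + n), which gives both quantities in closed form.

```python
def expected_new_species(theta: float, n: int, m: int) -> float:
    """Expected number of new distinct species among m further draws,
    after n observations, under a Dirichlet process prior (assumed)."""
    # Draw number n + i + 1 is new with probability theta / (theta + n + i).
    return sum(theta / (theta + n + i) for i in range(m))


def sample_coverage(theta: float, n: int) -> float:
    """Probability that the next draw is an already observed species,
    under the same assumed Dirichlet process prior."""
    return n / (theta + n)


if __name__ == "__main__":
    theta, n, m = 5.0, 1000, 500  # hypothetical values
    print(expected_new_species(theta, n, m))
    print(sample_coverage(theta, n))
```

In the Dirichlet process case the prediction depends on the data only through the sample size n; richer priors also use the number of distinct species already observed.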
 
Things get much more complicated when the data are heterogeneous, as in the case of DNA sequences coming from different tissues of an organism. This is the problem that Antonio Lijoi, Igor Prünster, Federico Camerlenghi, and Peter Orbanz tackle in “Distribution Theory for Hierarchical Processes”, forthcoming in the Annals of Statistics. The authors propose a general model for data affected by a source of heterogeneity, the typical setting of meta-analysis and one of great interest in machine learning applications. Patients treated in different hospitals and documents issued by different areas of the same organization are both examples of heterogeneous populations that share common features.
 
“Hierarchical processes are useful to address this problem. They arise as the composition of discrete random probability measures”, Antonio Lijoi says. “Our paper presents novel theoretical results and describes two classes of algorithms that can be readily implemented. On the one hand, the so-called ‘marginal’ algorithms provide approximate samples from predictive laws in heterogeneous populations. On the other hand, ‘conditional’ algorithms generate realizations of the underlying random probability measures conditionally on the data. They not only allow us to make predictions, but also yield a more reliable evaluation of the uncertainty associated with them”.
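
As an illustration of how a composition of discrete random probability measures lets groups share features, the sketch below simulates labels from the best-known special case, the hierarchical Dirichlet process, through its "Chinese restaurant franchise" representation. It samples from the prior only and is not the authors' marginal or conditional algorithm; the function names and the parameters alpha (group-level concentration) and gamma (top-level concentration) are assumptions made for the example.

```python
import random


def _pick(weights, u):
    """Return the index whose cumulative weight first exceeds u, else None."""
    acc = 0.0
    for idx, w in enumerate(weights):
        acc += w
        if u < acc:
            return idx
    return None


def sample_franchise(group_sizes, alpha, gamma, seed=0):
    """Simulate species labels for several groups that share species."""
    rng = random.Random(seed)
    dish_tables = []  # dish_tables[k]: tables, over all groups, serving species k
    labels = []
    for n_j in group_sizes:
        table_counts = []  # customers seated at each table of this group
        table_dish = []    # species served at each table of this group
        group_labels = []
        for i in range(n_j):
            # Existing table with weight = its count, new table with weight alpha.
            t = _pick(table_counts, rng.random() * (i + alpha))
            if t is None:
                # A new table draws its species from the shared top level:
                # existing species k with weight dish_tables[k], new with gamma.
                total = sum(dish_tables)
                k = _pick(dish_tables, rng.random() * (total + gamma))
                if k is None:
                    dish_tables.append(0)
                    k = len(dish_tables) - 1
                dish_tables[k] += 1
                table_counts.append(0)
                table_dish.append(k)
                t = len(table_counts) - 1
            table_counts[t] += 1
            group_labels.append(table_dish[t])
        labels.append(group_labels)
    return labels


if __name__ == "__main__":
    # Two hypothetical groups, e.g. two tissues or two hospitals.
    for group in sample_franchise([10, 10], alpha=1.0, gamma=1.0, seed=42):
        print(group)
```

Because new tables in every group draw from the same top-level pool, a species that first appears in one group can reappear in another, which is the sharing of common features across heterogeneous populations described above.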
 
Promising extensions of this work concern survival analysis in the presence of covariate-dependent data.

Read more about this topic:
Riccardo Zecchina. Teaching Machines How to Learn to Improve Business and Life
Carlo Baldassi. Learning Is a Quantum Question
Daniele Durante. How to Study Dynamic Networks
Dirk Hovy. The Algorithm that Prevents Suicide
Alessia Melegaro. The Network Modeling Chip that Fights Influenza
Raffaella Piccarreta, Marco Bonetti. A (Statistical) Model for Life
 

by Claudio Todesco