**Scientific Director**

Istituto "Dalle Molle" di Studi sull'Intelligenza Artificiale (IDSIA)

Polo universitario Lugano

Via la Santa 1

CH-6962 Lugano, Switzerland

Tel.: +41 (0)58 666 666 5

Fax: +41 (0)58 666 666 1

E-mail:
.

Dblp: 08/4633.

Orcid: 0000-0001-8908-1502.

Scholar: ECEcK4cAAAAJ.

Scopus: 6602149061.

Semanticscholar: 1825163.

Last update: April 28, 2024.

Science.

There are three stages in scientific discovery: first, people deny that it is true, then they deny that it is important; finally they credit the wrong person.(Alexander Von Humboldt)Art.

This fellow has a problem ... he considers boredom an Art.(Charles Bukowski)Ethics.

An ethicist is someone who sees something wrong with whatever you have in mind.(Marvin Minsky)Love.

A love relation's stages: don't ever change; you should change; you have changed.(Anonymous)Naive realism.

Have you ever noticed that anybody driving slower than you is an idiot, and anyone going faster than you is a maniac?(George Carlin)Agency.

Free will is the biology we haven't discovered yet.(Robert Sapolsky)Religion.

Thanks to god for making me an atheist.(Ricky Gervais)ChatGPT.

The "I" in LLM stands for intelligence.(Daniel Stenberg)Hanlon's razor.

Don't ascribe to malice what can be plainly explained with incompetence.(Robert J. Hanlon)

- M.Sc. degree cum laude in Computer Science in 1991 and Ph.D. degree in Applied Mathematics in 1997, both from Università degli Studi di Milano, Italy.
- At IDSIA since 1997. Founder and head of the Imprecise Probability Group.
- Co-chair of the conferences ISIPTA '03, ISIPTA '07, and chair of the 1st SIPTA School on Imprecise Probabilities.
- Formerly: founding Member, Secretary, and President of SIPTA.
- Senior Area Editor for imprecise probabilities at IJAR since 2018 (Area Editor during 2005–2017).
- SUPSI Professor since 2009.
- Lecturer: Causal Inference, 2018–2020 [book: Primer; slides: all (revised May 2023)]; Uncertain Reasoning and Data Mining, 2008–2019; Probability and Statistics, 2005–2009.
- Program Committee Member for several meetings, including: AAAI, CLearR, ECAI, ECMLPKDD, IPMU, ISIPTA, IJCAI, NeurIPS, PGM, SMPS, UAI, WHY.

Senior Program Committee Member for IJCAI 2021–2023, UAI 2021–2023, AAAI 2022–2023. - Scientific Director of IDSIA since 2019.
- Co-founder and Chief Scientist of Artificialy since 2020.
- Top 2% scientist according to Stanford's science-wide author databases of standardised citation indicators (2019–2022).

Writing.I am writing a longer letter than usual because there is not enough time to write a short one.(Blaise Pascal)

- Antonucci, A., Piqué, G., Zaffalon, M. (2024). Zero-shot causal graph extrapolation from text via LLMs. In
*XAI4Sci-24*@ AAAI 2024. - Kanjirangat, V., Antonucci, A., Zaffalon, M. (2024). On the limitations of zero-shot classification of causal relations by LLMs. In Campos, R., Jorge, A. M., Jatowt, A., Bhatia, S., Litvak, M. (Eds),
*Text2Story'24*@ ECIR 2024, CEUR Workshop Proceedings**3671**, 151–160. - Zaffalon, M., Antonucci, A., Cabañas, R., Huber, D., Azzimonti, D. (2024). Efficient computation of counterfactual bounds.
*International Journal of Approximate Reasoning***tbd**, 109111. - Benavoli, A., Facchini, A., Zaffalon, M. (2023). Closure operators, classifiers and desirability. In Quaeghebeur, E., Miranda, E., Montes, I., Vantaggi, B. (Eds),
*ISIPTA '23*. PMLR**215**, JMLR.org, 25–36. - Casanova, A., Benavoli, A., Zaffalon, M. (2023). Nonlinear desirability as a linear classification problem.
*International Journal of Approximate Reasoning***152**, 1–32. - Huber, D., Chen, Y., Antonucci, A., Darwiche, A. and Zaffalon, M. (2023). Tractable bounding of counterfactual queries by knowledge compilation. In
*TPM-23*@ UAI 2023. - Miranda, E., Zaffalon, M. (2023). Nonlinear desirability theory.
*International Journal of Approximate Reasoning***154**, 176–199. - Schürch, M., Azzimonti, D., Benavoli, A., Zaffalon, M. (2023). Correlated product of experts for sparse Gaussian process regression.
*Machine Learning***112**, 1411–1432. - Zaffalon, M., Antonucci, A., Cabañas, R., Huber, D. (2023). Approximating counterfactual bounds while fusing observational, biased and randomised data sources.
*International Journal of Approximate Reasoning***162**, 109023. - Zaffalon, M., Antonucci, A., Szher, M. (2023). Solving the Allais paradox by counterfactual harm. In Quaeghebeur, E., Miranda, E., Montes, I., Vantaggi, B. (Eds),
*ISIPTA '23*. - Benavoli, A., Facchini, A., Zaffalon, M. (2022). Quantum indistinguishability through exchangeability.
*International Journal of Approximate Reasoning***151**, 389–412. - Casanova, A., Kohlas, J., Zaffalon, M. (2022). Information algebras in the theory of imprecise probabilities.
*International Journal of Approximate Reasoning***142**, 383–416. - Casanova, A., Kohlas, J., Zaffalon, M. (2022). Information algebras in the theory of imprecise probabilities, an extension.
*International Journal of Approximate Reasoning***150**, 311–336. - Zaffalon, M., Antonucci, A., Cabañas, R., Huber, D., Azzimonti, D. (2022). Bounding counterfactuals under selection bias. In Salmerón, A., Rumí, R. (Eds),
*PGM 2022*. PMLR**186**, JMLR.org, 289–300. - Benavoli, A., Facchini, A., Zaffalon, M. (2021). Quantum indistinguishability through exchangeable desirable gambles. In de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),
*ISIPTA '21*. PMLR**147**, JMLR.org, 22–31. - Benavoli, A., Facchini, A., Zaffalon, M. (2021). The weirdness theorem and the origin of quantum paradoxes.
*Foundations of Physics***51**(5), 95. - Casanova, A., Benavoli, A., Zaffalon, M. (2021). Nonlinear desirability as a linear classification problem. In de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),
*ISIPTA '21*. PMLR**147**, JMLR.org, 61–71. - Casanova, A., Kohlas, J., Zaffalon, M. (2021). Algebras of sets and coherent sets of gambles. In Vejnarová, J., Wilson, N. (Eds),
*ECSQARU 2021*. Lecture Notes in Artificial Intelligence**12897**, Springer, 603–615. - Casanova, A., Miranda, E., Zaffalon, M. (2021). Joint desirability foundations of social choice and opinion pooling.
*Annals of Mathematics and Artificial Intelligence***89**(10–11), 965–1011. - Corani, G., Benavoli, A., Zaffalon, M. (2021). Time series forecasting with Gaussian processes needs priors. In
*ECML PKDD 2021*. Lecture notes in Computer Science**4**, Springer, 103–117. - Kohlas, J., Casanova, A., Zaffalon, M. (2021). Information algebras of coherent sets of gambles in general possibility spaces. In de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),
*ISIPTA '21*. PMLR**147**, JMLR.org, 191–200. - Zaffalon, M., Miranda, E. (2021). Desirability foundations of robust rational decision making.
*Synthese***198**(27), S6529–S6570 (published online in 2018). - Zaffalon, M., Antonucci, A., Cabañas, R. (2021). Causal expectation-maximisation. In
*WHY-21*@ NeurIPS 2021. - Zaffalon, M., Miranda, E. (2021). The sure thing. In de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),
*ISIPTA '21*. PMLR**147**, JMLR.org, 342–351. - Cabañas, R., Antonucci, A., Huber, D., Zaffalon, M. (2020). CREDICI: a Java library for causal inference by credal networks. In Jaeger, M., Nielsen, T. D. (Eds),
*PGM 2020*. PMLR**138**, JMLR.org, 597–600. - Casanova, A., Miranda, E., Zaffalon, M. (2020). Social pooling of beliefs and values with desirability. In
*FLAIRS-33*. AAAI, 569–574. - Corani, G., Azzimonti, G., Augusto, J., Zaffalon, M. (2020). Probabilistic reconciliation of hierarchical forecast via Bayes' rule. In Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (Eds),
*ECML PKDD 2020*. Lecture notes in Computer Science**12459**. Springer, 211–226. - Huber, D., Cabañas, R., Antonucci, A., Zaffalon, M. (2020). CREMA: a Java library for credal network inference. In Jaeger, M., Nielsen, T. D. (Eds),
*PGM 2020*. PMLR**138**, JMLR.org, 613–616. - Kern, H., Corani, G., Huber, D., Vermes, N., Zaffalon, M., Varini, M., Wenzel, C., Fringer, A. (2020). Impact on place of death in cancer patients: a causal exploration in southern Switzerland.
*BMC Palliative Care***19**, 160. - Miranda, E., Zaffalon, M. (2020). Compatibility, desirability, and the running intersection property.
*Artificial Intelligence***283**, 103274. - Schürch, M., Azzimonti, D., Benavoli, A., Zaffalon, M. (2020). Recursive estimation for sparse Gaussian process regression.
*Automatica***120**, 109127. - Yoo, J., Kang, U, Scanagatta, M., Corani, G., Zaffalon, M. (2020). Sampling subgraphs with guaranteed treewidth for accurate and efficient graphical inference. In Caverlee, J., Hu, X., Lalmas, M., Wang, W. (Eds),
*WSDM '20*. ACM, 708–716. - Zaffalon, M., Antonucci, A., Cabañas, R. (2020). Structural causal models are (solvable by) credal networks. In Jaeger, M., Nielsen, T. D. (Eds),
*PGM 2020*. PMLR**138**, JMLR.org, 581–592. - Azzimonti, L., Corani, G., Zaffalon, M. (2019). Hierarchical estimation of parameters in Bayesian networks.
*Computational Statistics and Data Analysis***137**, 67–91. - Benavoli, A., Facchini, A., Piga, D., Zaffalon, M. (2019). Sum-of-squares for bounded rationality.
*International Journal of Approximate Reasoning***105**, 130–152. - Benavoli, A., Facchini, A., Zaffalon, M. (2019). Bernstein's socks, polynomial-time provable coherence and entanglement. In de Bock, J., de Campos, C., de Cooman, G., Quaeghebeur, E., Wheeler, G. (Eds),
*ISIPTA '19*. PMLR**103**, JMLR.org, 23–31. - Miranda, E., Zaffalon, M. (2019). Compatibility, coherence and the RIP. In Destercke, S., Denoeux, T., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2018,
*Uncertainty Modelling in Data Science*, Advances in Intelligent and Soft Computing**832**, Springer, 166–174. - De Campos, C. P., Scanagatta, M., Corani, G., Zaffalon, M. (2018). Entropy-based pruning for learning Bayesian networks using BIC.
*Artificial Intelligence***260**, 42–50. - Scanagatta, M., Corani, G., de Campos, C. P., Zaffalon, M. (2018). Approximate structure learning for large Bayesian networks.
*Machine Learning***107**(8–10), 1209–1227. - Scanagatta, M., Corani, G., Zaffalon, M., Yoo, J., Kang, U (2018). Efficient learning of bounded-treewidth Bayesian networks from complete and incomplete data sets.
*International Journal of Approximate Reasoning***95**, 152–166. - Azzimonti, L., Corani, G., Zaffalon, M. (2017). Hierarchical multinomial-Dirichlet model for the estimation of conditional probability tables. In Raghavan, V., Aluru, S., Karypis, G., Miele, L., Wu, X. (Eds),
*ICDM 2017*, IEEE, 739–744. - Benavoli, A., Corani, G., Demšar, J., Zaffalon, M. (2017). Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis.
*Journal of Machine Learning Research***18**(37), 1–36. - Benavoli, A., Facchini, A., Piga, D., Zaffalon, M. (2017). SOS for bounded rationality. In Antonucci, A., Corani, G., Couso, I., Destercke, S. (Eds),
*ISIPTA '17*. PMLR**62**, JMLR.org, 25–36. - Benavoli, A., Facchini, A., Zaffalon, M. (2017). A Gleason-type theorem for any dimension based on a gambling formulation of quantum mechanics.
*Foundations of Physics***47**(7), 991–1002. - Benavoli, A., Facchini, A., Zaffalon, M., Vicente-Pérez, J. (2017). A polarity theory for sets of desirable gambles. In Antonucci, A., Corani, G., Couso, I., Destercke, S. (Eds),
*ISIPTA '17*. PMLR**62**, JMLR.org, 37–48. - Corani, G., Benavoli, A., Demšar, J., Mangili, F., Zaffalon, M. (2017). Statistical comparison of classifiers through Bayesian hierarchical modelling.
*Machine Learning***106**(11), 1817–1837. - Miranda, E., Zaffalon, M. (2017). Full conglomerability.
*Journal of Statistical Theory and Practice***11**(4), 634–669. - Miranda, E., Zaffalon, M. (2017). Full conglomerability, continuity and marginal extension. In Ferraro, M. B., Giordani, P., Vantaggi, B., Gagolewski, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2016,
*Soft Methods for Data Science*, Advances in Intelligent and Soft Computing**456**, Springer, 355–362. - Scanagatta, M., Corani, G., Zaffalon, M. (2017). Improved local search in Bayesian networks structure learning. In Hyttinen, A., Malone, B., Suzuki, J. (Eds),
*AMBN 2017*. PMLR**73**, JMLR.org, 45–56. - Zaffalon, M., Miranda, E. (2017). Axiomatising incomplete preferences through sets of desirable gambles.
*Journal of Artificial Intelligence Research***60**, 1057–1126. - Benavoli, A., Facchini, A., Zaffalon, M. (2016). Quantum mechanics: the Bayesian theory generalised to the space of Hermitian matrices.
*Physical Review A***94**, 042106. - Benavoli, A., Facchini, A., Zaffalon, M. (2016). Quantum rational preferences and desirability. In
*Proceedings of the 1st International Workshop on Imperfect Decision Makers: Admitting Real-World Rationality*@ NIPS 2016, PMLR**58**, JMLR.org, 87–96. - De Campos, C. P., Corani, G., Scanagatta, M., Cuccu, M., Zaffalon, M. (2016). Learning extended tree augmented naive structures.
*International Journal of Approximate Reasoning***68**, 153–163. - Miranda, E., Zaffalon, M. (2016). Conformity and independence with coherent lower previsions.
*International Journal of Approximate Reasoning***78**, 125–137. - Rancoita, P. M. V., Zaffalon, M., Zucca, E., Bertoni, F., de Campos, C. P. (2016). Bayesian network data imputation with application to survival tree analysis.
*Computational Statistics and Data Analysis***93**, 373–387. - Scanagatta, M., Corani, G., de Campos, C. P., Zaffalon, M. (2016). Learning treewidth-bounded Bayesian networks with thousands of variables. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., Garnett, R. (Eds),
*NIPS 2016*, 1426–1470. - Scanagatta, M., de Campos, C. P., Corani, G., Zaffalon, M. (2015). Learning Bayesian networks with thousands of variables. In Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., Garnett, R. (Eds),
*NIPS 2015*, Curran Associates, 1855–1863. - Antonucci, A., de Campos, C. P., Huber, D., Zaffalon, M. (2015). Approximate credal network updating by linear programming with applications to decision making.
*International Journal of Approximate Reasoning***58**, 25–38. - Benavoli, A., Mangili, F., Corani, G., Zaffalon, M. (2015). A Bayesian nonparametric procedure for comparing algorithms. In Bach, F. R., Bleim, D. M. (Eds),
*ICML 2015*. PMLR**37**, JMLR.org, 1264–1272. - Benavoli, A., Mangili, F., Ruggeri, F., Zaffalon, M. (2015). Imprecise Dirichlet process with application to the hypothesis test on the probability that
*X*≤*Y*.*Journal of Statistical Theory and Practice***9**(3), 658–684. - Benavoli, A., Zaffalon, M. (2015). Prior near ignorance for inferences in the
*k*-parameter exponential family.*Statistics***49**(5), 1104–1140. - Corani, G., Benavoli, A., Mangili, F., Zaffalon, M. (2015). Bayesian hypothesis testing in machine learning. In Bifet, A., May, M., Zadrozny, B., Gavaldà, R., Pedreschi, D., Bonchi, F., Cardoso, J. S., Spiliopoulou, M. (Eds),
*ECML PKDD 2015*. Lecture notes in Computer Science**9286**, Springer, 199–202. - Mangili, F., Benavoli, A., de Campos C. P., Zaffalon, M. (2015). Reliable survival analysis based on the Dirichlet process.
*Biometrical Journal***57**, 1002–1019. - Miranda, E., Zaffalon, M. (2015). Conformity and independence with coherent lower previsions. In Augustin, T., Doria, S., Miranda, E., Quaeghebeur, E. (Eds),
*ISIPTA '15*. SIPTA, 197–206. - Miranda, E., Zaffalon, M. (2015). Independent products in infinite spaces.
*Journal of Mathematical Analysis and Applications***425**, 460–488. - Miranda, E., Zaffalon, M. (2015). On the problem of computing the conglomerable natural extension.
*International Journal of Approximate Reasoning***56**(A), 1–27. - Antonucci, A., de Campos, C. P., Zaffalon, M. (2014). Probabilistic graphical models. In Augustin, T., Coolen, F., de Cooman, G., Troffaes, M. (Eds),
*Introduction to Imprecise Probabilities*, Wiley, chapter 9. - Benavoli, A., Mangili, F., Corani, G., Zaffalon, M., Ruggeri, F. (2014). A Bayesian Wilcoxon signed-rank test based on the Dirichlet process. In
*ICML 2014*. PMLR**32**, JMLR.org, 1026–1034. - De Campos, C. P., Cuccu, M., Corani, G., Zaffalon, M. (2014). Extended tree augmented naive classifier. In van der Gaag, L., Feelders, A. (Eds),
*PGM 2014*. Lecture Notes in Computer Science**8754**, Springer, 176–189. - Corani, G., Abellán, J., Masegosa, A., Moral, S., Zaffalon, M. (2014). Classification. In Augustin, T., Coolen, F., de Cooman, G., Troffaes, M. (Eds),
*Introduction to Imprecise Probabilities*, Wiley, chapter 10. - Scanagatta, M., de Campos, C. P., Zaffalon, M. (2014). Min-BDeu and max-BDeu scores for learning Bayesian networks. In van der Gaag, L., Feelders, A. (Eds),
*PGM 2014*. Lecture Notes in Computer Science**8754**, Springer, 426–441. - Zaffalon, M., Corani, G. (2014). Comments on "Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks" by Andrés R. Masegosa and Serafín Moral.
*International Journal of Approximate Reasoning***55**(7), 1579–1600. - Antonucci, A., de Campos, C. P., Huber, D., Zaffalon, M. (2013). Approximating credal network inferences by linear programming. In van der Gaag, L. C. (Ed.),
*ECSQARU 2013*. Lecture Notes in Computer Science**7958**, Springer, 13–25.**Best paper award**. - Antonucci, A., Huber, D., Zaffalon, M., Luginbühl, P., Chapman, I., Ladouceur, R. (2013). CREDO: a military decision-support system based on credal networks. In
*FUSION 2013*. - Benavoli, A., Zaffalon, M. (2013). Density-ratio robustness in dynamic state estimation.
*Mechanical Systems and Signal Processing***37**, 54–75. - De Campos, C. P., Rancoita, P. M. V., Kwee, I., Zucca, E., Zaffalon, M., Bertoni, F. (2013). Discovering subgroups of patients from DNA copy number data using NMF on compacted matrices.
*PLoS ONE***8**(11), e79720. - Mauá, D. D., de Campos, C. P., Zaffalon, M. (2013). On the complexity of solving polytree-shaped limited memory influence diagrams with binary variables.
*Artificial Intelligence***205**, 30–38. - Miranda, E., Zaffalon, M. (2013). Computing the conglomerable natural extension. In Cozman, F., Denoeux, T., Destercke, S., Seidenfeld, T. (Eds),
*ISIPTA '13*. SIPTA, 255–264. - Miranda, E., Zaffalon, M. (2013). Conglomerable coherence.
*International Journal of Approximate Reasoning***54**(9), 1322–1350. - Miranda, E., Zaffalon, M. (2013). Conglomerable coherent lower previsions. In Kruse, R. Berthold, M. R., Moewes, C., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2012.
*Synergies of Soft Computing and Statistics for Intelligent Data Analysis*, Advances in Intelligent and Soft Computing**190**, Springer Berlin Heidelberg, 419–427. - Zaffalon, M., Miranda, E. (2013). Probability and time.
*Artificial Intelligence***198**, 1–51. - Benavoli, A., Zaffalon, M. (2012). A model of prior ignorance for inferences in the one-parameter exponential family.
*Journal of Statistical Planning and Inference***142**(7), 1960–1979. - Corani, G., Antonucci, A., Zaffalon, M. (2012). Bayesian networks with imprecise probabilities: theory and application to classification. In Holmes, D. E., Jain, L. C. (Eds),
*Data Mining: Foundations and Intelligent Paradigms (Volume 1: Clustering, Association and Classification)*. Springer-Verlag, Berlin, 49–93. - Mauá, D. D., de Campos, C. P., Zaffalon, M. (2012). Solving limited memory influence diagrams.
*Journal of Artificial Intelligence Research***44**, 97–140. - Mauá, D. D., de Campos, C. P., Zaffalon, M. (2012). The complexity of approximately solving influence diagrams. In de Freitas, N., Murphy, K. P. (Eds),
*UAI 2012*, 604–613. - Mauá, D. D., de Campos, C. P., Zaffalon, M. (2012). Updating credal networks is approximable in polynomial time.
*International Journal of Approximate Reasoning***53**(8), 1183–1199. - Miranda, E., Zaffalon, M., de Cooman, G. (2012). Conglomerable natural extension.
*International Journal of Approximate Reasoning***53**(8), 1200–1227. - Zaffalon, M., Corani, G., Mauá, D. D. (2012). Evaluating credal classifiers by utility-discounted predictive accuracy.
*International Journal of Approximate Reasoning***53**(8), 1282–1301. - Benavoli, A., Zaffalon, M. (2011). A discussion on learning and prior ignorance for sets of priors in
the one-parameter exponential family. In Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),
*ISIPTA '11*. SIPTA, 69–78. - Benavoli, A., Zaffalon, M., Miranda, E. (2011). Robust filtering through coherent lower previsions.
*IEEE Transactions on Automatic Control***56**(7), 1567–1581. - De Cooman, G., Miranda, E., Zaffalon, M. (2011). Independent natural extension.
*Artificial Intelligence***175**, 1911–1950. - Mauá, D. D., de Campos, C. P., Zaffalon, M. (2011). A fully polynomial time approximation scheme for updating credal networks of bounded treewidth and number of variable states. In Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),
*ISIPTA '11*. SIPTA, 277–286. - Miranda, E., Zaffalon, M., de Cooman, G. (2011). Conglomerable natural extension. In Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),
*ISIPTA '11*. SIPTA, 287–296. - Zaffalon, M., Corani, G., Mauá, D. D. (2011). Utility-based accuracy measures to empirically evaluate credal classifiers. In Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),
*ISIPTA '11*. SIPTA, 401–410. - Antonucci, A., Sun, Y., de Campos, C. P., Zaffalon, M. (2010). Generalized loopy 2U: a new algorithm for approximate inference in credal networks.
*International Journal of Approximate Reasoning***51**(5), 474–484. - De Cooman, G., Hermans, F., Antonucci, A., Zaffalon, M. (2010). Epistemic irrelevance in credal nets: the case of imprecise Markov trees.
*International Journal of Approximate Reasoning***51**(9), 1029–1052. - De Cooman, G., Miranda, E., Zaffalon, M. (2010). Factorisation properties of the strong product. In Borgelt, C., González Rodrìguez, G., Trutschnig, W., Asunción Lubiano, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2010,
*Combining Soft Computing and Statistical Methods in Data Analysis*, Advances in Intelligent and Soft Computing**77**, 139–147. - De Cooman, G., Miranda, E., Zaffalon, M. (2010). Independent natural extension. In Hüllermeier, E., Kruse, R., Hoffmann, F. (Eds), IPMU 2010,
*Computational Intelligence for Knowledge-Based Systems Design*. Lecture Notes in Computer Science**6178**, Springer, 737–746. - Miranda, E., Zaffalon, M. (2010). Conditional models: coherence and inference through sequences of joint mass functions.
*Journal of Statistical Planning and Inference***140**(7), 1805–1833. - Miranda, E., Zaffalon, M. (2010). Notes on desirability and conditional lower previsions.
*Annals of Mathematics and Artificial Intelligence.***60**(3–4), 251–309. - Pelessoni, R., Vicig, P., Zaffalon, M. (2010). Inference and risk measurement with the pari-mutuel model.
*International Journal of Approximate Reasoning***51**(9), 1145–1158. - Piatti, A., Antonucci, A., Zaffalon, M. (2010). Building knowledge-based systems by credal networks: a tutorial. In Baswell, A. R. (Ed),
*Advances in Mathematics Research***11**, Nova Science Publishers, New York. - Antonucci, A., Benavoli, A., Zaffalon, M., de Cooman, G., Hermans, F. (2009). Multiple model tracking by imprecise Markov trees. In
*FUSION 2009*. IEEE, 1767–1774. - Antonucci, A., Brühlmann, R., Piatti, A., Zaffalon, M. (2009). Credal networks for military identification problems.
*International Journal of Approximate Reasoning***50**, 666–679. - Benavoli, A., Zaffalon, M., Miranda, E. (2009). Reliable hidden Markov model filtering through coherent lower previsions. In
*FUSION 2009*. IEEE, 1743–1750. - Corani, G., Zaffalon, M. (2009). Lazy naive credal classifier. In
*U'09: Proceedings of the First ACM SIGKDD International Workshop on Knowledge Discovery from Uncertain Data*. ACM, 30–37. - De Cooman, G., Hermans, F., Antonucci, A., Zaffalon, M. (2009). Epistemic irrelevance in credal networks: the case of imprecise Markov trees. In Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),
*ISIPTA '09*. SIPTA, 149–158. - Miranda, E., Zaffalon, M. (2009). Coherence graphs.
*Artificial Intelligence***173**, 104–144. - Miranda, E., Zaffalon, M. (2009). Natural extension as a limit of regular extensions. In Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),
*ISIPTA '09*. SIPTA, 327–336. - Pelessoni, R., Vicig, P., Zaffalon, M. (2009). The pari-mutuel model. In Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),
*ISIPTA '09*. SIPTA, 347–356. - Piatti, A., Zaffalon, M., Trojani, F., Hutter, M. (2009). Limits of learning about a categorical latent variable under prior near-ignorance.
*International Journal of Approximate Reasoning***50**, 597–611. - Zaffalon, M., Miranda, E. (2009). Conservative inference rule for uncertain reasoning under incompleteness.
*Journal of Artificial Intelligence Research***34**, 757–821. - Antonucci, A., Zaffalon, M. (2008). Decision-theoretic specification of credal networks: a unified language for uncertain modeling with sets of Bayesian networks.
*International Journal of Approximate Reasoning***49**(2), 345–361. - Antonucci, A., Zaffalon, M., Sun, Y., de Campos, C. P. (2008). Generalized loopy 2U: a new algorithm for approximate inference in credal networks. In Jaeger, M., Nielsen, T. D. (Eds),
*PGM 2008*. Hirtshals (Denmark), 17–24. - Corani, G., Zaffalon, M. (2008). JNCC2, the implementation of naive credal classifier 2.
*Journal of Machine Learning Research***9**, 2695–2698. - Corani, G., Zaffalon, M. (2008). Credal model averaging: an extension of Bayesian model averaging to imprecise probabilities. In Daelemans, W., Goethals, B., Morik, K. (Eds),
*ECML PKDD 2008*. Lecture notes in Computer Science**5211**, Springer, 257–271. - Corani, G., Zaffalon, M. (2008). Learning reliable classifiers from small or incomplete data sets: the naive credal classifier 2.
*Journal of Machine Learning Research***9**, 581–621. - Corani, G., Zaffalon, M. (2008). Naive credal classifier 2: an extension of naive Bayes for delivering robust classifications. In Stahlbock, R., Crone, S. F., Lessmann, S. (Eds),
*DMIN'08*. CSREA Press, 84–90. - Salvetti, A., Antonucci, A., Zaffalon, M. (2008). Spatially distributed identification of debris flow source areas by credal networks. In Sànchez-Marrè, M., Béjar, J., Comas, J., Rizzoli, A. E., Guariso, G. (Eds),
*iEMSs 2008*, iEMSs, 380–387. - Antonucci, A., Brühlmann, R., Piatti, A., Zaffalon, M. (2007). Credal networks for military identification problems. In de Cooman, G., Vejnarová, J., Zaffalon, M. (Eds),
*ISIPTA '07*. Action M Agency, Prague (Czech Republic), 1–10. - Antonucci, A., Piatti, A., Zaffalon, M. (2007). Credal networks for operational risk measurement and management. In Apolloni, B., Howlett, R. J., Jain, L. C.,
*KES2007*, Lectures Notes in Computer Science**4693**, Springer, 604–611. - Antonucci, A., Salvetti, A., Zaffalon, M. (2007). Credal networks for hazard assessment of debris flows. In Kropp, J., Scheffran, J. (Eds),
*Advanced Methods for Decision Making and Risk Management in Sustainability Science*. Nova Science Publishers, New York. - Antonucci, A., Zaffalon, M. (2007). Fast
algorithms for robust classification with Bayesian nets.
*International Journal of Approximate Reasoning***44**(3), 200–223. - Miranda, E., Zaffalon, M. (2007). Coherence graphs. In de Cooman, G., Vejnarová, J., Zaffalon, M. (Eds),
*ISIPTA '07*. Action M Agency, Prague (Czech Republic), 297–306. - Piatti, A., Zaffalon, M., Trojani, F., Hutter, M. (2007). Learning about a categorical latent variable under prior near-ignorance. In de Cooman, G., Vejnarová, J., Zaffalon, M. (Eds),
*ISIPTA '07*. Action M Agency, Prague (Czech Republic), 357–364. - Vicig, P., Zaffalon, M., Cozman F. G. (2007). Notes on "Notes on conditional previsions".
*International Journal of Approximate Reasoning***44**(3), 358–365. - Antonucci, A., Zaffalon, M. (2006). Equivalence between Bayesian and credal nets on an updating problem. In Lawry, J., Miranda, E., Bugarin, A., Li, S., Gil, M. A., Grzegorzewski, P., Hryniewicz,
O. (Eds), SMPS 2006,
*Soft Methods for Integrated Uncertainty Modelling*, - Antonucci, A., Zaffalon, M. (2006). Locally specified credal networks. In Studený, M., Vomlel, J. (Eds),
*PGM 2006*, Action M Agency, Prague (Czech Republic), 25–34. - Antonucci, A., Zaffalon, M., Ide, J. S., Cozman, F. G. (2006). Binarization algorithms for approximate updating in credal nets. In Penserini, L., Peppas, P., Perini, A. (Eds),
*STAIRS'06*. IOS Press, Amsterdam (Netherlands), 120–131. - Corani, G., Edgar, C., Marshall, I., Wesnes, K., Zaffalon, M. (2006). Classification of dementia types from cognitive profiles data. In
*ECML 2006*. Lecture Notes in Computer Science**4213**, Springer, Berlin, 470–477. - Antonucci, A., Zaffalon, M. (2005). Fast algorithms for robust classification with Bayesian nets. In Cozman, F. G., Nau, R., Seidenfeld, T. (Eds),
*ISIPTA '05*. SIPTA, 11–20. - Hutter, M., Zaffalon, M. (2005). Distribution of mutual information from complete and incomplete data.
*Computational Statistics and Data Analysis***48**(3), 633–657. - Piatti, A., Zaffalon, M., Trojani F. (2005). Limits of learning from imperfect observations under prior ignorance: the case of the imprecise Dirichlet model. In Cozman, F. G., Nau, R., Seidenfeld,
T. (Eds),
*ISIPTA '05*. SIPTA, 276–286. - Zaffalon, M. (2005). Conservative rules for predictive inference with incomplete data. In Cozman, F. G., Nau, R., Seidenfeld, T. (Eds),
*ISIPTA '05*. SIPTA, 406–415. - Zaffalon, M. (2005). Credible classification for environmental problems.
*Environmental Modelling & Software***20**(8), 1003–1012. - Zaffalon, M., Hutter, M. (2005). Robust inference of trees.
*Annals of Mathematics and Artificial Intelligence***45**(1–2), 215–239. - Antonucci, A., Salvetti, A., Zaffalon, M. (2004). Assessing debris flow hazard by credal nets. In López-Díaz, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O., Lawry, J. (Eds), SMPS 2004,
*Soft Methodology and Random Information Systems*, Springer - Antonucci, A., Salvetti, A., Zaffalon, M. (2004). Hazard assessment of debris flows by credal networks. In Pahl-Wostl, C., Schmidt, S., Rizzoli, A. E., Jakeman, A. J. (Eds),
*iEMSs 2004*, iEMSs 98–103. - De Cooman, G., Zaffalon, M. (2004). Updating beliefs with incomplete observations.
*Artificial Intelligence***159**(1–2), 75–125. - De Cooman, G., Zaffalon, M. (2003). Updating with incomplete observations. In Kjærulff, U., Meek, C. (Eds),
*UAI 2003*. Morgan Kaufmann, 142–150. - Hutter, M., Zaffalon, M. (2003). Bayesian treatment of incomplete discrete data applied to mutual information and feature selection. In Günter, A., Kruse, R., Neumann, B.,
*KI-2003*. Lecture Notes in Computer Science**2821**, Springer, Heidelberg, 396–406. - Zaffalon, M., Fagiuoli, E. (2003). Tree-based credal networks for classification.
*Reliable computing***9**(6), 487–509. - Zaffalon, M., Wesnes, K., Petrini, O. (2003). Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data.
*Artificial Intelligence in Medicine***29**(1–2), 61–79. - Zaffalon, M. (2002). Exact credal treatment of missing data.
*Journal of Statistical Planning and Inference***105**(1), 105–122. - Zaffalon, M. (2002). The naive credal classifier.
*Journal of Statistical Planning and Inference***105**(1), 5–21. - Zaffalon, M., Hutter, M. (2002). Robust feature selection by mutual information distributions. In Darwiche, A., Friedman, N. (Eds),
*UAI 2002*. Morgan Kaufmann, 577–584. - Gambardella, L. M., Mastrolilli, M., Rizzoli, A. E., Zaffalon, M. (2001). An optimization methodology for intermodal terminal management.
*Journal of Intelligent Manufacturing***12**, 521–534. - Zaffalon, M. (2001). Robust discovery of tree-dependency structures. In de Cooman, G., Fine, T., Seidenfeld, T. (Eds),
*ISIPTA '01*. Shaker Publishing, The Netherlands, 394–403. - Zaffalon, M. (2001). Statistical inference of the naive credal classifier. In de Cooman, G., Fine, T., Seidenfeld, T. (Eds),
*ISIPTA '01*. Shaker Publishing, The Netherlands, 384–393. - Zaffalon, M., Wesnes, K., Petrini, O. (2001). Credal classification for dementia screening. In Quaglini, S., Barahona P., Andreassen, S. (Eds),
*AIME '01*. Lecture Notes in Computer Science**2101**, Springer, 67–76. - Fagiuoli, E., Zaffalon, M. (2000). Tree-augmented naive credal classifiers.
*IPMU 2000*. Universidad Politécnica de Madrid, Spain, 1320–1327. - Zaffalon, M. (1999). A credal approach to naive classification.
*ISIPTA '99*, The Imprecise Probabilities Project, Universiteit Gent, Belgium, 405–414. - Fagiuoli, E., Zaffalon, M. (1998). 2U: an exact interval propagation algorithm for polytrees with binary variables.
*Artificial Intelligence***106**(1), 77–107. - Fagiuoli, E., Zaffalon, M. (1998). A note about redundancy in influence diagrams.
*International Journal of Approximate Reasoning***19**(3–4), 231–246. - Gambardella, L. M., Rizzoli, A. E., Zaffalon, M. (1998). Simulation and planning of an intermodal container terminal.
*Simulation***71**(2), 107–116. - Darby-Dowman, K., Little, J., Mitra, G., Zaffalon, M. (1997). Constraint logic programming and integer programming approaches and their collaboration in solving an assignment scheduling problem.
*Constraints***1**(3), 245–264.

- De Cooman, G., Vejnarová, J., Zaffalon, M. (Eds) (2007),
*ISIPTA '07,*Action M Agency, Prague, Czech Republic.

- Bernard, J.-M., Seidenfeld, T., Zaffalon, M. (Eds) (2003),
*ISIPTA '03,*Proceedings in Informatics 18, Carleton Scientific, Canada.

- De Cooman, G., Zaffalon, M. (Eds) (2005),
*Annals of Mathematics and Artificial Intelligence***45**(1–2), special issue: selected papers from ISIPTA '01. - Zaffalon, M. (Ed) (2005),
*International Journal of Approximate Reasoning***39**(2–3), special issue: selected papers from ISIPTA '03.

**On the limitations of zero-shot classification of causal relations by LLMs
**Authors: Vani Kanjirangat, Alessandro Antonucci and Marco Zaffalon.

Year: 2024.

Abstract: We aim to explore and analyze the capabilities and limitations of the large language models in understanding and distinguishing causal sentences under a zero-shot setting. We experiment on a multi-class dataset of direct causal, conditional causal, and correlational sentences. In the experiments, the GPT and Falcon models are validated against a fine-tuned BERT model under different settings to explore zero-shot capabilities in causality detection. Zero-shot approaches exhibit good performance in other classification tasks, such as sentiment analysis or question answering. Yet, for this task, the fine-tuned approach seems superior, and the situation does not change if language cues are added or a few-shot setting is considered. This is a preliminary analysis of a work in progress. Still, the results suggest that identifying causal relations is a particularly challenging task that is hard to address in a zero-shot setup.

Published in Campos, R., Jorge, A. M., Jatowt, A., Bhatia, S., Litvak, M. (Eds),

A version similar to the published paper can be downloaded.

**Zero-shot causal graph extrapolation from text via LLMs
**Authors: Alessandro Antonucci, Gregorio Piqué and Marco Zaffalon.

Year: 2024.

Abstract: We evaluate the ability of large language models (LLMs) to infer causal relations from natural language. Compared to traditional natural language processing and deep learning techniques, LLMs show competitive performance in a benchmark of pairwise relations without needing (explicit) training samples. This motivates us to extend our approach to extrapolating causal graphs through iterated pairwise queries. We perform a preliminary analysis on a benchmark of biomedical abstracts with ground-truth causal graphs validated by experts. The results are promising and support the adoption of LLMs for such a crucial step in causal inference, especially in medical domains, where the amount of scientific text to analyse might be huge, and the causal statements are often implicit.

Published at

A version similar to the published paper can be downloaded.

**Tractable bounding of counterfactual queries by knowledge compilation
**Authors: David Huber, Yi Chen, Alessandro Antonucci, Adnan Darwiche and Marco Zaffalon.

Year: 2023.

Abstract: We discuss the problem of bounding partially identifiable queries, such as counterfactuals, in Pearlian structural causal models. A recently proposed iterated EM scheme yields an inner approximation of those bounds by sampling the initialisation parameters. Such a method requires multiple (Bayesian network) queries over models sharing the same structural equations and topology, but different exogenous probabilities. This setup makes a compilation of the underlying model to an arithmetic circuit advantageous, thus inducing a sizeable inferential speed-up. We show how a single symbolic knowledge compilation allows us to obtain the circuit structure with symbolic parameters to be replaced by their actual values when computing the different queries. We also discuss parallelisation techniques to further speed up the bound computation. Experiments against standard Bayesian network inference show clear computational advantages with up to an order of magnitude of speed-up.

Published at

A version similar to the published paper can be downloaded.

**Efficient computation of counterfactual bounds
**Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber abd Dario Azzimonti.

Year: 2024.

Abstract: We assume to be given structural equations over discrete variables inducing a directed acyclic graph, namely, a structural causal model, together with data about its internal nodes. The question we want to answer is how we can compute bounds for partially identifiable counterfactual queries from such an input. We start by giving a map from structural casual models to credal networks. This allows us to compute exact counterfactual bounds via algorithms for credal nets on a subclass of structural causal models. Exact computation is going to be inefficient in general given that, as we show, causal inference is NP-hard even on polytrees. We target then approximate bounds via a causal EM scheme. We evaluate their accuracy by providing credible intervals on the quality of the approximation; we show through a synthetic benchmark that the EM scheme delivers accurate results in a fair number of runs. In the course of the discussion, we also point out what seems to be a neglected limitation to the trending idea that counterfactual bounds can be computed without knowledge of the structural equations. We also present a real case study on palliative care to show how our algorithms can readily be used for practical purposes.

Published in

A version similar to the published paper can be downloaded.

**Approximating counterfactual bounds while fusing observational, biased and randomised data sources
**Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas and David Huber.

Year: 2023.

Abstract: We address the problem of integrating data from multiple, possibly biased, observational and interventional studies, to eventually compute counterfactuals in structural causal models. We start from the case of a single observational dataset affected by a selection bias. We show that the likelihood of the available data has no local maxima. This enables us to use the causal expectation-maximisation scheme to approximate the bounds for partially identifiable counterfactual queries, which are the focus of this paper. We then show how the same approach can address the general case of multiple datasets, no matter whether interventional or observational, biased or unbiased, by remapping it into the former one via graphical transformations. Systematic numerical experiments and a case study on palliative care show the effectiveness of our approach, while hinting at the benefits of fusing heterogeneous data sources to get informative outcomes in case of partial identifiability.

Published in

A version similar to the published paper can be downloaded.

**Solving the Allais paradox by counterfactual harm
**Authors: Marco Zaffalon, Alessandro Antonucci and Oleg Szher.

Year: 2023.

Abstract: We show that by framing Allais paradox as a causal problem, without loss of generality, we can solve the paradox using the recently proposed notion of counterfactual harm.

Published in Quaeghebeur, E., Miranda, E., Montes, I., Vantaggi, B. (Eds),

A version similar to the published paper can be downloaded.

**Closure operators, classifiers and desirability
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2023.

Abstract: At the core of Bayesian probability theory, or dually desirability theory, lies an assumption of linearity of the scale in which rewards are measured. We revisit two recent papers that extended desirability theory to the nonlinear case by letting the utility scale be represented either by a general closure operator or by a binary general (nonlinear) classifier. By using standard results in logic, we highlight the connection between these two approaches and show that this connection allows us to extend the separating hyperplane theorem (which is at the core of the duality between Bayesian decision theory and desirability theory) to the nonlinear case.

Published in Quaeghebeur, E., Miranda, E., Montes, I., Vantaggi, B. (Eds),

A version similar to the published paper can be downloaded.

**Correlated product of experts for sparse Gaussian process regression
**Authors: Manuel Schürch, Dario Azzimonti, Alessio Benavoli and Marco Zaffalon.

Year: 2023.

Abstract: Gaussian processes (GPs) are an important tool in machine learning and statistics. However, off-the-shelf GP inference procedures are limited to datasets with several thousand data points because of their cubic computational complexity. For this reason, many sparse GPs techniques have been developed over the past years. In this paper, we focus on GP regression tasks and propose a new approach based on aggregating predictions from several local and correlated experts. Thereby, the degree of correlation between the experts can vary between independent up to fully correlated experts. The individual predictions of the experts are aggregated taking into account their correlation resulting in consistent uncertainty estimates. Our method recovers independent Product of Experts, sparse GP and full GP in the limiting cases. The presented framework can deal with a general kernel function and multiple variables, and has a time and space complexity which is linear in the number of experts and data samples, which makes our approach highly scalable. We demonstrate superior performance, in a time vs. accuracy sense, of our proposed method against state-of-the-art GP approximations for synthetic as well as several real-world datasets with deterministic and stochastic optimization.

Published in

A version similar to the published paper can be downloaded.

**Nonlinear desirability theory
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2023.

Abstract: Desirability can be understood as an extension of Anscombe and Aumann's Bayesian decision theory to sets of expected utilities. At the core of desirability lies an assumption of linearity of the scale in which rewards are measured. It is a traditional assumption used to derive the expected utility model, which clashes with a general representation of rational decision making, though. Allais has, in particular, pointed this out in 1953 with his famous paradox. We note that the utility scale plays the role of a closure operator when we regard desirability as a logical theory. This observation enables us to extend desirability to the nonlinear case by letting the utility scale be represented via a general closure operator. The new theory directly expresses rewards in actual nonlinear currency (money), much in Savage's spirit, while arguably weakening the founding assumptions to a minimum. We characterise the main properties of the new theory both from the perspective of sets of gambles and of their lower and upper prices (previsions). We show how Allais paradox finds a solution in the new theory, and discuss the role of sets of probabilities in the theory.

Published in

A version similar to the published paper can be downloaded.

**Bounding counterfactuals under selection bias
**Authors: Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber and Dario Azzimonti.

Year: 2022.

Abstract: Causal analysis may be affected by selection bias, which is defined as the systematic exclusion of data from a certain subpopulation. Previous work in this area focused on the derivation of identifiability conditions. We propose instead a first algorithm to address both identifiable and unidentifiable queries. We prove that, in spite of the missingness induced by the selection bias, the likelihood of the available data is unimodal. This enables us to use the causal expectation-maximisation scheme to obtain the values of causal queries in the identifiable case, and to compute bounds otherwise. Experiments demonstrate the approach to be practically viable. Theoretical convergence characterisations are provided.

Published in Salmerón, A., Rumí, R. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Approximating counterfactual bounds while fusing observational, biased and randomised data sources.

**Quantum indistinguishability through exchangeability
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2022.

Abstract: Two particles are identical if all their intrinsic properties, such as spin and charge, are the same, meaning that no quantum experiment can distinguish them. In addition to the well known principles of quantum mechanics, understanding systems of identical particles requires a new postulate, the so called

Published in

A version similar to the published paper can be downloaded.

**Nonlinear desirability as a linear classification problem
**Authors: Arianna Casanova, Alessio Benavoli and Marco Zaffalon.

Year: 2023.

Abstract: This paper presents an interpretation as

Published in

A version similar to the published paper can be downloaded.

**Information algebras in the theory of imprecise probabilities, an extension
**Authors: Arianna Casanova, Jürg Kohlas and Marco Zaffalon.

Year: 2022.

Abstract: In recent works, we have shown how to construct an information algebra of coherent sets of gambles, considering firstly a particular model to represent questions, called the

Published in

A version similar to the published paper can be downloaded.

**Information algebras in the theory of imprecise probabilities
**Authors: Arianna Casanova, Jürg Kohlas and Marco Zaffalon.

Year: 2022.

Abstract: In this paper we create a bridge between desirability and information algebras: we show how coherent sets of gambles, as well as coherent lower previsions, induce such structures. This allows us to enforce the view of such imprecise-probability objects as algebraic and logical structures; moreover, it enforces the interpretation of probability as information, and gives tools to manipulate them as such.

Published in

A version similar to the published paper can be downloaded.

**Causal expectation-maximisation
**Authors: Marco Zaffalon, Alessandro Antonucci and Rafael Cabañas.

Year: 2021.

Abstract: Structural causal models are the basic modelling unit in Pearl's causal theory; in principle they allow us to solve counterfactuals, which are at the top rung of the ladder of causation. But they often contain latent variables that limit their application to special settings. This appears to be a consequence of the fact, proven in this paper, that causal inference is NP-hard even in models characterised by polytree-shaped graphs. To deal with such a hardness, we introduce the causal EM algorithm. Its primary aim is to reconstruct the uncertainty about the latent variables from data about categorical manifest variables. Counterfactual inference is then addressed via standard algorithms for Bayesian networks. The result is a general method to approximately compute counterfactuals, be they identifiable or not (in which case we deliver bounds). We show empirically, as well as by deriving credible intervals, that the approximation we provide becomes accurate in a fair number of EM runs. These results lead us finally to argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.

Published at

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Efficient computation of counterfactual bounds.

**The weirdness theorem and the origin of quantum paradoxes
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2021.

Abstract: We argue that there is a simple, unique, reason for all quantum paradoxes, and that such a reason is not uniquely related to quantum theory. It is rather a mathematical question that arises at the intersection of logic, probability, and computation. We give our 'weirdness theorem' that characterises the conditions under which the weirdness will show up. It shows that whenever logic has bounds due to the algorithmic nature of its tasks, then weirdness arises in the special form of negative probabilities or non-classical evaluation functionals. Weirdness is not logical inconsistency, however. It is only the expression of the clash between an unbounded and a bounded view of computation in logic. We discuss the implication of these results for quantum mechanics, arguing in particular that its interpretation should ultimately be computational rather than exclusively physical. We develop in addition a probabilistic theory in the real numbers that exhibits the phenomenon of entanglement, thus concretely showing that the latter is not specific to quantum mechanics.

Published in

A version similar to the published paper can be downloaded.

**Time series forecasting with Gaussian processes needs priors
**Authors: Giorgio Corani, Alessio Benavoli and Marco Zaffalon.

Year: 2021.

Abstract: Automatic forecasting is the task of receiving a time series and returning a forecast for the next time steps without any human intervention. Gaussian Processes (GPs) are a powerful tool for modeling time series, but so far there are no competitive approaches for automatic forecasting based on GPs. We propose practical solutions to two problems: automatic selection of the optimal kernel and reliable estimation of the hyperparameters. We propose a fixed composition of kernels, which contains the components needed to model most time series: linear trend, periodic patterns, and other flexible kernel for modeling the non-linear trend. Not all components are necessary to model each time series; during training the unnecessary components are automatically made irrelevant via automatic relevance determination (ARD). We moreover assign priors to the hyperparameters, in order to keep the inference within a plausible range; we design such priors through an empirical Bayes approach. We present results on many time series of different types; our GP model is more accurate than state-of-the-art time series models. Thanks to the priors, a single restart is enough the estimate the hyperparameters; hence the model is also fast to train.

Published in

A version similar to the published paper can be downloaded.

**Algebras of sets and coherent sets of gambles
**Authors: Arianna Casanova, Jürg Kohlas and Marco Zaffalon.

Year: 2021.

Abstract: In a recent work we have shown how to construct an information algebra of coherent sets of gambles defined on general possibility spaces. Here we analyze the connection of such an algebra with the

Published in Vejnarová, J., Wilson, N. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Information algebras in the theory of imprecise probabilities, an extension.

**Quantum indistinguishability through exchangeable desirable gambles
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2021.

Abstract: Two particles are identical if all their intrinsic properties, such as spin and charge, are the same, meaning that no quantum experiment can distinguish them. In addition to the well-known principles of quantum mechanics, understanding systems of identical particles requires a new postulate, the so-called

Published in de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Quantum indistinguishability through exchangeability.

**Nonlinear desirability as a linear classification problem
**Authors: Arianna Casanova, Alessio Benavoli and Marco Zaffalon.

Year: 2021.

Abstract: The present paper proposes a generalization of linearity axioms of coherence through a geometrical approach, which leads to an alternative interpretation of desirability as a

Published in de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Nonlinear desirability as a linear classification problem.

**Information algebras of coherent sets of gambles in general possibility spaces
**Authors: Jüerg Kohlas, Arianna Casanova and Marco Zaffalon.

Year: 2021.

Abstract: The present paper proposes a generalization of linearity axioms of coherence through a geometrical approach, which leads to an alternative interpretation of desirability as a

Published in de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Information algebras in the theory of imprecise probabilities, an extension.

**The sure thing
**Authors: Marco Zaffalon and Enrique Miranda.

Year: 2021.

Abstract: If we prefer action

Published in de Bock, J., Cano, A., Miranda, E., Moral, S. (Eds),

A version similar to the published paper can be downloaded.

**Joint desirability foundations of social choice and opinion pooling
**Authors: Arianna Casanova, Enrique Miranda and Marco Zaffalon.

Year: 2021.

Abstract: We develop joint foundations for the fields of social choice and opinion pooling using coherent sets of desirable gambles, a general uncertainty model that allows to encompass both complete and incomplete preferences. This leads on the one hand to a new perspective of traditional results of social choice (in particular Arrow's theorem as well as sufficient conditions for the existence of an oligarchy and democracy) and on the other hand to using the same framework to analyse opinion pooling. In particular, we argue that weak Pareto (unanimity) should be given the status of a rationality requirement and use this to discuss the aggregation of experts' opinions based on probability and (state-independent) utility, showing some inherent limitation of this framework, with implications for statistics. The connection between our results and earlier work in the literature is also discussed.

Published in

A version similar to the published paper can be downloaded.

**Impact on place of death in cancer patients: a causal exploration in southern Switzerland
**Authors: Heidi Kern, Giorgio Corani, David Huber, Nicola Vermes, Marco Zaffalon, Marco Varini, Claudia Wenzel and André Fringer.

Year: 2020.

Abstract: Most terminally ill cancer patients prefer to die at home, but a majority die in institutional settings. Research questions about this discrepancy have not been fully answered. This study applies artificial intelligence and machine learning techniques to explore the complex network of factors and the cause-effect relationships affecting the place of death, with the ultimate aim of developing policies favouring home-based end-of-life care.

Published in

The published paper can be downloaded.

**CREDICI: a Java library for causal inference by credal networks
**Authors: Rafael Cabañas, Alessandro Antonucci, David Huber and Marco Zaffalon.

Year: 2020.

Abstract: We present CREDICI, a Java open-source tool for causal inference based on credal networks. Credal networks are an extension of Bayesian networks where local probability mass functions are only constrained to belong to given, so-called credal, sets. CREDICI is based on the recent work of Zaffalon et al. (2020), where an equivalence between Pearl's structural causal models and credal networks has been derived. This allows to reduce a counterfactual query in a causal model to a standard query in a credal network, even in the case of unidentifiable causal effects. The necessary transformations and data structures are implemented in CREDICI, while inferences are eventually computed by CREMA (Huber et al., 2020), a twin library for general credal network inference. Here we discuss the main implementation challenges and possible outlooks.

Published in Jaeger, M., Nielsen, T. D. (Eds),

A version similar to the published paper can be downloaded.

**CREMA: a Java library for credal network inference
**Authors: David Huber, Rafael Cabañas, Alessandro Antonucci and Marco Zaffalon.

Year: 2020.

Abstract: We present CREMA (credal models algorithms), a Java library for inference in credal networks. These models are analogous to Bayesian networks, but their local parameters are only constrained to vary in, so-called

Published in Jaeger, M., Nielsen, T. D. (Eds),

A version similar to the published paper can be downloaded.

**Structural causal models are (solvable by) credal networks
**Authors: Marco Zaffalon, Alessandro Antonucci and Rafael Cabañas.

Year: 2020.

Abstract: A structural causal model is made of endogenous (manifest) and exogenous (latent) variables. We show that endogenous observations induce linear constraints on the probabilities of the exogenous variables. This allows to

Published in Jaeger, M., Nielsen, T. D. (Eds),

A version similar to the published paper can be downloaded.

**Recursive estimation for sparse Gaussian process regression
**Authors: Manuel Schürch, Dario Azzimonti, Alessio Benavoli and Marco Zaffalon.

Year: 2020.

Abstract: Gaussian Processes (GPs) are powerful kernelized methods for non-parameteric regression used in many applications. However, their use is limited to a few thousand of training samples due to their cubic time complexity. In order to scale GPs to larger datasets, several sparse approximations based on so-called inducing points have been proposed in the literature. In this work we investigate the connection between a general class of sparse inducing point GP regression methods and Bayesian recursive estimation which enables Kalman filter like updating for online learning. The majority of previous work has focused on the batch setting, in particular for learning the model parameters and the position of the inducing points, here instead we focus on training with mini-batches. By exploiting the Kalman filter formulation, we propose a novel approach that estimates such parameters by recursively propagating the analytical gradients of the posterior over mini-batches of the data. Compared to state-of-the-art methods, our method keeps analytic updates for the mean and covariance of the posterior, thus reducing drastically the size of the optimization problem. We show that our method achieves faster convergence and superior performance compared to state of the art sequential Gaussian Process regression on synthetic GP as well as real-world data with up to a million of data samples.

Published in

A version similar to the published paper can be downloaded.

**Probabilistic reconciliation of hierarchical forecast via Bayes' rule
**Authors: Giorgio Corani, Dario Azzimoni, João Augusto and Marco Zaffalon.

Year: 2020.

Abstract: We present a novel approach for reconciling hierarchical forecasts, based on Bayes' rule. We define a prior distribution for the bottom time series of the hierarchy, based on the bottom base forecasts. Then we update their distribution via Bayes' rule, based on the base forecasts for the upper time series. Under the Gaussian assumption, we derive the updating in closed-form. We derive two algorithms, which differ as for the assumed independencies. We discuss their relation with the MinT reconciliation algorithm and with the Kalman filter, and we compare them experimentally.

Published in Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (Eds),

A version similar to the published paper can be downloaded.

**Compatibility, desirability, and the running intersection property
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2020.

Abstract:

Published in

A version similar to the published paper can be downloaded.

**Social pooling of beliefs and values with desirability
**Authors: Arianna Casanova, Enrique Miranda and Marco Zaffalon.

Year: 2020.

Abstract: The problem of aggregating beliefs and values of rational subjects is treated with the formalism of sets of desirable gambles. This leads on the one hand to a new perspective of traditional results of social choice (in particular Arrow's theorem as well as sufficient conditions for the existence of an oligarchy and democracy) and on the other hand to use the same framework to create connections with opinion pooling. In particular, we show that weak Pareto can be derived as a coherence requirement and discuss the aggregation of state independent beliefs.

Published in

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Joint desirability foundations of social choice and opinion pooling.

**Sampling subgraphs with guaranteed treewidth for accurate and efficient graphical inference
**Authors: Jaemin Yoo, U Kang, Mauro Scanagatta, Giorgio Corani and Marco Zaffalon.

Year: 2020.

Abstract: How can we run graphical inference on large graphs efficiently and accurately? Many real-world networks are modeled as graphical models, and graphical inference is fundamental to understand the properties of those networks. In this work, we propose a novel approach for fast and accurate inference, which first samples a small sub-graph and then runs inference over the subgraph instead of the given graph. This is done by the bounded tree-width (BTW) sampling, our novel algorithm that generates a subgraph with guaranteed bounded tree-width while retaining as many edges as possible. We first analyze the properties of BTW theoretically. Then, we evaluate our approach on node classification and compare it with the baseline which is to run loopy belief propagation (LBP) on the original graph. Our approach can be coupled with various inference algorithms: it shows higher accuracy up to 13.7% with the junction tree algorithm, and allows faster inference up to 23.8 times with LBP. We further compare BTW with previous graph sampling algorithms and show that it gives the best accuracy.

Published in Caverlee, J., Hu, X., Lalmas, M., Wang, W. (Eds),

A version similar to the published paper can be downloaded.

**Bernstein's socks, polynomial-time provable coherence and entanglement
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2019.

Abstract: We recently introduced a bounded rationality approach for the theory of desirable gambles. It is based on the unique requirement that being nonnegative for a gamble has to be defined so that it can be provable in polynomial time. In this paper we continue to investigate properties of this class of models. In particular we verify that the space of Bernstein polynomials in which nonnegativity is specified by the Krivine-Vasilescu certificate is yet another instance of this theory. As a consequence, we show how it is possible to construct in it a thought experiment uncovering entanglement with classical (hence non quantum) coins.

Published in de Bock, J., de Campos, C., de Cooman, G., Quaeghebeur, E, Wheeler, G. (Eds),

A version similar to the published paper can be downloaded.

**Hierarchical estimation of parameters in Bayesian networks
**Authors: Laura Azzimonti, Giorgio Corani and Marco Zaffalon.

Year: 2019.

Abstract: A novel approach for parameter estimation in Bayesian networks is presented. The main idea is to introduce a hyper-prior in the Multinomial-Dirichlet model, traditionally used for conditional distribution estimation in Bayesian networks. The resulting hierarchical model jointly estimates different conditional distributions belonging to the same conditional probability table, thus

Published in

A version similar to the published paper can be downloaded.

**Sum-of-squares for bounded rationality
**Authors: Alessio Benavoli, Alessandro Facchini, Dario Piga and Marco Zaffalon.

Year: 2019.

Abstract: In the gambling foundation of probability theory, rationality requires that a subject should always (never) find desirable all nonnegative (negative) gambles, because no matter the result of the experiment the subject never (always) decreases her money. Evaluating the nonnegativity of a gamble in infinite spaces is a difficult task. In fact, even if we restrict the gambles to be polynomials in ℝ

Published in

A version similar to the published paper can be downloaded.

**Desirability foundations of robust rational decision making
**Authors: Marco Zaffalon and Enrique Miranda.

Year: 2021 (published online in 2018).

Abstract: Recent work has formally linked the traditional axiomatisation of incomplete preferences à la Anscombe-Aumann with the theory of desirability developed in the context of imprecise probability, by showing in particular that they are the very same theory. The equivalence has been established under the constraint that the set of possible prizes is finite. In this paper, we relax such a constraint, thus de facto creating one of the most general theories of rationality and decision making available today. We provide the theory with a sound interpretation and with basic notions, and results, for the separation of beliefs and values, and for the case of complete preferences. Moreover, we discuss the role of conglomerability for the presented theory, arguing that it should be a rationality requirement under very broad conditions.

Published in

A version similar to the published paper can be downloaded.

**Compatibility, coherence and the RIP
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2019.

Abstract: We generalise the classical result on the compatibility of marginal, possible non-disjoint, assessments in terms of the running intersection property to the imprecise case, where our beliefs are modelled in terms of sets of desirable gambles. We consider the case where we have unconditional and conditional assessments, and show that the problem can be simplified via a tree decomposition.

Published in Destercke, S., Denoeux, T., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2018,

A version similar to the published paper can be downloaded.

**Entropy-based pruning for learning Bayesian networks using BIC
**Authors: Cassio P. de Campos, Mauro Scanagatta, Giorgio Corani and Marco Zaffalon.

Year: 2018.

Abstract: For decomposable score-based structure learning of Bayesian networks, existing approaches first compute a collection of candidate parent sets for each variable and then optimize over this collection by choosing one parent set for each variable without creating directed cycles while maximizing the total score. We target the task of constructing the collection of candidate parent sets when the score of choice is the Bayesian Information Criterion (BIC). We provide new non-trivial results that can be used to prune the search space of candidate parent sets of each node. We analyze how these new results relate to previous ideas in the literature both theoretically and empirically. We show in experiments with UCI data sets that gains can be significant. Since the new pruning rules are easy to implement and have low computational costs, they can be promptly integrated into all state-of-the-art methods for structure learning of Bayesian networks.

Published in

A version similar to the published paper can be downloaded.

**Approximate structure learning for large Bayesian networks
**Authors: Mauro Scanagatta, Giorgio Corani, Cassio P. de Campos and Marco Zaffalon.

Year: 2018.

Abstract: Learning a Bayesian networks with bounded treewidth is important for reducing the complexity of the inferences. We present a novel anytime algorithm (k-MAX) method for this task, which scales up to thousands of variables. Through extensive experiments we show that it consistently yields higher-scoring structures than its competitors on complete data sets. We then consider the problem of structure learning from incomplete data sets. This can be addressed by structural EM, which however is computationally very demanding. We thus adopt the novel k-MAX algorithm in the maximization step of structural EM, obtaining an efficient computation of the expected sufficient statistics. We test the resulting structural EM method on the task of imputing missing data, comparing it against the state-of-the-art approach based on random forests. Our approach achieves the same imputation accuracy of the competitors, but in about one tenth of the time. Furthermore we show that it has worst-case complexity linear in the input size, and that it is easily parallelizable.

Published in

A version similar to the published paper can be downloaded.

**Efficient learning of bounded-treewidth Bayesian networks from complete and incomplete data sets
**Authors: Mauro Scanagatta, Giorgio Corani, Marco Zaffalon, Jaemin Yoo and U Kang.

Year: 2018.

Abstract: Learning a Bayesian networks with bounded treewidth is important for reducing the complexity of the inferences. We present a novel anytime algorithm (k-MAX) method for this task, which scales up to thousands of variables. Through extensive experiments we show that it consistently yields higher-scoring structures than its competitors on complete data sets. We then consider the problem of structure learning from incomplete data sets. This can be addressed by structural EM, which however is computationally very demanding. We thus adopt the novel k-MAX algorithm in the maximization step of structural EM, obtaining an efficient computation of the expected sufficient statistics. We test the resulting structural EM method on the task of imputing missing data, comparing it against the state-of-the-art approach based on random forests. Our approach achieves the same imputation accuracy of the competitors, but in about one tenth of the time. Furthermore we show that it has worst-case complexity linear in the input size, and that it is easily parallelizable.

Published in

A version similar to the published paper can be downloaded.

**Axiomatising incomplete preferences through sets of desirable gambles
**Authors: Marco Zaffalon and Enrique Miranda.

Year: 2017.

Abstract: We establish the equivalence of two very general theories: the first is the decision-theoretic formalisation of incomplete preferences based on the mixture independence axiom; the second is the theory of coherent sets of desirable gambles (bounded variables) developed in the context of imprecise probability and extended here to vector-valued gambles. Such an equivalence allows us to analyse the theory of incomplete preferences from the point of view of desirability. Among other things, this leads us to uncover an unexpected and clarifying relation: that the notion of

Published in

The published paper can be downloaded.

**Hierarchical multinomial-Dirichlet model for the estimation of conditional probability tables
**Authors: Laura Azzimonti, Giorgio Corani and Marco Zaffalon.

Year: 2017.

Abstract: We present a novel approach for estimating conditional probability tables, based on a joint, rather than independent, estimate of the conditional distributions belonging to the same table. We derive exact analytical expressions for the estimators and we analyse their properties both analytically and via simulation. We then apply this method to the estimation of parameters in a Bayesian network. Given the structure of the network, the proposed approach better estimates the joint distribution and significantly improves the classification performance with respect to traditional approaches.

Published in Raghavan, V., Aluru, S., Karypis, G., Miele, L., Wu, X. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Hierarchical estimation of parameters in Bayesian networks.

**Improved local search in Bayesian networks structure learning
**Authors: Mauro Scanagatta, Giorgio Corani and Marco Zaffalon.

Year: 2017.

Abstract: We present a novel approach for score-based structure learning of Bayesian network, which couples an existing ordering-based algorithm for structure optimization with a novel operator for exploring the neighborhood of a given order in the space of the orderings. Our approach achieves state-of-the-art performances in data sets containing thousands of variables.

Published in Hyttinen, A., Malone, B., Suzuki, J. (Eds),

A version similar to the published paper can be downloaded.

**Time for a change: a tutorial for comparing multiple classifiers through Bayesian analysis
**Authors: Alessio Benavoli, Giorgio Corani, Janez Demšar and Marco Zaffalon.

Year: 2017.

Abstract: The machine learning community adopted the use of null hypothesis significance testing (NHST) in order to ensure the statistical validity of results. Many scientific fields however realized the shortcomings of frequentist reasoning and in the most radical cases even banned its use in publications. We should do the same: just as we have embraced the Bayesian paradigm in the development of new machine learning methods, so we should also use it in the analysis of our own results. We argue for abandonment of NHST by exposing its fallacies and, more importantly, offer better—more sound and useful—alternatives for it.

Published in

The published paper can be downloaded.

**SOS for bounded rationality
**Authors: Alessio Benavoli, Alessandro Facchini, Dario Piga and Marco Zaffalon.

Year: 2017.

Abstract: In the gambling foundation of probability theory, rationality requires that a subject should always (never) find desirable all nonnegative (negative) gambles, because no matter the result of the experiment the subject never (always) decreases her money. Evaluating the nonnegativity of a gamble in infinite spaces is a difficult task. In fact, even if we restrict the gambles to be polynomials in

Published in Antonucci, A., Corani, G., Couso, I., Destercke, S. (Eds),

A version similar to the published paper can be downloaded.

**A polarity theory for sets of desirable gambles
**Authors: Alessio Benavoli, Alessandro Facchini, Marco Zaffalon and José Vicente-Pérez.

Year: 2017.

Abstract: Coherent sets of almost desirable gambles and credal sets are known to be equivalent models. That is, there exists a bijection between the two collections of sets preserving the usual operations, e.g. conditioning. Such a correspondence is based on the polarity theory for closed convex cones. Learning from this simple observation, in this paper we introduce a new (lexicographic) polarity theory for general convex cones and then we apply it in order to establish an analogous correspondence between coherent sets of desirable gambles and convex sets of lexicographic probabilities.

Published in Antonucci, A., Corani, G., Couso, I., Destercke, S. (Eds),

A version similar to the published paper can be downloaded.

**Statistical comparison of classifiers through Bayesian hierarchical modelling
**Authors: Giorgio Corani, Alessio Benavoli, Janez Demšar, Francesca Mangili and Marco Zaffalon.

Year: 2017.

Abstract: Usually one compares the accuracy of two competing classifiers using null hypothesis significance tests. Yet such tests suffer from important shortcomings, which can be overcome by switching to Bayesian hypothesis testing. We propose a Bayesian hierarchical model that jointly analyzes the cross-validation results obtained by two classifiers on multiple data sets. The model estimates more accurately the difference between classifiers on the individual data sets than the traditional approach of averaging, independently on each data set, the cross-validation results. It does so by jointly analyzing the results obtained on all data sets, and applying shrinkage to the estimates. The model eventually returns the posterior probability of the accuracies of the two classifiers being practically equivalent or significantly different.

Published in

A version similar to the published paper can be downloaded.

**A Gleason-type theorem for any dimension based on a gambling formulation of quantum mechanics
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2017.

Abstract: Based on a gambling formulation of quantum mechanics, we derive a Gleason-type theorem that holds for any dimension

Published in

A version similar to the published paper can be downloaded.

**Full conglomerability
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2017.

Abstract: We do a thorough mathematical study of the notion of full conglomerability, that is, conglomerability with respect to all the partitions of an infinite possibility space, in the sense considered by Peter Walley (1991). We consider both the cases of precise and imprecise probability (sets of probabilities). We establish relations between conglomerability and countable additivity, continuity, super-additivity and marginal extension. Moreover, we discuss the special case where a model is conglomerable with respect to a subset of all the partitions, and try to sort out the different notions of conglomerability present in the literature. We conclude that countable additivity, which is routinely used to impose full conglomerability in the precise case, appears to be the most well-behaved way to do so in the imprecise case as well by taking envelopes of countably additive probabilities. Moreover, we characterise these envelopes by means of a number of necessary and sufficient conditions.

Published in

A version similar to the published paper can be downloaded.

**Quantum rational preferences and desirability
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2016.

Abstract: We develop a theory of quantum rational decision making in the tradition of Anscombe and Aumann's axiomatisation of preferences on horse lotteries. It is essentially the Bayesian decision theory generalised to the space of Hermitian matrices. Among other things, this leads us to give a representation theorem showing that quantum complete rational preferences are obtained by means of expected utility considerations.

Published in

A version similar to the published paper can be downloaded.

**Learning treewidth-bounded Bayesian networks with thousands of variables
**Authors: Mauro Scanagatta, Giorgio Corani, Cassio P. de Campos and Marco Zaffalon.

Year: 2016.

Abstract: We present a method for learning treewidth-bounded Bayesian networks from data sets containing thousands of variables. Bounding the treewidth of a Bayesian network greatly reduces the complexity of inferences. Yet, being a global property of the graph, it considerably increases the difficulty of the learning process. Our novel algorithm accomplishes this task, scaling both to large domains and to large treewidths. Our novel approach consistently outperforms the state of the art on experiments with up to thousands of variables.

Published in Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., Garnett, R. (Eds),

A version similar to the published paper can be downloaded.

**Quantum mechanics: the Bayesian theory generalised to the space of Hermitian matrices
**Authors: Alessio Benavoli, Alessandro Facchini and Marco Zaffalon.

Year: 2016.

Abstract: We consider the problem of gambling on a quantum experiment and enforce rational behaviour by a few rules. These rules yield, in the classical case, the Bayesian theory of probability via duality theorems. In our quantum setting, they yield the Bayesian theory generalised to the space of Hermitian matrices. This very theory is quantum mechanics: in fact, we derive all its four postulates from the generalised Bayesian theory. This implies that quantum mechanics is self-consistent. It also leads us to reinterpret the main operations in quantum mechanics as probability rules: Bayes' rule (measurement), marginalisation (partial tracing), independence (tensor product). To say it with a slogan, we obtain that quantum mechanics is the Bayesian theory in the complex numbers.

Published in

A version similar to the published paper can be downloaded.

**Full conglomerability, continuity and marginal extension
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2017.

Abstract: We investigate fully conglomerable coherent lower previsions in the sense of Walley, and some particular cases of interest: envelopes of fully conglomerable linear previsions, envelopes of countably additive linear previsions and fully disintegrable linear previsions. We study the connections with continuity and countable super-additivity, and show that full conglomerability can be characterised in terms of a supremum of marginal extension models.

Published in Ferraro, M. B., Giordani, P., Vantaggi, B., Gagolewski, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2016,

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Full conglomerability.

**Conformity and independence with coherent lower previsions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2016.

Abstract: We define the conformity of marginal and conditional models with a joint model within Walley's theory of coherent lower previsions. Loosely speaking, conformity means that the joint can reproduce the marginal and conditional models we started from. By studying conformity with and without additional assumptions of epistemic irrelevance and independence, we establish connections with a number of prominent models in Walley's theory: the marginal extension, the irrelevant natural extension, the independent natural extension and the strong product.

Published in

A version similar to the published paper can be downloaded.

**Learning Bayesian networks with thousands of variables
**Authors: Mauro Scanagatta, Cassio P. de Campos, Giorgio Corani and Marco Zaffalon.

Year: 2015.

Abstract: We present a method for learning Bayesian networks from data sets containing thousands of variables without the need for structure constraints. Our approach is made of two parts. The first is a novel algorithm that effectively explores the space of possible parent sets of a node. It guides the exploration towards the most promising parent sets on the basis of an approximated score function that is computed in constant time. The second part is an improvement of an existing ordering-based algorithm for structure optimization. The new algorithm provably achieves a higher score compared to its original formulation. Our novel approach consistently outperforms the state of the art on very large data sets.

Published in Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M., Garnett, R. (Eds),

A version similar to the published paper can be downloaded.

**Bayesian hypothesis testing in machine learning
**Authors: Giorgio Corani, Alessio Benavoli, Francesca Mangili and Marco Zaffalon.

Year: 2015.

Abstract: Most hypothesis testing in machine learning is done using the frequentist null-hypothesis significance test, which has severe drawbacks. We review recent Bayesian tests which overcome the drawbacks of the frequentist ones.

Published in Bifet, A., May, M., Zadrozny, B., Gavaldà, R., Pedreschi, D., Bonchi, F., Cardoso, J. S., Spiliopoulou, M. (Eds),

A version similar to the published paper can be downloaded.

**Conformity and independence with coherent lower previsions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2015.

Abstract: We study the conformity of marginal unconditional and conditional models with a joint model under assumptions of epistemic irrelevance and independence, within Walley's theory of coherent lower previsions. By doing so, we make a link with a number of prominent models within this theory: the marginal extension, the irrelevant natural extension, the independent natural extension and the strong product.

Published in Augustin, T., Doria, S., Miranda, E., Quaeghebeur, E. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Conformity and independence with coherent lower previsions.

**Reliable survival analysis based on the Dirichlet process
**Authors: Francesca Mangili, Alessio Benavoli, Cassio Polpo de Campos and Marco Zaffalon.

Year: 2015.

Abstract: We present a robust Dirichlet process for estimating survival functions from samples with right-censored data. It adopts a prior near-ignorance approach to avoid almost any assumption about the distribution of the population lifetimes, as well as the need of eliciting an infinite dimensional parameter (in case of lack of prior information), as it happens with the usual Dirichlet process prior. We show how such model can be used to derive robust inferences from right-censored lifetime data. Robustness is due to the identification of the decisions that are prior-dependent, and can be interpreted as an analysis of sensitivity with respect to the hypothetical inclusion of fictitious new samples in the data. In particular, we derive a nonparametric estimator of the survival probability and a hypothesis test about the probability that the lifetime of an individual from one population is shorter than the lifetime of an individual from another. We evaluate these ideas on simulated data and on the Australian AIDS survival dataset. The methods are publicly available through an easy-to-use R package.

Published in

A version similar to the published paper can be downloaded.

**A Bayesian nonparametric procedure for comparing algorithms
**Authors: Alessio Benavoli, Francesca Mangili, Giorgio Corani and Marco Zaffalon.

Year: 2015.

Abstract: A fundamental task in machine learning is to compare the performance of multiple algorithms. This is usually performed by the frequentist Friedman test followed by multiple comparisons. This implies dealing with the well-known shortcomings of null hypothesis significance tests. We propose a Bayesian approach to overcome these problems. We provide three main contributions. First, we propose a nonparametric Bayesian version of the Friedman test using a Dirichlet process (DP) based prior. We show that, from a Bayesian perspective, the Friedman test is an inference for a multivariate mean based on an ellipsoid inclusion test. Second, we derive a

Published in Bach, F. R., Bleim, D. M. (Eds),

A version similar to the published paper can be downloaded.

**Learning extended tree augmented naive structures
**Authors: Cassio Polpo de Campos, Giorgio Corani, Mauro Scanagatta, Marco Cuccu and Marco Zaffalon.

Year: 2016.

Abstract: This work proposes an extended version of the well-known tree-augmented naive Bayes (TAN) classifier where the structure learning step is performed without requiring features to be connected to the class. Based on a modification of Edmonds' algorithm, our structure learning procedure explores a superset of the structures that are considered by TAN, yet achieves global optimality of the learning score function in a very efficient way (quadratic in the number of features, the same complexity as learning TANs). We enhance our procedure with a new score function that only takes into account arcs that are relevant to predict the class, as well as an optimization over the equivalent sample size during learning. These ideas may be useful for structure learning of Bayesian networks in general. A range of experiments show that we obtain models with better prediction accuracy than Naive Bayes and TAN, and comparable to the accuracy of the state-of-the-art classifier averaged one-dependence estimator (AODE). We release our implementation of ETAN so that it can be easily installed and run within Weka.

Published in

A version similar to the published paper can be downloaded.

**Bayesian network data imputation with application to survival tree analysis
**Authors: Paola Maria Vittoria Rancoita, Marco Zaffalon, Emanuele Zucca, Francesco Bertoni and Cassio Polpo de Campos.

Year: 2016.

Abstract: Retrospective clinical datasets are often characterized by a relatively small sample size and many missing data. In this case, a common way for handling the missingness consists in discarding from the analysis patients with missing covariates, further reducing the sample size. Alternatively, if the mechanism that generated the missing allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and allowing methods to deal with complete data later on. Moreover, methodologies for data imputation might depend on the particular purpose and might achieve better results by considering specific characteristics of the domain. The problem of missing data treatment is studied in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem by using surrogate splits, that is, splitting rules that use other variables yielding similar results to the original ones. Instead, our methodology consists in modeling the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, thus allowing the survival tree to be applied on the completed dataset. The Bayesian network is directly learned from the incomplete data using a structural expectation-maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is due to the EM formulation itself. On both simulated and real data, our proposed methodology usually outperformed several existing methods for data imputation and the imputation so obtained improved the stratification estimated by the survival tree (especially with respect to using surrogate splits).

Published in

A version similar to the published paper can be downloaded.

**Independent products in infinite spaces
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2015.

Abstract: Probabilistic independence, intended as the mutual irrelevance of given variables, can be solidly founded on a notion of self-consistency of an uncertainty model, in particular when probabilities go imprecise. There is nothing in this approach that prevents it from being adopted in very general setups, and yet it has mostly been detailed for variables taking finitely many values. In this mathematical study, we complement previous research by exploring the extent to which such an approach can be generalised. We focus in particular on the independent products of two variables. We characterise the main notions, including some of factorisation and productivity, in the general case where both spaces can be infinite and show that, however, there are situations—even in the case of precise probability—where no independent product exists. This is not the case as soon as at least one space is finite. We study in depth this case at the frontiers of good-behaviour detailing the relations among the most important notions; we show for instance that being an independent product is equivalent to a certain productivity condition. Then we step back to the general case: we give conditions for the existence of independent products and study ways to get around its inherent limitations.

Published in

A version similar to the published paper can be downloaded.

**Approximate credal network updating by linear programming with applications to decision making
**Authors: Alessandro Antonucci, Cassio Polpo de Campos, David Huber and Marco Zaffalon.

Year: 2015.

Abstract: Credal nets are probabilistic graphical models which extend Bayesian nets to cope with sets of distributions. An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to accurate inferences. A transformation is also derived to reduce decision making in credal networks based on the maximality criterion to updating. The decision task is proved to have the same complexity of standard inference, being NP

Published in

A version similar to the published paper can be downloaded.

**Imprecise Dirichlet process with application to the hypothesis test on the probability that X≤Y**Authors: Alessio Benavoli, Francesca Mangili, Fabrizio Ruggeri and Marco Zaffalon.

Year: 2015.

Abstract: The Dirichlet process (DP) is one of the most popular Bayesian nonparametric models. An open problem with the DP is how to choose its infinite-dimensional parameter (base measure) in case of lack of prior information. In this work we present the Imprecise DP (IDP)—a prior near-ignorance DP-based model that does not require any choice of this probability measure. It consists of a class of DPs obtained by letting the normalized base measure of the DP vary in the set of all probability measures. We discuss the tight connections of this approach with Bayesian robustness and in particular prior near-ignorance modeling via sets of probabilities. We use this model to perform a Bayesian hypothesis test on the probability

Published in

A version similar to the published paper can be downloaded.

**Prior near ignorance for inferences in the k-parameter exponential family
**Authors: Alessio Benavoli and Marco Zaffalon.

Year: 2015.

Abstract: This paper proposes a model of prior ignorance about a multivariate variable based on a set of distributions M. In particular, we discuss four minimal properties that a model of prior ignorance should satisfy: invariance, near-ignorance, learning and convergence. Near-ignorance and invariance ensure that our prior model behaves as a vacuous model with respect to some statistical inferences (e.g., mean, credible intervals, etc.) and some transformation of the parameter space. Learning and convergence ensure that our prior model can learn from data and, in particular, that the influence of M on the posterior inferences vanishes with increasing numbers of observations. We show that these four properties can all be satisfied by a set of conjugate priors in the multivariate exponential families if the set M includes finitely additive probabilities obtained as limits of truncated exponential functions. The obtained set M is a model of prior ignorance with respect to the functions (queries) that are commonly used for statistical inferences and, because of conjugacy, it is tractable and easy to elicit. Applications of the model to some practical statistical problems show the effectiveness of the approach.

Published in

A version similar to the published paper can be downloaded.

**On the problem of computing the conglomerable natural extension
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2015.

Abstract: Embedding conglomerability as a rationality requirement in probability was among the aims of Walley's behavioural theory of coherent lower previsions. However, recent work has shown that this attempt has only been partly successful. If we focus in particular on the extension of given assessments to a rational and conglomerable model (in the least-committal way), we have that the procedure used in Walley's theory, the

Published in

A version similar to the published paper can be downloaded.

**Extended tree augmented naive classifier
**Authors: Cassio P. de Campos, Marco Cuccu, Giorgio Corani and Marco Zaffalon.

Year: 2014.

Abstract: This work proposes an extended version of the well-known tree-augmented naive Bayes (TAN) classifier where the structure learning step is performed without requiring features to be connected to the class. Based on a modification of Edmonds' algorithm, our structure learning procedure explores a superset of the structures that are considered by TAN, yet achieves global optimality of the learning score function in a very efficient way (quadratic in the number of features, the same complexity as learning TANs). A range of experiments show that we obtain models with better accuracy than TAN and comparable to the accuracy of the state-of-the-art classifier averaged one-dependence estimator.

Published in van der Gaag, L., Feelders, A. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Learning extended tree augmented naive structures.

**Min-BDeu and max-BDeu scores for learning Bayesian networks
**Authors: Mauro Scanagatta, Cassio P. de Campos and Marco Zaffalon.

Year: 2014.

Abstract: This work presents two new score functions based on the Bayesian Dirichlet equivalent uniform (BDeu) score for learning Bayesian network structures. They consider the sensitivity of BDeu to varying parameters of the Dirichlet prior. The scores take on the most adversary and the most beneficial priors among those within a contamination set around the symmetric one. We build these scores in such way that they are decomposable and can be computed efficiently. Because of that, they can be integrated into any state-of-the-art structure learning method that explores the space of directed acyclic graphs and allows decomposable scores. Empirical results suggest that our scores outperform the standard BDeu score in terms of the likelihood of unseen data and in terms of edge discovery with respect to the true network, at least when the training sample size is small. We discuss the relation between these new scores and the accuracy of inferred models. Moreover, our new criteria can be used to identify the amount of data after which learning is saturated, that is, additional data are of little help to improve the resulting model.

Published in van der Gaag, L., Feelders, A. (Eds),

A version similar to the published paper can be downloaded.

**A Bayesian Wilcoxon signed-rank test based on the Dirichlet process
**Authors: Alessio Benavoli, Francesca Mangili, Giorgio Corani, Marco Zaffalon and Fabrizio Ruggeri.

Year: 2014.

Abstract: Bayesian methods are ubiquitous in machine learning. Nevertheless, the analysis of empirical results is typically performed by frequentist tests. This implies dealing with null hypothesis significance tests and p-values, even though the shortcomings of such methods are well known. We propose a nonparametric Bayesian version of the Wilcoxon signed-rank test using a Dirichlet process (DP) based prior. We address in two different ways the problem of how to choose the infinite dimensional parameter that characterizes the DP. The proposed test has all the traditional strengths of the Bayesian approach; for instance, unlike the frequentist tests, it allows verifying the null hypothesis, not only rejecting it, and taking decisions which minimize the expected loss. Moreover, one of the solutions proposed to model the infinite-dimensional parameter of the DP allows isolating instances in which the traditional frequentist test is guessing at random. We show results dealing with the comparison of two classifiers using real and simulated data.

Published in

A version similar to the published paper can be downloaded.

**Comments on "Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks" by Andrés R. Masegosa and Serafín Moral
**Authors: Marco Zaffalon and Giorgio Corani.

Year: 2014.

Abstract: We briefly overview the problem of learning probabilities from data using imprecise probability models that express very weak prior beliefs. Then we comment on the new contributions to this question given in the paper by Masegosa and Moral and provide some insights about the performance of their models in data mining experiments of classification.

Published in

A version similar to the published paper can be downloaded.

**Probabilistic graphical models
**Authors: Alessandro Antonucci, Cassio P. de Campos and Marco Zaffalon.

Year: 2014.

Abstract: This report presents probabilistic graphical models that are based on imprecise probabilities using a simplified language. In particular, the discussion is focused on credal networks and discrete domains. It describes the building blocks of credal networks, algorithms to perform inference, and discusses on complexity results and related work. The goal is to have an easy-to-follow introduction to the topic.

Published in Augustin, T., Coolen, F., de Cooman, G., Troffaes, M. (Eds),

A version similar to the published paper can be downloaded.

**Classification
**Authors: Giorgio Corani, Joaquín Abellán, Andrés Masegosa, Serafín Moral and Marco Zaffalon.

Year: 2014.

Abstract: This report presents an introduction to credal classification. The discussion focuses on the naive credal classifier and its extensions as well as on credal classifiers based on classification trees. In addition we discuss the metrics suitable for scoring credal classifiers and present some experiments. The goal is to have an easy-to-follow introduction to the topic.

Published in Augustin, T., Coolen, F., de Cooman, G., Troffaes, M. (Eds),

A version similar to the published paper can be downloaded.

**On the complexity of solving polytree-shaped limited memory influence diagrams with binary variables
**Authors: Denis D. Mauá, Cassio P. de Campos and Marco Zaffalon.

Year: 2013.

Abstract: Influence diagrams are intuitive and concise representations of structured decision problems. When the problem is non-Markovian, an optimal strategy can be exponentially large in the size of the diagram. We can avoid the inherent intractability by constraining the size of admissible strategies, giving rise to limited memory influence diagrams. A valuable question is then how small do strategies need to be to enable efficient optimal planning. Arguably, the smallest strategies one can conceive simply prescribe an action for each time step, without considering past decisions or observations. Previous work has shown that finding such optimal strategies even for polytree-shaped diagrams with ternary variables and a single value node is NP-hard, but the case of binary variables was left open. In this paper we address such a case, by first noting that optimal strategies can be obtained in polynomial time for polytree-shaped diagrams with binary variables and a single value node. We then show that the same problem is NP-hard if the diagram has multiple value nodes. These two results close the fixed-parameter complexity analysis of optimal strategy selection in influence diagrams parametrized by the shape of the diagram, the number of value nodes and the maximum variable cardinality.

Published in

A version similar to the published paper can be downloaded.

**Discovering subgroups of patients from DNA copy number data using NMF on compacted matrices
**Authors: Cassio Polpo de Campos, Paola Maria Vittoria Rancoita, Ivo Kwee, Emanuele Zucca, Marco Zaffalon and Francesco Bertoni.

Year: 2013.

Abstract: In the study of complex genetic diseases, the identification of subgroups of patients sharing similar genetic characteristics represents a challenging task, for example, to improve treatment decision. One type of genetic lesion, frequently investigated in such disorders, is the change of the DNA copy number (CN) at specific genomic traits. Non-negative Matrix Factorization (NMF) is a standard technique to reduce the dimensionality of a data set and to cluster data samples, while keeping its most relevant information in meaningful components. Thus, it can be used to discover subgroups of patients from CN profiles. It is however computationally impractical for very high dimensional data, such as CN microarray data. Deciding the most suitable number of subgroups is also a challenging problem. The aim of this work is to derive a procedure to compact high dimensional data, in order to improve NMF applicability without compromising the quality of the clustering. This is particularly important for analyzing high-resolution microarray data. Many commonly used quality measures, as well as our own measures, are employed to decide the number of subgroups and to assess the quality of the results. Our measures are based on the idea of identifying robust subgroups, inspired by biologically/clinically relevance instead of simply aiming at well-separated clusters. We evaluate our procedure using four real independent data sets. In these data sets, our method was able to find accurate subgroups with individual molecular and clinical features and outperformed the standard NMF in terms of accuracy in the factorization fitness function. Hence, it can be useful for the discovery of subgroups of patients with similar CN profiles in the study of heterogeneous diseases.

Published in

A version similar to the published paper can be downloaded.

**CREDO: a military decision-support system based on credal networks
**Authors: Alessandro Antonucci, David Huber, Marco Zaffalon, Philippe Luginbühl, Ian Chapman and Richard Ladouceur.

Year: 2013.

Abstract: A software tool especially designed for military domains to create and query decision-support systems is presented. Credal networks, which are Bayesian networks whose parameters have the freedom to vary in convex sets, are used to model the relations among the system variables. A novel elicitation procedure of these sets, which allows the military experts to report their knowledge by purely qualitative judgements, is proposed. Two high-level fusion procedures to cope with multiple experts in this framework are also derived. All these features are supported by the software and demonstrated in an application to space security tested during the last NATO multinational experiment.

Published in

A version similar to the published paper can be downloaded.

**Computing the conglomerable natural extension
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2013.

Abstract: Given a coherent lower prevision

Published in Cozman, F., Denoeux, T., Destercke, S., Seidenfeld, T. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is On the problem of computing the conglomerable natural extension.

**Conglomerable coherence
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2013.

Abstract: We contrast Williams' and Walley's theories of coherent lower previsions in the light of conglomerability. These are two of the most credited approaches to a behavioural theory of imprecise probability. Conglomerability is the notion that distinguishes them the most: Williams' theory does not consider it, while Walley aims at embedding it in his theory. This question is important, as conglomerability is a major point of disagreement at the foundations of probability, since it was first defined by de Finetti in 1930. We show that Walley's notion of joint coherence (which is the single axiom of his theory) for conditional lower previsions does not take all the implications of conglomerability into account. Considered also some previous results in the literature, we deduce that Williams' theory should be the one to use when conglomerability is not required; for the opposite case, we define the new theory of conglomerably coherent lower previsions, which is arguably the one to use, and of which Walley's theory can be understood as an approximation. We show that this approximation is exact in two important cases: when all conditioning events have positive lower probability, and when conditioning partitions are nested.

Published in

A version similar to the published paper can be downloaded.

**Approximating credal network inferences by linear programming
**Authors: Alessandro Antonucci, Cassio Polpo de Campos, David Huber and Marco Zaffalon.

Year: 2013.

Abstract: An algorithm for approximate credal network updating is presented. The problem in its general formulation is a multilinear optimization task, which can be linearized by an appropriate rule for fixing all the local models apart from those of a single variable. This simple idea can be iterated and quickly leads to very accurate inferences. The approach can also be specialized to classification with credal networks based on the maximality criterion. A complexity analysis for both the problem and the algorithm is reported together with numerical experiments, which confirm the good performance of the method. While the inner approximation produced by the algorithm gives rise to a classifier which might return a subset of the optimal class set, preliminary empirical results suggest that the accuracy of the optimal class set is seldom affected by the approximate probabilities.

Published in van der Gaag, L. C. (Ed.),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Approximate credal network updating by linear programming with applications to decision making.

**Probability and time
**Authors: Marco Zaffalon and Enrique Miranda.

Year: 2013.

Abstract: Probabilistic reasoning is often attributed a temporal meaning, in which conditioning is regarded as a normative rule to compute future beliefs out of current beliefs and observations. However, the well-established 'updating interpretation' of conditioning is not concerned with beliefs that evolve in time, and in particular with future beliefs. On the other hand, a temporal justification of conditioning was proposed already by De Moivre and Bayes, by requiring that current and future beliefs be consistent. We reconsider the latter approach while dealing with a generalised version of the problem, using a behavioural theory of imprecise probability in the form of coherent lower previsions as well as of coherent sets of desirable gambles, and letting the possibility space be finite or infinite. We obtain that using conditioning is normative, in the imprecise case, only if one establishes future behavioural commitments at the same time of current beliefs. In this case it is also normative that present beliefs be

Published in

A version similar to the published paper can be downloaded.

**Density-ratio robustness in dynamic state estimation
**Authors: Alessio Benavoli and Marco Zaffalon.

Year: 2013.

Abstract: The filtering problem is addressed by taking into account imprecision in the knowledge about the probabilistic relationships involved. Imprecision is modelled in this paper by a particular closed convex set of probabilities that is known with the name of

Published in

A version similar to the published paper can be downloaded.

**The complexity of approximately solving influence diagrams
**Authors: Denis D. Mauá, Cassio P. de Campos and Marco Zaffalon.

Year: 2012.

Abstract: Influence diagrams allow for intuitive and yet precise description of complex situations involving decision making under uncertainty. Unfortunately, most of the problems described by influence diagrams are hard to solve. In this paper we discuss the complexity of approximately solving influence diagrams. We do not assume no-forgetting or regularity, which makes the class of problems we address very broad. Remarkably, we show that when both the treewidth and the cardinality of the variables are bounded the problem admits a fully polynomial-time approximation scheme.

Published in de Freitas, N., Murphy, K. P. (Eds),

A version similar to the published paper can be downloaded.

**Updating credal networks is approximable in polynomial time
**Authors: Denis D. Mauá, Cassio P. de Campos and Marco Zaffalon.

Year: 2012.

Abstract: Credal networks relax the precise probability requirement of Bayesian networks, enabling a richer representation of uncertainty in the form of closed convex sets of probability measures. The increase in expressiveness comes at the expense of higher computational costs. In this paper, we present a new variable elimination algorithm for exactly computing posterior inferences in extensively specified credal networks, which is empirically shown to outperform a state-of-the-art algorithm. The algorithm is then turned into a provably good approximation scheme, that is, a procedure that for any input is guaranteed to return a solution not worse than the optimum by a given factor. Remarkably, we show that when the networks have bounded treewidth and bounded number of states per variable the approximation algorithm runs in time polynomial in the input size and in the inverse of the error factor, thus being the first known fully polynomial-time approximation scheme for inference in credal networks.

Published in

A version similar to the published paper can be downloaded.

**Conglomerable natural extension
**Authors: Enrique Miranda, Marco Zaffalon and Gert de Cooman.

Year: 2012.

Abstract: At the foundations of probability theory lies a question that has been open since de Finetti framed it in 1930: whether or not an uncertainty model should be required to be conglomerable. Conglomerability is related to accepting infinitely many conditional bets. Walley is one of the authors who have argued in favor of conglomerability, while de Finetti rejected the idea. In this paper we study the extension of the conglomerability condition to two types of uncertainty models that are more general than the ones envisaged by de Finetti: sets of desirable gambles and coherent lower previsions. We focus in particular on the weakest (i.e., the least-committal) of those extensions, which we call the

Published in

A version similar to the published paper can be downloaded.

**Evaluating credal classifiers by utility-discounted predictive accuracy
**Authors: Marco Zaffalon, Giorgio Corani and Denis D. Mauá.

Year: 2012.

Abstract: Predictions made by imprecise-probability models are often indeterminate (that is, set-valued). Measuring the quality of an indeterminate prediction by a single number is important to fairly compare different models, but a principled approach to this problem is currently missing. In this paper we derive, from a set of assumptions, a metric to evaluate the predictions of credal classifiers. These are supervised learning models that issue set-valued predictions. The metric turns out to be made of an objective component, and another that is related to the decision-maker's degree of risk aversion to the variability of predictions. We discuss when the measure can be rendered independent of such a degree, and provide insights as to how the comparison of classifiers based on the new measure changes with the number of predictions to be made. Finally, we make extensive empirical tests of credal, as well as precise, classifiers by using the new metric. This shows the practical usefulness of the metric, while yielding a first insightful and extensive comparison of credal classifiers.

Published in

A version similar to the published paper can be downloaded.

**Conglomerable coherent lower previsions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2013.

Abstract: Walley's theory of coherent lower previsions builds upon the former theory by Williams with the explicit aim to make it deal with conglomerability. We show that such a construction has been only partly successful because Walley's founding axiom of joint coherence does not entirely capture the implications of conglomerability. As a way to fully achieve Walley's original aim, we propose then the new theory of

Published in Kruse, R., Berthold, M. R., Moewes, C., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2012.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Conglomerable coherence.

**Solving limited memory influence diagrams
**Authors: Denis D. Mauá, Cassio P. de Campos and Marco Zaffalon.

Year: 2012.

Abstract: We present a new algorithm for exactly solving decision making problems represented as influence diagrams. We do not require the usual assumptions of no forgetting and regularity; this allows us to solve problems with simultaneous decisions and limited information. The algorithm is empirically shown to outperform a state-of-the-art algorithm on randomly generated problems of up to 150 variables and 10

Published in

The published paper can be downloaded.

**A model of prior ignorance for inferences in the one-parameter exponential family
**Authors: Alessio Benavoli and Marco Zaffalon.

Year: 2012.

Abstract: This paper proposes a model of prior ignorance about a scalar variable based on a set of distributions M. In particular, a set of

Published in

A version similar to the published paper can be downloaded.

**Bayesian networks with imprecise probabilities: theory and application to classification
**Authors: Giorgio Corani, Alessandro Antonucci and Marco Zaffalon.

Year: 2012.

Abstract: Bayesian network are powerful probabilistic graphical models for modelling uncertainty. Among others, classification represents an important application: some of the most used classifiers are based on Bayesian networks. Bayesian networks are precise models: exact numeric values should be provided for quantification. This requirement is sometimes too narrow. Sets instead of single distributions can provide a more realistic description in these cases. Bayesian networks can be generalized to cope with sets of distributions. This leads to a novel class of imprecise probabilistic graphical models, called

Published in Holmes, D. E., Jain, L. C. (Eds),

A version similar to the published paper can be downloaded.

**A discussion on learning and prior ignorance for sets of priors in the one-parameter exponential family
**Authors: Alessio Benavoli and Marco Zaffalon.

Year: 2011.

Abstract: For a conjugate likelihood-prior model in the one-parameter exponential family of distributions, we show that, by letting the parameters of the conjugate exponential prior vary in suitable sets, it is possible to define a set of conjugate priors M that guarantees prior near-ignorance without producing vacuous inferences. This result is obtained following both a behavioural and a sensitivity analysis interpretation of prior near-ignorance. We also discuss the problem of the incompatibility of learning and prior near-ignorance for sets of priors in the one-parameter exponential family of distributions in the case of imperfect observations. In particular, we prove that learning and prior near-ignorance are compatible under an imperfect observation mechanism if and only if the support of the priors in M is the whole real axis.

Published in Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is A model of prior ignorance for inferences in the one-parameter exponential family.

**A fully polynomial time approximation scheme for updating credal networks of bounded treewidth and number of variable states
**Authors: Denis D. Mauá, Cassio P. de Campos and Marco Zaffalon.

Year: 2011.

Abstract: Credal networks lift the precise probability assumption of Bayesian networks, enabling a richer representation of uncertainty in the form of closed convex sets of probability measures. The increase in expressiveness comes at the expense of higher computational costs. In this paper we present a new algorithm which is an extension of the well-known variable elimination algorithm for computing posterior inferences in extensively specified credal networks. The algorithm efficiency is empirically shown to outperform a state-of-the-art algorithm. We then provide the first fully polynomial time approximation scheme for inference in credal networks with bounded treewidth and number of states per variable.

Published in Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Updating credal networks is approximable in polynomial time.

**Conglomerable natural extension
**Authors: Enrique Miranda, Marco Zaffalon and Gert de Cooman.

Year: 2011.

Abstract: We study the weakest conglomerable model that is implied by desirability or probability assessments: the

Published in Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Conglomerable natural extension.

**Utility-based accuracy measures to empirically evaluate credal classifiers
**Authors: Marco Zaffalon, Giorgio Corani and Denis D. Mauá.

Year: 2011.

Abstract: Predictions made by imprecise-probability models are often indeterminate (that is, set-valued). Measuring the quality of an indeterminate prediction by a single number is important to fairly compare different models, but a principled approach to this problem is currently missing. In this paper we derive a measure to evaluate the predictions of credal classifiers from a set of assumptions. The measure turns out to be made of an objective component, and another that is related to the decision-maker's degree of risk-aversion. We discuss when the measure can be rendered independent of such a degree, and provide insights as to how the comparison of classifiers based on the new measure change with the number of predictions to be made. Finally, we empirically study the behavior of the proposed measure.

Published in Coolen, F., de Cooman, G., Fetz, T., Oberguggenberger, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Evaluating credal classifiers by utility-discounted predictive accuracy.

**Independent natural extension
**Authors: Gert de Cooman, Enrique Miranda and Marco Zaffalon.

Year: 2011.

Abstract: There is no unique extension of the standard notion of probabilistic independence to the case where probabilities are indeterminate or imprecisely specified.

Published in

A version similar to the published paper can be downloaded.

**Robust filtering through coherent lower previsions
**Authors: Alessio Benavoli, Marco Zaffalon and Enrique Miranda.

Year: 2011.

Abstract: The classical filtering problem is re-examined to take into account imprecision in the knowledge about the probabilistic relationships involved. Imprecision is modeled in this paper by closed convex sets of probabilities. We derive a solution of the state estimation problem under such a framework that is very general: it can deal with any closed convex set of probability distributions used to characterize uncertainty in the prior, likelihood, and state transition models. This is made possible by formulating the theory directly in terms of coherent lower previsions, that is, of the lower envelopes of the expectations obtained from the set of distributions. The general solution is specialized to two particular classes of coherent lower previsions. The first consists of a family of Gaussian distributions whose means are only known to belong to an interval. The second is the so-called linear-vacuous mixture model, which is a family made of convex combinations of a known nominal distribution (e.g., a Gaussian) with arbitrary distributions. For the latter case, we empirically compare the proposed estimator with the Kalman filter. This shows that our solution is more robust to the presence of modelling errors in the system and that, hence, appears to be a more realistic approach than the Kalman filter in such a case.

Published in

A version similar to the published paper can be downloaded.

**Notes on desirability and conditional lower previsions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2010.

Abstract: We detail the relationship between sets of desirable gambles and conditional lower previsions. The former is one the most general models of uncertainty. The latter corresponds to Walley's celebrated theory of imprecise probability. We consider two avenues: when a collection of conditional lower previsions is derived from a set of desirable gambles, and its converse. In either case, we relate the properties of the derived model with those of the originating one. Our results constitute basic tools to move from one formalism to the other, and thus to take advantage of work done in the two fronts.

Published in

A version similar to the published paper can be downloaded.

**Epistemic irrelevance in credal nets: the case of imprecise Markov trees
**Authors: Gert de Cooman, Filip Hermans, Alessandro Antonucci and Marco Zaffalon.

Year: 2010.

Abstract: We focus on credal nets, which are graphical models that generalise Bayesian nets to imprecise probability. We replace the notion of strong independence commonly used in credal nets with the weaker notion of epistemic irrelevance, which is arguably more suited for a behavioural theory of probability. Focusing on directed trees, we show how to combine the given local uncertainty models in the nodes of the graph into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is linear in the number of nodes, is formulated entirely in terms of coherent lower previsions, and is shown to satisfy a number of rationality requirements. We supply examples of the algorithm's operation, and report an application to on-line character recognition that illustrates the advantages of our approach for prediction. We comment on the perspectives, opened by the availability, for the first time, of a truly efficient algorithm based on epistemic irrelevance.

Published in

A version similar to the published paper can be downloaded.

**Inference and risk measurement with the pari-mutuel model
**Authors: Renato Pelessoni, Paolo Vicig and Marco Zaffalon.

Year: 2010.

Abstract: We explore generalizations of the pari-mutuel model (PMM), a formalization of an intuitive way of assessing an upper probability from a precise one. We discuss a naive extension of the PMM considered in insurance, compare the PMM with a related model, the Total Variation Model, and generalize the natural extension of the PMM introduced by P. Walley and other pertained formulae. The results are subsequently given a risk measurement interpretation: in particular it is shown that a known risk measure, Tail Value at Risk (TVaR), is derived from the PMM, and a coherent risk measure more general than TVaR from its imprecise version. We analyze further the conditions for coherence of a related risk measure, Conditional Tail Expectation. Conditioning with the PMM is investigated too, computing its natural extension, characterising its dilation and studying the weaker concept of imprecision increase.

Published in

A version similar to the published paper can be downloaded.

**Factorisation properties of the strong product
**Authors: Gert de Cooman, Enrique Miranda and Marco Zaffalon.

Year: 2010.

Abstract: We investigate a number of factorisation conditions in the framework of sets of probability measures, or coherent lower previsions, with finite referential spaces. We show that the so-called strong product constitutes one way to combine a number of marginal coherent lower previsions into an independent joint lower prevision, and we prove that under some conditions it is the only independent product that satisfies the factorisation conditions.

Published in In Borgelt, C., González Rodrìguez, G., Trutschnig, W., Asunción Lubiano, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2010,

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Independent natural extension.

**Independent natural extension
**Authors: Gert de Cooman, Enrique Miranda and Marco Zaffalon.

Year: 2010.

Abstract: We introduce a general definition for the independence of a number of finite-valued variables, based on coherent lower previsions. Our definition has an epistemic flavour: it arises from personal judgements that a number of variables are irrelevant to one another. We show that a number of already existing notions, such as strong independence, satisfy our definition. Moreover, there always is a least-committal independent model, for which we provide an explicit formula: the

Published in Hüllermeier, E., Kruse, R., Hoffmann, F. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Independent natural extension.

**Conditional models: coherence and inference through sequences of joint mass functions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2010.

Abstract: We call a

Published in

A version similar to the published paper can be downloaded.

**Generalized loopy 2U: a new algorithm for approximate inference in credal networks
**Authors: Alessandro Antonucci, Yi Sun, Cassio Polpo de Campos and Marco Zaffalon.

Year: 2010.

Abstract: Credal networks generalize Bayesian networks by relaxing the requirement of precision of probabilities. Credal networks are considerably more expressive than Bayesian networks, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal networks. The algorithm is based on an important representation result we prove for general credal networks: that any credal network can be equivalently reformulated as a credal network with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal network is then updated by L2U, a loopy approximate algorithm for binary credal networks. Overall, we generalize L2U to non-binary credal networks, obtaining a scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences with respect to other state-of-the-art algorithms is evaluated by extensive numerical tests.

Published in

A version similar to the published paper can be downloaded.

**Building knowledge-based systems by credal networks: a tutorial
**Authors: Alberto Piatti, Alessandro Antonucci and Marco Zaffalon.

Year: 2010.

Abstract:

Published in Baswell, A. R. (Ed),

A version similar to the published paper can be downloaded.

**Lazy naive credal classifier
**Authors: Giorgio Corani and Marco Zaffalon.

Year: 2009.

Abstract: We propose a local (or lazy) version of the

Published in

A version similar to the published paper can be downloaded.

**Multiple model tracking by imprecise Markov trees
**Authors: Alessandro Antonucci, Alessio Benavoli, Marco Zaffalon, Gert de Cooman and Filip Hermans.

Year: 2009.

Abstract: We present a new procedure for tracking manoeuvring objects by hidden Markov chains. It leads to more reliable modelling of the transitions between hidden states compared to similar approaches proposed within the Bayesian framework: we adopt convex sets of probability mass functions rather than single 'precise probability' specifications, in order to provide a more realistic and cautious model of the manoeuvre dynamics. In general, the downside of such increased freedom in the modelling phase is a higher inferential complexity. However, the simple topology of hidden Markov chains allows for efficient tracking of the object through a recently developed belief propagation algorithm. Furthermore, the imprecise specification of the transitions can produce so-called indecision, meaning that more than one model may be suggested by our method as a possible explanation of the target kinematics. In summary, our approach leads to a multiple-model estimator whose performance, investigated through extensive numerical tests, turns out to be more accurate and robust than that of Bayesian ones.

Published in

A version similar to the published paper can be downloaded.

**Reliable hidden Markov model filtering through coherent lower previsions
**Authors: Alessio Benavoli, Marco Zaffalon and Enrique Miranda.

Year: 2009.

Abstract: We extend Hidden Markov Models for continuous variables taking into account imprecision in our knowledge about the probabilistic relationships involved. To achieve that, we consider sets of probabilities, also called coherent lower previsions. In addition to the general formulation, we study in detail a particular case of interest: linear-vacuous mixtures. We also show, in a practical case, that our extension outperforms the Kalman filter when modelling errors are present in the system.

Published in

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Robust filtering through coherent lower previsions.

**Epistemic irrelevance in credal networks: the case of imprecise Markov trees
**Authors: Gert de Cooman, Filip Hermans, Alessandro Antonucci and Marco Zaffalon.

Year: 2009.

Abstract: We replace strong independence in credal networks with the weaker notion of epistemic irrelevance. Focusing on directed trees, we show how to combine local credal sets into a global model, and we use this to construct and justify an exact message-passing algorithm that computes updated beliefs for a variable in the tree. The algorithm, which is essentially linear in the number of nodes, is formulated entirely in terms of coherent lower previsions. We supply examples of the algorithm's operation, and report an application to on-line character recognition that illustrates the advantages of our model for prediction.

Published in Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Epistemic irrelevance in credal nets: the case of imprecise Markov trees.

**Natural extension as a limit of regular extensions
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2009.

Abstract: This paper is devoted to the extension of conditional assessments that satisfy some consistency criteria, such as weak or strong coherence, to further domains. In particular, we characterise the natural extension of a number of conditional lower previsions on finite spaces, by showing that it can be calculated as the limit of a sequence of conditional lower previsions defined by regular extension. Our results are valid for conditional lower previsions with non-linear domains, and allow us to give an equivalent formulation of the notion of coherence in terms of credal sets.

Published in Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Conditional models: coherence and inference through sequences of joint mass functions.

**The pari-mutuel model
**Authors: Renato Pelessoni, Paolo Vicig and Marco Zaffalon.

Year: 2009.

Abstract: We explore generalizations of the pari-mutuel model (PMM), a formalization of an intuitive way of assessing an upper probability from a precise one. We discuss a naive extension of the PMM considered in insurance and generalize the natural extension of the PMM introduced by P. Walley and other related formulae. The results are subsequently given a risk measurement interpretation: in particular it is shown that a known risk measure, Tail Value at Risk (TVaR), is derived from the PMM, and a coherent risk measure more general than TVaR from its imprecise version. We analyze further the conditions for coherence of a related risk measure, Conditional Tail Expectation. Explicit formulae for conditioning the PMM and conditions for dilation or imprecision increase are also supplied and discussed.

Published in Augustin, T., Coolen, F., Moral, S., Troffaes, M. C. M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Inference and risk measurement with the pari-mutuel model.

**Conservative inference rule for uncertain reasoning under incompleteness
**Authors: Marco Zaffalon and Enrique Miranda.

Year: 2009.

Abstract: In this paper we formulate the problem of inference under incomplete information in very general terms. This includes modelling the process responsible for the incompleteness, which we call the

Published in

The published paper can be downloaded.

**Coherence graphs
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2009.

Abstract: We study the consistency of a number of probability distributions, which are allowed to be imprecise. To make the treatment as general as possible, we represent those probabilistic assessments as a collection of

Published in

A version similar to the published paper can be downloaded.

**JNCC2, the implementation of naive credal classifier 2
**Authors: Giorgio Corani and Marco Zaffalon.

Year: 2008.

Abstract: JNCC2 implements the

Published in

The published paper can be downloaded.

The software that implements the naive credal classifier 2 is freely available at the JNCC2 page.

**Credal networks for military identification problems
**Authors: Alessandro Antonucci, Ralph Brühlmann, Alberto Piatti and Marco Zaffalon.

Year: 2009.

Abstract: Credal networks are imprecise probabilistic graphical models generalizing Bayesian networks to convex sets of probability mass functions. This makes credal networks particularly suited to model expert knowledge under very general conditions, including states of qualitative and incomplete knowledge. In this paper, we present a credal network for risk evaluation in case of intrusion of civil aircrafts into a restricted flight area. The different factors relevant for this evaluation, together with an independence structure over them, are initially identified. These factors are observed by sensors, whose reliabilities can be affected by variable external factors, and even by the behaviour of the intruder. A model of these observation processes, and the necessary fusion scheme for the information returned by the sensors measuring the same factor, are both completely embedded into the structure of the credal network. A pool of experts, facilitated in their task by specific techniques to convert qualitative judgements into imprecise probabilistic assessments, has made possible the quantification of the network. We show the capabilities of the proposed model by means of some preliminary tests referred to simulated scenarios. Overall, we can regard this application as a useful tool to support military experts in their decision, but also as a quite general imprecise-probability paradigm for information fusion.

Published in

A version similar to the published paper can be downloaded.

**Limits of learning about a categorical latent variable under prior near-ignorance
**Authors: Alberto Piatti, Marco Zaffalon, Fabio Trojani and Marcus Hutter.

Year: 2009.

Abstract: In this paper, we consider the coherent theory of (epistemic) uncertainty of Walley, in which beliefs are represented through sets of probability distributions, and we focus on the problem of modeling prior ignorance about a categorical random variable. In this setting, it is a known result that a state of prior ignorance is not compatible with learning. To overcome this problem, another state of beliefs, called

Published in

A version similar to the published paper can be downloaded.

**Decision-theoretic specification of credal networks: a unified language for uncertain modeling with sets of Bayesian networks
**Authors: Alessandro Antonucci and Marco Zaffalon.

Year: 2008.

Abstract:

Published in

A version similar to the published paper can be downloaded.

**Generalized loopy 2U: a new algorithm for approximate inference in credal networks
**Authors: Alessandro Antonucci, Marco Zaffalon, Yi Sun and Cassio Polpo de Campos.

Year: 2008.

Abstract: Credal nets generalize Bayesian nets by relaxing the requirement of precision of probabilities. Credal nets are considerably more expressive than Bayesian nets, but this makes belief updating NP-hard even on polytrees. We develop a new efficient algorithm for approximate belief updating in credal nets. The algorithm is based on an important representation result we prove for general credal nets: that any credal net can be equivalently reformulated as a credal net with binary variables; moreover, the transformation, which is considerably more complex than in the Bayesian case, can be implemented in polynomial time. The equivalent binary credal net is updated by L2U, a loopy approximate algorithm for binary credal nets. Thus, we generalize L2U to non-binary credal nets, obtaining an accurate and scalable algorithm for the general case, which is approximate only because of its loopy nature. The accuracy of the inferences is evaluated by empirical tests.

Published in Jaeger, M., Nielsen, T. D. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Generalized loopy 2U: a new algorithm for approximate inference in credal networks.

**Credal model averaging: an extension of Bayesian model averaging to imprecise probabilities
**Authors: Giorgio Corani and Marco Zaffalon.

Year: 2008.

Abstract: We deal with the arbitrariness in the choice of the prior over the models in

Published in Daelemans, W., Goethals, B., Morik, K. (eds),

A version similar to the published paper can be downloaded.

**Spatially distributed identification of debris flow source areas by credal networks
**Authors: Andrea Salvetti, Alessandro Antonucci and Marco Zaffalon.

Year: 2008.

Abstract: Debris flows represent a very destructive natural hazard, affecting buildings, transport infrastructures, and, very often, causing human losses in mountain regions. That makes the identification of potential source areas of debris flows inside a watershed particularly important. In this paper we present a general identification procedure based on the

Published in Sànchez-Marrè, M., Béjar, J., Comas, J., Rizzoli, A. E., Guariso, G. (Eds),

A version similar to the published paper can be downloaded.

**Learning reliable classifiers from small or incomplete data sets: the naive credal classifier 2
**Authors: Giorgio Corani and Marco Zaffalon.

Year: 2008.

Abstract: In this paper, the naive credal classifier, which is a set-valued counterpart of naive Bayes, is extended to a general and flexible treatment of incomplete data, yielding a new classifier called

Published in

The published paper can be downloaded.

The software that implements the naive credal classifier 2 is described in another paper published in *Journal of Machine Learning Research*: JNCC2, the implementation of naive credal classifier 2, and it is freely available at the JNCC2 page.

**Naive credal classifier 2: an extension of naive Bayes for delivering robust classifications
**Authors: Giorgio Corani and Marco Zaffalon.

Year: 2008.

Abstract: Naive credal classifier 2 (NCC2) extends naive Bayes in order to deliver more robust classifications. NCC2 is based on a set of prior densities rather than on a single prior; as a consequence, when faced with instances whose classification is prior-dependent (and therefore might not be reliable), it returns a set of classes (we call this an

Published in Stahlbock, R., Crone, S. F., Lessmann, S. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Learning reliable classifiers from small or incomplete data sets: the naive credal classifier 2.

**Credal networks for operational risk measurement and management
**Authors: Alessandro Antonucci, Alberto Piatti and Marco Zaffalon.

Year: 2007.

Abstract: According to widely accepted guidelines for self-regulation, the capital requirements of a bank should relate to the level of risk with respect to three different categories. Among them,

Published in Apolloni, B., Howlett, R. J., Jain, L. C.,

A version similar to the published paper can be downloaded.

**Credal networks for military identification problems
**Authors: Alessandro Antonucci, Ralph Brühlmann, Alberto Piatti and Marco Zaffalon.

Year: 2007.

Abstract: Credal networks are imprecise probabilistic graphical models generalizing Bayesian networks to convex sets of probability mass functions. This makes credal networks particularly suited to capture and model expert knowledge under very general conditions, including states of qualitative and incomplete knowledge. In this paper, we present a credal network for risk evaluation in case of intrusion of civil aircrafts into a no-fly zone. The different factors relevant for this evaluation, together with an independence structure over them, are initially identified. These factors are observed by sensors, whose reliabilities can be affected by variable external factors, and even by the behavior of the intruder. A model of these observation mechanisms, and the necessary fusion scheme for the information returned by the sensors measuring the same factor, are both completely embedded into the structure of the credal network. A pool of experts, facilitated in their task by specific techniques to convert qualitative judgments into imprecise probabilistic assessments, has made possible the quantification of the network. We show the capabilities of the proposed network by means of some preliminary tests referred to simulated scenarios. Overall, we can regard this application as an useful tool to support military experts in their decision, but also as a quite general imprecise-probability paradigm for information fusion.

Published in de Cooman, G, Vejnarová, J., Zaffalon, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Credal networks for military identification problems.

**Coherence graphs
**Authors: Enrique Miranda and Marco Zaffalon.

Year: 2007.

Abstract: We consider the task of proving Walley's (

Published in de Cooman, G, Vejnarová, J., Zaffalon, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Coherence graphs.

**Learning about a categorical latent variable under prior near-ignorance
**Authors: Alberto Piatti, Marco Zaffalon, Fabio Trojani and Marcus Hutter.

Year: 2007.

Abstract: It is well known that complete prior ignorance is not compatible with learning, at least in a coherent theory of (epistemic) uncertainty. What is less widely known, is that there is a state similar to full ignorance, that Walley calls

Published in de Cooman, G, Vejnarová, J., Zaffalon, M. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Limits of learning about a categorical latent variable under prior near-ignorance.

**Notes on "Notes on conditional previsions"
**Authors: Paolo Vicig, Marco Zaffalon and Fabio G. Cozman.

Year: 2007.

Abstract: These notes comment on Williams' fundamental essay

Published in

A version similar to the published paper can be downloaded.

**Fast
algorithms for robust classification with Bayesian nets
**Authors: Alessandro Antonucci and Marco Zaffalon.

Year: 2007.

Abstract: We focus on a well-known classification task with expert systems based on

Published in

A version similar to the published paper can be downloaded.

**Locally specified credal
networks
**Authors: Alessandro Antonucci and Marco Zaffalon.

Year: 2006.

Abstract: Credal networks are models that extend Bayesian nets to deal with imprecision in probability, and can actually be regarded as sets of Bayesian nets. Evidence suggests that credal nets are a powerful means to represent and deal with many important and challenging problems in uncertain reasoning. We give examples to show that some of these problems can only be modelled by credal nets called

Published in Studený, M., Vomlel, J. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Decision-theoretic specification of credal networks: a unified language for uncertain modeling with sets of Bayesian networks.

**Classification
of dementia types from cognitive profiles data
**Authors: Giorgio Corani, Chris Edgar, Isabelle Marshall, Keith Wesnes and
Marco Zaffalon.

Year: 2006.

Abstract: The Cognitive Drug Research (CDR) system is specifically validated for dementia assessment; it consists of a series of computerized tests, which assess the cognitive faculties of the patient to derive a

Published in

A version similar to the published paper can be downloaded.

**Binarization
algorithms for approximate updating in credal nets
**Authors: Alessandro Antonucci, Marco Zaffalon, Jaime S. Ide and Fabio G. Cozman.

Year: 2006.

Abstract: Credal networks generalize Bayesian networks relaxing numerical parameters. This considerably expands expressivity, but makes belief updating a hard task even on polytrees. Nevertheless, if all the variables are binary, polytree-shaped credal networks can be efficiently updated by the 2U algorithm. In this paper we present a

Published in Penserini, L., Peppas, P., Perini, A. (Eds),

A version similar to the published paper can be downloaded.

**Equivalence
between Bayesian and credal nets on an updating problem
**Authors: Alessandro Antonucci and Marco Zaffalon.

Year: 2006.

Abstract: We establish an intimate connection between Bayesian and credal nets. Bayesian nets are precise graphical models, credal nets extend Bayesian nets to imprecise probability. We focus on traditional belief updating with credal nets, and on the kind of belief updating that arises with Bayesian nets when the reason for the missingness of some of the unobserved variables in the net is unknown. We show that the two updating problems are formally the same.

Published in Lawry, J., Miranda, E., Bugarin, A., Li, S., Gil, M. A., Grzegorzewski, P., Hryniewicz, O. (Eds), SMPS 2006,

A version similar to the published paper can be downloaded.

**Robust inference of trees
**Authors: Marco Zaffalon and Marcus Hutter.

Year: 2005.

Abstract: This paper is concerned with the reliable inference of optimal tree-approximations to the dependency structure of an unknown distribution generating the data. The traditional approach to the problem measures the dependency strength between random variables by the index called

Published in

A version similar to the published paper can be downloaded.

**Fast algorithms for robust classification with Bayesian nets
**Authors: Alessandro Antonucci and Marco Zaffalon.

Year: 2005.

Abstract: We focus on a well-known classification task with expert systems based on

Published in Cozman, F. G., Nau, R., Seidenfeld, T. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Fast algorithms for robust classification with Bayesian nets.

**Limits of learning from imperfect observations under prior ignorance: the case of the imprecise Dirichlet model
**Authors: Alberto Piatti, Marco Zaffalon and Fabio Trojani.

Year: 2005.

Abstract: Consider a relaxed multinomial setup, in which there may be mistakes in observing the outcomes of the process—this is often the case in real applications. What can we say about the next outcome if we start learning about the process in conditions of prior ignorance? To answer this question we extend the imprecise Dirichlet model to the case of imperfect observations and we focus on posterior predictive probabilities for the next outcome. The results are very surprising: the posterior predictive probabilities are vacuous, irrespectively of the amount of observations we do, and however small is the probability of doing mistakes. In other words, the imprecise Dirichlet model cannot help us to learn from data when the observational mechanism is imperfect. This result seems to rise a serious question about the use of the imprecise Dirichlet model for practical applications, and, more generally, about the possibility to learn from imperfect observations under prior ignorance.

Published in Cozman, F. G., Nau, R., Seidenfeld, T. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Limits of learning about a categorical latent variable under prior near-ignorance.

**Conservative
rules for predictive inference with incomplete data
**Author: Marco Zaffalon.

Year: 2005.

Abstract: This paper addresses the following question: how should we update our beliefs after observing some incomplete data, in order to make credible predictions about new, and possibly incomplete, data? There may be several answers to this question according to the model of the process that creates the incompleteness. This paper develops a rigorous modelling framework that makes it clear the conditions that justify the different answers; and, on this basis, it derives a new conditioning rule for predictive inference to be used in a wide range of states of knowledge about the incompleteness process, including near-ignorance, which, surprisingly, does not seem to have received attention so far. Such a case is instead particularly important, as modelling incompleteness processes can be highly impractical, and because there are limitations to statistical inference with incomplete data: it is generally not possible to learn how incompleteness processes work by using the available data; and it may not be possible, as the paper shows, to measure empirically the quality of the predictions. Yet, these depend heavily on the assumptions made.

Published in Cozman, F. G., Nau, R., Seidenfeld, T. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Conservative inference rule for uncertain reasoning under incompleteness.

**Updating beliefs
with incomplete observations
**Authors: Gert de Cooman and Marco Zaffalon.

Year: 2004.

Abstract: Currently, there is renewed interest in the problem, raised by Shafer in 1985, of updating probabilities when observations are incomplete (or set-valued). This is a fundamental problem in general, and of particular interest for Bayesian networks. Recently, Grünwald and Halpern have shown that commonly used updating strategies fail in this case, except under very special assumptions. In this paper we propose a new method for updating probabilities with incomplete observations. Our approach is deliberately conservative: we make no assumptions about the so-called incompleteness mechanism that associates complete with incomplete observations. We model our ignorance about this mechanism by a vacuous lower prevision, a tool from the theory of imprecise probabilities, and we use only coherence arguments to turn prior into posterior (updated) probabilities. In general, this new approach to updating produces lower and upper posterior probabilities and previsions (expectations), as well as partially determinate decisions. This is a logical consequence of the existing ignorance about the incompleteness mechanism. As an example, we use the new updating method to properly address the apparent paradox in the 'Monty Hall' puzzle. More importantly, we apply it to the problem of classification of new evidence in probabilistic expert systems, where it leads to a new, so-called conservative updating rule. In the special case of Bayesian networks constructed using expert knowledge, we provide an exact algorithm to compare classes based on our updating rule, which has linear-time complexity for a class of networks wider than polytrees. This result is then extended to the more general framework of credal networks, where computations are often much harder than with Bayesian nets. Using an example, we show that our rule appears to provide a solid basis for reliable updating with incomplete observations, when no strong assumptions about the incompleteness mechanism are justified.

Published in

A version similar to the published paper can be downloaded.

This paper has been reviewed by M. F. McNally in Computing Reviews **46**(12),
2005.

**Assessing debris flow hazard by credal nets
**Authors: Alessandro Antonucci, Andrea Salvetti and Marco Zaffalon.

Year: 2004.

Published in López-Díaz, M., Gil, M. A., Grzegorzewski, P., Hryniewicz, O., Lawry, J. (Eds), SMPS 2004,

A version similar to the published paper can be downloaded.

**Hazard assessment of debris flows by credal networks
**Authors: Alessandro Antonucci, Andrea Salvetti and Marco Zaffalon.

Year: 2004.

Abstract: Debris flows are destructive natural hazards that affect human life, buildings, and infrastructures. Despite their importance, debris flows are only partially understood, and human expertise still plays a key role for hazard identification. This paper proposes filling the modelling gap by using

Published in Pahl-Wostl, C., Schmidt, S., Rizzoli, A. E., Jakeman, A. J. (Eds),

A version similar to the published paper can be downloaded.

**Credal networks for hazard assessment of debris flows
**Authors: Alessandro Antonucci, Andrea Salvetti and Marco Zaffalon.

Year: 2007.

Published in Kropp, J., Scheffran, J. (Eds),

A version similar to the published paper can be downloaded.

**Distribution of mutual information from complete and incomplete data
**Authors: Marcus Hutter and Marco Zaffalon.

Year: 2005.

Abstract: Mutual information is widely used, in a descriptive way, to measure the stochastic dependence of categorical random variables. In order to address questions such as the reliability of the descriptive value, one must consider sample-to-population inferential approaches. This paper deals with the posterior distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean, and analytical approximations for the variance, skewness and kurtosis are derived. These approximations have a guaranteed accuracy level of the order O(n^{-3}), where n is the sample size. Leading order approximations for the mean and the variance are derived in the case of incomplete samples. The derived analytical expressions allow the distribution of mutual information to be approximated reliably and quickly. In fact, the derived expressions can be computed with the same order of complexity needed for descriptive mutual information. This makes the distribution of mutual information become a concrete alternative to descriptive mutual information in many applications which would benefit from moving to the inductive side. We discuss some of these prospective applications, and for one of them, namely feature selection, we show how inductive mutual information leads to significant improvements in the inferred models.

Published in

A version similar to the published paper can be downloaded.

**Credible classification for environmental problems
**Author: Marco Zaffalon.

Year: 2005.

Abstract: Classifiers that aim at doing reliable predictions should rely on carefully elicited prior knowledge. Often this is not available so they should be able to start learning from data in condition of prior ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Common ways to represent prior ignorance are shown to have an overwhelming weight compared to the information in the data, producing overconfident predictions. This point is crucial for problems, such as environmental ones, where prior knowledge is often scarce and even the data may not be known precisely. Credal classification, and in particular the naive credal classifier, are proposed as more faithful ways to represent states of ignorance. We show that with credal classification, conditions of ignorance may limit the power of the inferences, not the reliability of the predictions.

Published in

A version similar to the published paper can be downloaded.

**Bayesian treatment of incomplete discrete data applied to mutual information and feature selection**

Year: 2003. Abstract: Given the joint chances of a pair of random variables one can compute quantities of interest, like the mutual information. The Bayesian treatment of unknown chances involves computing, from a second order prior distribution and the data likelihood, a posterior distribution of the chances. A common treatment of incomplete data is to assume ignorability and determine the chances by the expectation maximization (EM) algorithm. The two different methods above are well established but typically separated. This paper joins the two approaches in the case of Dirichlet priors, and derives efficient approximations for the mean, mode and the (co)variance of the chances and the mutual information. Furthermore, we prove the unimodality of the posterior distribution, whence the important property of convergence of EM to the global maximum in the chosen framework. These results are applied to the problem of selecting features for incremental learning and naive Bayes classification. A fast filter based on the distribution of mutual information is shown to outperform the traditional filter based on empirical mutual information on a number of incomplete real data sets. Published in In Günter, A., Kruse, R., Neumann, B.,

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Distribution of mutual information from complete and incomplete data.

**Updating with incomplete observations**

Year: 2003. Abstract: Currently, there is renewed interest in the problem, raised by Shafer in 1985, of updating probabilities when observations are incomplete (or set-valued). This is a fundamental problem, and of particular interest for Bayesian networks. Recently, Grünwald and Halpern have shown that commonly used updating strategies fail here, except under very special assumptions. We propose a new rule for updating probabilities with incomplete observations. Our approach is deliberately conservative: we make no or weak assumptions about the so-called incompleteness mechanism that produces incomplete observations. We model our ignorance about this mechanism by a vacuous lower prevision, a tool from the theory of imprecise probabilities, and we derive a new updating rule using coherence arguments. In general, our rule produces lower posterior probabilities, as well as partially determinate decisions. This is a logical consequence of the ignorance about the incompleteness mechanism. We show how the new rule can properly address the apparent paradox in the `Monty Hall' puzzle. In addition, we apply it to the classification of new evidence in Bayesian networks constructed using expert knowledge. We provide an exact algorithm for this task with linear-time complexity, also for multiply connected nets. Published in Kjærulff, U., Meek, C. (Eds),

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Updating beliefs with incomplete observations.

**Tree-based credal networks for classification**

Authors: Marco Zaffalon and Enrico Fagiuoli.

Year: 2003.

Abstract: Bayesian networks are models for uncertain reasoning successfully applied to the data-mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called tree-augmented naive Bayes classifier, to credal sets originated from probability intervals. We formalize the new model, develop an exact linear-time classification algorithm, and evaluate the credal net-based classifier on a number of real data sets. The empirical analysis shows that the new classifier is good and reliable, and raises a problem of overly caution that is discussed in the paper. Overall, given the favorable trade-off between expressiveness and efficient computation, the newly proposed classifier appear to be a good candidate for the wide-scale application of credal networks to real and complex tasks.

Published in *Reliable Computing* **9**(6), 487–509.

A version similar to the published paper can be downloaded.

**Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data**

Authors: Marco Zaffalon, Keith Wesnes and Orlando Petrini.

Year: 2003.

Abstract: Dementia is a serious personal, medical and social problem. Recent research indicates early and accurate diagnoses as the key to effectively cope with it. No definitive cure is available but in some cases when the impairment is still mild the disease can be contained. This paper describes a diagnostic tool that jointly uses the naive credal classifier and the most widely used computerized system of cognitive tests in dementia research, the Cognitive Drug Research system. The naive credal classifier extends the discrete naive Bayes classifier to imprecise probabilities. The naive credal classifier models both prior ignorance and ignorance about the likelihood by sets of probability distributions. This is a new way to deal with small or incomplete data sets that departs significantly from most established classification methods. In the empirical study presented here the naive credal classifier provides reliability and unmatched predictive performance. It delivers up to 95% correct predictions while being very robust with respect to the partial ignorance due to the largely incomplete data. The diagnostic tool also proves to be very effective in discriminating between Alzheimer's disease and Dementia with Lewy Bodies.

Published in *Artificial Intelligence in Medicine* **29**(1–2), 61–79.

A version similar to the published paper can be downloaded.

**Robust feature selection by mutual information distributions**

Authors: Marco Zaffalon and Marcus Hutter.

Year: 2002.

Abstract: Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.

Published in Darwiche, A., Friedman, N. (Eds), *UAI 2002*, Morgan Kaufmann 577–584.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Distribution of mutual information from complete and incomplete data.

**Exact credal treatment of missing data**

Author: Marco Zaffalon.

Year: 2002.

Abstract: This paper proposes an exact, no-assumptions approach to dealing with incomplete sets of multivariate categorical data. An incomplete data set is regarded as a finite collection of complete data sets, and a joint distribution is obtained from each of them, at a descriptive level. The tools to simultaneously treat all the possible joint distributions compatible with an incomplete set of data are given. In particular, a linear description of the set of distributions is formulated, and it is shown that the computation of bounds on the expectation of real-valued functions under such distributions is both possible and efficient, by means of linear programming. Specific algorithms are also developed whose complexity grows linearly in the number of observations. An analysis is then carried out to estimate population probabilities from incomplete multinomial samples. The descriptive tool extends in a straightforward way to the inferential problem by exploiting Walley's imprecise Dirichlet model.

Published in *Journal of Statistical Planning and Inference*** 105**(1), 105–122.

A version similar to the published paper can be downloaded.

Author: Marco Zaffalon.

Year: 2002.

Abstract: Convex sets of probability distributions are also called credal sets. They generalize probability theory by relaxing the requirement that probability values be precise. Classification, i.e. assigning *class* labels to instances described by a set of *attributes*, is an important domain of application of Bayesian methods, where the naive Bayes classifier has a surprisingly good performance. This paper proposes a new method of classification which involves extending the naive Bayes classifier to credal sets. Exact and effective solution procedures for naive credal classification are derived, and the related dominance criteria are discussed. Credal classification appears as a new method, based on more realistic assumptions and in the direction of more reliable inferences.

Published in *Journal of Statistical Planning and Inference* **105**(1), 5–21.

A version similar to the published paper can be downloaded.

**Credal classification for dementia screening**

Authors: Marco Zaffalon, Keith Wesnes and Orlando Petrini.

Year: 2001.

Abstract: Dementia is a very serious personal, medical and social problem. Early and accurate diagnoses seem to be the key to effectively cope with it. This paper presents a diagnostic tool that couples the most widely used computerized system of cognitive tests in dementia research, the Cognitive Drug Research system, with the naive credal classifier. Although the classifier is trained on an incomplete database, it provides unmatched predictive performance and reliability. The tool also proves to be very effective in discriminating between Alzheimer's disease and dementia with Lewy bodies, which is a problem on the frontier of research on dementia.

Published in Quaglini, S., Barahona P., Andreassen, S. (Eds), *AIME '01*. Lecture Notes in Computer Science **2101**, Springer, 67–76.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data.

**Robust discovery of tree-dependency structures**

Author: Marco Zaffalon.

Year: 2001.

Abstract: The problem of inferring dependency structures from random samples is a very fundamental topic in artificial intelligence and statistics. This paper reviews an early result from Chow and Liu on the approximation of unknown multinomial distributions by tree-dependency distributions, at the light of imprecise probabilities. Imprecision, arising here from Walley's imprecise Dirichlet model, generally makes many tree structures be plausible given the data. This paper focuses on the inference of the substructure common to all the possible trees. Such common pattern is a set of reliable dependencies. The problem of identifying the common pattern is abstracted and solved here in the general context of graph algorithms. On this basis, an algorithm is developed that infers reliable dependencies in time O(k^{3}), from a set of k binary random variables, that converge to a tree as the sample grows. The algorithm works by computing bounds on the solutions of global optimization problems. There are a number of reasons why trees are a very important special case of dependence graphs. This work appears as a significant step in the direction of discovering dependency structures under the realistic assumption of imprecise knowledge.

Published in de Cooman, G., Fine, T., Seidenfeld, T. (Eds), *ISIPTA '01*, Shaker Publishing, The Netherlands, 394–403.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Robust inference of trees.

**Statistical inference of the naive credal classifier**

Author: Marco Zaffalon.

Year: 2001.

Abstract: In the wish list of the characteristics of a classifier, there are a reliable approach to small data sets and a clear and robust treatment of incomplete samples. This paper copes with such difficult problems by adopting the paradigm of credal classification. By exploiting Walley's imprecise Dirichlet model, it defines how to infer the naive credal classifier from a possibly incomplete multinomial sample. The derived procedure is exact and linear in the number of attributes. The obtained classifier is robust to small data sets and to all the possible missingness mechanisms. The results of some experimental analyses that compare the naive credal classifier with naive Bayesian models support the presented approach.

Published in de Cooman, G., Fine, T., Seidenfeld, T. (Eds), *ISIPTA '01*, Shaker Publishing, The Netherlands, 384–393.

A version similar to the published paper can be downloaded.

**An optimization methodology for intermodal terminal management**

Authors: Luca Maria Gambardella, Andrea E. Rizzoli, Monaldo Mastrolilli and
Marco Zaffalon.

Year: 2001.

Abstract: A solution to the problems of resource allocation and scheduling of loading and unloading operations in a container terminals is presented. The two problems are formulated and solved hierarchically. First, the solution of the resource allocation problem returns, over a number of work shifts, a set of quay cranes used to load and unload containers from the moored ships and the set of yard cranes to store those containers on the yard. Then, a scheduling problem is formulated to compute the loading and unloading lists of containers for each allocated crane. The feasibility of the solution is verified against a detailed, discrete-event based, simulation model of the terminal. The simulation results show that the optimized resource allocation, which reduce the costs by 40%, can be effectively adopted in combination with the optimized loading and unloading list. Moreover, the simulation shows that the optimized lists reduce the number of crane conflicts on the yard and the average length of the truck queues in the terminal.

Published in *Journal of Intelligent Manufacturing* **12**, 521–534.

A version similar to the published paper can be downloaded.

**Tree-augmented naive credal classifiers**

Authors: Enrico Fagiuoli and Marco Zaffalon.

Year: 2000.

Abstract: Credal classification is a generalization of common classification which allows a coherent treatment of imprecision to be realized. This paper defines a credal classifier by extending the discrete tree-augmented naive Bayes classifiers to sets of probability distributions. A solution algorithm is developed whose computational complexity is linear in the number of features. A method to induce the credal classifier from data is proposed.

Published in *IPMU 2000*, Universidad Politécnica de Madrid, Spain, 1320–1327.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is Tree-based credal networks for classification.

**A credal approach to naive classification**

Author: Marco Zaffalon.

Year: 1999.

Abstract: Convex sets of probability distributions are also called credal sets. They generalize probability theory with special regard to the relaxation of the precision requirement about the probability values. Classification, i.e., assigning class labels to instances described by a set of attributes, is a typical domain of application of Bayesian methods, where the naive Bayesian classifier is considered among the best tools. This paper explores the classification model obtained when the naive Bayesian classifier is extended to credal sets. Exact and effective solution procedures for classification are derived, and the related dominance criteria are discussed. A methodology to induce the classifier from data is proposed.

Published in de Cooman, G., Cozman, F. G., Moral, S., Walley, P. (Eds), *ISIPTA '99*, The Imprecise Probabilities Project, Universiteit Gent, Belgium, 405–414.

A version similar to the published paper can be downloaded.

A revised and updated version of this conference paper is The naive credal classifier.

**2U: an exact interval propagation algorithm for polytrees with binary variables**

Authors: Enrico Fagiuoli and Marco Zaffalon.

Year: 1998.

Abstract: This paper addresses the problem of computing posterior probabilities in a discrete Bayesian network where the conditional distributions of the model belong to convex sets. The computation on a general Bayesian network with convex sets of conditional distributions is formalized as a global optimization problem. It is shown that such a problem can be reduced to a combinatorial problem, suitable to exact algorithmic solutions. An exact propagation algorithm for the updating of a polytree with binary variables is derived. The overall complexity is linear to the size of the network, when the maximum number of parents is fixed.

Published in *Artificial Intelligence* **106**(1), 77–107.

A version similar to the published paper can be downloaded.

**A note about redundancy in influence diagrams**

Authors: Enrico Fagiuoli and Marco Zaffalon.

Year: 1998.

Abstract: Influence diagrams are formal tools for modelling decision processes and for computing optimal strategies under risk. Like Bayesian networks, influence diagrams exploit the sparsity of the dependency relationships among the random variables in order to reduce computational complexity. In this note, we initially observe that an influence diagram can have some arcs that are not necessary for a complete description of the model. We show that while it may not be easy to detect such arcs, it is important, since a redundant graphical structure can exponentially increase the computational time of a solution procedure. Then we define a graphical criterion that is shown to allow the identification and removal of the redundant parts of an influence diagrams. This technical result is significant because it precisely defines what is relevant to know at the time of a decision. Furthermore, it allows a redundant influence diagram to be transformed into another influence diagram, that can be used to compute the optimal policy in an equivalent but more efficient way.

Published in *International Journal of Approximate Reasoning ***19**(3–4),
231–246.

A version similar to the published paper can be downloaded.

**Simulation and planning of an intermodal container terminal**

Authors: Luca Maria Gambardella, Andrea E. Rizzoli and Marco Zaffalon.

Year: 1998.

Abstract: A decision support system for the management of an intermodal container terminal is presented. Among the problems to be solved, there are the spatial allocation of containers on the terminal yard, the allocation of resources and the scheduling of operations in order to maximise a performance function based on some economic indicators. These problems are solved using techniques from optimisation, like job-shop scheduling, genetic algorithms or mixed-integer linear programming. At the terminal, the same problems are usually solved by the terminal manager, only using his/her experience. The manager can trust computer generated solutions only by validating them by means of a simulation model of the terminal. Thus, the simulation tool also becomes a means to introduce new approaches into traditional settings. In the present paper we focus on the resource allocation problem. We describe our modules for the optimisation of the allocation process and for the simulation of the terminal. The former is based on integer linear programming; the latter is a discrete event simulation tool, based on the process-oriented paradigm. The simulator provides a test bed for checking the validity and the robustness of the policy computed by the optimisation module. The case study of the Contship La Spezia Container Terminal, located in the Mediterranean Sea in Italy, is examined.

Published in *Simulation* **71**(2), 107–116.

A version similar to the published paper can be downloaded.

**Constraint logic programming and integer programming approaches and their collaboration in solving an assignment scheduling problem**

Authors: Ken Darby-Dowman, James Little, Gautam Mitra and Marco Zaffalon.

Year: 1997.

Abstract: Generalised assignment problems, traditionally solved by integer programming techniques, are addressed in the light of current constraint programming methods. A scheduling application from manufacturing, based on a modified generalised assignment problem, is used to examine the performance of each technique under a variety of problem characteristics. Experimental evidence showed that, for a set of assignment problems, constraint logic programming performed consistently better than integer programming. Analysis of the constraint logic programming and integer programming processes identified ways in which the search was effective. The insight gained from the analysis led to an integer programming approach with significantly improved performance. Finally, the issue of collaboration between the two contrasting approaches is examined with respect to ways in which the solvers can be combined in an effective manner.

Published in *Constraints* **1**(3), 245–264.

A version similar to the published paper can be downloaded.