# Statistical modeling: The two cultures

@article{Breiman2001StatisticalMT, title={Statistical modeling: The two cultures}, author={Leo Breiman}, journal={Quality Engineering}, year={2001}, volume={48}, pages={81-82} }

There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems.

#### 1,556 Citations

Statistical Inference After Model Selection

- Computer Science
- 2010

This paper examines a variety of model selection procedures in criminology that are routinely followed by statistical tests and confidence intervals computed for a "final" model, and shows how they are typically misguided.

Big Data is not only about data: The two cultures of modelling

- Computer Science
- 2017

A brief discussion of model-based recursive partitioning which can bridge the theory and data-driven approach to statistical modelling and is an example of how this new approach can help revise models that work for the full dataset.

Discussion Paper

- 2014

The views expressed in this paper are those of the author(s) and do not necessarily reflect the policies of Statistics Netherlands. Data sources referred to as Big data become available for use by

6-2010 Statistical Inference After Model Selection

- 2017

Conventional statistical inference requires that a model of how the data were generated be known before the data are analyzed. Yet in criminology, and in the social sciences more broadly, a variety

Comment on "Statistical Modeling: The Two Cultures" by Leo Breiman

- Mathematics
- 2021

Motivated by Breiman's rousing 2001 paper on the "two cultures" in statistics, we consider the role that different modeling approaches play in causal inference. We discuss the relationship between

Distributional Trees and Forests

- 2017

Obtaining valuable information from given data requires the use of appropriate methods of analysis. For example, if a certain variable of interest is assumed to depend on a (set of) covariate(s),

A problem-solving approach to data analysis for economics

- Sociology
- 2018

Data analysis for formal methods is constrained due to the lengthy dominance of the econometric view within economics. Best practice in statistics suggests a shift in emphasis from making statements

Big data and its epistemology

- Computer Science
- J. Assoc. Inf. Sci. Technol.
- 2015

Whether Big Data, in the form of data‐driven science, will enable the discovery, or appraisal, of universal scientific theories, instrumentalist tools, or inductive inferences is considered.

It takes two to tango: Statistical modeling and machine learning

- Computer Science
- 2021

A scenario is presented showing that when the learning gained from a statistical method is applied to machine learning, the ultimate benefit can be greater than the sum of each method's benefits.

The Causal Nature of Modeling with Big Data

- Computer Science
- 2016

It is shown to lack a pronounced hierarchical, nested structure and the significance of the transition to such "horizontal" modeling is underlined by the concurrent emergence of novel inductive methodology in statistics such as non-parametric statistics.

