# Statistics course for PhD students

## Introduction

Welcome to the Statistics for Physical Scientists short course! It's designed to give researchers, particularly in the physical sciences, some practical background and guidance in applying common statistical tools. The course covers:
• basic summary statistics, probability distributions and data combinations,
• an overview of the frequentist and Bayesian frameworks,
• correlation testing and significance, and sample comparisons,
• hypothesis tests and p-values,
• model-fitting and hypothesis testing using the chi-squared statistic,
• regression analysis (including least-squares and Gaussian processes),
• principal component analysis,
• practical error estimates (jack-knife, bootstrap and Monte Carlo simulations),
• error propagation and the Fisher matrix,
• Bayesian likelihood methods (including MCMC) and model selection.

The full introduction and content summary can be found here.

The course is structured in six classes, as described below, each split into content presentation, worked examples and practical activities using the datasets provided. Each class comes with an accompanying Python Jupyter notebook, which provides summary notes and code for all the worked examples.

## Useful books

The following is an (incomplete!) list of books which contain a great deal of practical wisdom in using statistics:
• Practical Statistics for Astronomers (Wall & Jenkins)
• Statistics for Nuclear and Particle Physicists (Lyons)
• Practical Bayesian Inference: A Primer for Physical Scientists (Bailer-Jones)
• Modern Statistical Methods for Astronomy (Feigelson & Babu)
• Principles of Data Analysis (Sahu)
• Bayesian Logical Data Analysis for the Physical Sciences (Gregory)
• Data Analysis: A Bayesian Tutorial (Sivia)
• Numerical Recipes: The Art of Scientific Computing (Press, Teukolsky, Vetterling, Flannery)

## Class material

### Datasets

Here are the datasets that are used in the worked examples and activities:

### Class 1: Probability and Statistics

Here are the Class 1 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 1.

### Class 2: Correlation Testing

Here are the Class 2 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 2.

### Class 3: Model Fitting

Here are the Class 3 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 3.

### Class 4: Regression

Here are the Class 4 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 4.

### Class 5: Error Estimates

Here are the Class 5 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 5.

### Class 6: Bayesian Methods

Here are the Class 6 content slides as PDF and PowerPoint. Here is the accompanying Python Jupyter notebook for Class 6.