Writing and Reviewing Research Papers
Department of Mathematical Sciences, Aalborg University
Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, and presentation of data.
Data is sampled from a population and used to make inferences about the population.
Statistics is a fundamental tool in research.
Statistics is used to summarize data.
It is used to make inferences about populations.
It is used to make informed decisions.
It is used to test hypotheses.
Descriptive statistics is used to summarize data.
It is used to describe the main features of a dataset.
It is used to present data in a meaningful way.
It is used to identify patterns in data.
Mean: Average value of a dataset.
Median: Middle value of a dataset.
Mode: Most frequent value in a dataset.
Range: Difference between the maximum and minimum values.
Variance: Average of the squared differences from the mean (the sample variance divides by n - 1 instead of n).
Standard deviation: Square root of the variance.
Interquartile range: Difference between the 75th and 25th percentiles.
Measures of dispersion for X:
variance = 47.9362, std = 6.9236, range = 28.13
Measures of dispersion for Y:
variance = 0.048988, std = 0.221332, range = 1.01
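The measures of central tendency and dispersion listed earlier can be computed with Python's standard `statistics` module. The sketch below uses a small made-up sample (not the X or Y data summarized above), and all variable names are illustrative assumptions.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2.0, 3.0, 3.0, 5.0, 7.0, 8.0, 9.0, 11.0]

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of dispersion.
value_range = max(data) - min(data)
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
sample_std = statistics.stdev(data)          # square root of the sample variance

# Quartiles; the interquartile range is the distance between Q3 and Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("mean", mean, "median", median, "mode", mode)
print("range", value_range, "variance", sample_variance, "std", sample_std, "IQR", iqr)
```

Note that the standard deviation is in the same units as the data, while the variance is in squared units.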
Scatter plot: Relationship between two variables.
Histogram: Distribution of a variable.
Box plot: Distribution of a variable, quartiles.
Density plot: Distribution of a variable, smoothed.
Inferential statistics is used to make inferences about populations.
It is used to test hypotheses.
It is used to make informed decisions.
It is used to estimate parameters.
Null and Alternative hypothesis.
Types of error (Type I and Type II).
P-value.
Confidence interval.
Null hypothesis: No effect or no difference.
Alternative hypothesis: Effect or difference.
Example: Null hypothesis: The vaccine has no effect. Alternative hypothesis: The vaccine has an effect.
Type I error: Rejecting the null hypothesis when it is true.
Type II error: Failing to reject the null hypothesis when it is false.
Example (null hypothesis: the defendant is innocent): Type I error: Jailing an innocent person. Type II error: Freeing a guilty person.
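The two error types can be made concrete with a small simulation. The sketch below assumes a two-sided z-test of "population mean = 0" with known standard deviation 1. The sample size, effect size 0.3, number of trials, and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, fmean


def rejects_null(sample, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean = 0, with known sigma."""
    n = len(sample)
    z = fmean(sample) / (sigma / n ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_mean, n=30, trials=2000, seed=1):
    """Fraction of simulated samples for which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if rejects_null(sample):
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# H0 is false here, so every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.3)

print("Type I error rate:", type_1_rate)
print("Type II error rate:", type_2_rate)
```

Under these assumptions the Type I error rate should land near the chosen threshold of 0.05, while the Type II error rate depends on the effect size and the sample size.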
P-value: The probability of observing data at least as extreme as the data actually observed, assuming that the null hypothesis is true.
It is used to test hypotheses.
(For historical reasons) It is compared to a threshold, usually 0.05.
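As a worked illustration of a p-value, the sketch below computes an exact two-sided p-value for a coin-flip experiment under the null hypothesis that the coin is fair. The counts (60 heads in 100 flips) and the function name are made up for the example. "At least as extreme" is taken here to mean "no more probable under the null than the observed count", which is one common convention.

```python
from math import comb


def binomial_p_value(heads, flips, p_null=0.5):
    """Exact two-sided p-value for a binomial count under H0: P(heads) = p_null."""

    def prob(k):
        return comb(flips, k) * p_null ** k * (1 - p_null) ** (flips - k)

    observed = prob(heads)
    # Sum the probabilities of all outcomes that are no more likely than the
    # observed one; the small tolerance guards against rounding in the comparison.
    total = sum(prob(k) for k in range(flips + 1) if prob(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


p_value = binomial_p_value(60, 100)
print("p-value:", p_value)
print("below the 0.05 threshold:", p_value < 0.05)
```

In this example the p-value falls just above 0.05, which shows how sharply a fixed threshold can divide nearly identical results.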
Confidence interval: A range of values, computed from the sample, that is likely to contain the true value of a parameter.
It is used to estimate parameters.
(For historical reasons) The confidence level is usually set at 95%.
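A confidence interval for a mean can be sketched as follows. This version uses the normal approximation (a z-quantile), which is an assumption made to keep the example within the standard library. For small samples a t-quantile would normally be used instead and gives a slightly wider interval. The sample values are made up.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    return center - z * standard_error, center + z * standard_error


# Hypothetical measurements, for illustration only.
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.1, 4.9]

low, high = mean_confidence_interval(sample)
low99, high99 = mean_confidence_interval(sample, level=0.99)

print("95% interval:", (low, high))
print("99% interval:", (low99, high99))
```

Raising the confidence level widens the interval: more confidence is paid for with less precision.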
Do use the right measure of central tendency.
Don’t use the mean when the data is skewed or has outliers.
Do use the right measure of dispersion.
Don’t use the variance when you have outliers.
Do use standard deviation to preserve the units of the data.
Don’t say we proved the hypothesis.
Do say the data supports the hypothesis.
Do report confidence intervals.
Don’t confuse improbability with impossibility.
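The advice about skewed data and outliers can be checked on a tiny example. The sketch below uses made-up income values (in thousands) and adds a single extreme value to show how the mean and standard deviation react compared with the median.

```python
import statistics

# Hypothetical incomes in thousands, for illustration only.
incomes = [30, 32, 35, 36, 38, 40, 41]
with_outlier = incomes + [1000]

mean_clean = statistics.mean(incomes)
median_clean = statistics.median(incomes)
std_clean = statistics.stdev(incomes)

mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)
std_outlier = statistics.stdev(with_outlier)

print("without outlier: mean", mean_clean, "median", median_clean, "std", std_clean)
print("with outlier:    mean", mean_outlier, "median", median_outlier, "std", std_outlier)
```

One extreme value moves the mean from 36 to 156.5, while the median only moves from 36 to 37. The standard deviation, like the variance it is derived from, is inflated in the same way.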
Selection bias: When the sample is not representative of the population.
Confirmation bias: When we look for evidence that confirms our beliefs.
Publication bias: When only significant results are published.
Extrapolation bias: When we extrapolate beyond the range of the observed data.
Causation bias: When we confuse correlation with causation.
Ask questions, use PhD consult.
Consider joining the Design and Analysis of Experiments PhD Course.
More questions? eduardo@math.aau.dk
Dos and don’ts of statistics in research