The importance of n (sample size) in Statistics


n = number in a trial or sample.

Why is sample size important?

The aim of statistical testing is to uncover a significant difference when it actually exists. In its simplest form this involves comparing samples between one regime and another (which may be a control). Sample size is important because it determines how likely your test is to detect such a difference when it really is there.

Why does a larger sample size help?

The sample size is chosen to maximise the chance of uncovering a specific mean difference that is also statistically significant. Please note that 'specific difference' and 'statistically significant' are two quite different ideas: the specific difference is the size of effect the researcher decides is worth detecting, while statistical significance is about how unlikely the observed result would be if there were really no effect.

The reason larger samples increase your chance of significance is that they more reliably reflect the population mean.

Imagine we are doing a trial on whether a particular diet regime helps with weight loss. A random sample of people is chosen and each person is weighed before and after the diet, giving us their weight changes. Finally we work out the mean weight change of the entire sample. To get a statistically significant result we want a result which is unlikely to have happened if the diet makes no difference (the null hypothesis).
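A quick simulation illustrates how the spread of possible sample means shrinks as the sample grows. The population here is entirely hypothetical (individual weight changes with mean 0 kg and sd 8 kg - values chosen purely for illustration, not from any real trial):

```python
import random
import statistics

random.seed(42)

# Hypothetical population of individual weight changes: mean 0 kg, sd 8 kg.
# These values are illustrative only -- they are not from any real trial.
POP_MEAN, POP_SD = 0.0, 8.0

def spread_of_sample_means(n, trials=2000):
    """Draw many random samples of size n; return the sd of their means."""
    means = [statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

for n in (5, 20, 40, 160):
    print(f"n={n:>3}: spread (sd) of possible sample means "
          f"~ {spread_of_sample_means(n):.2f} kg")
```

Each time the sample size is quadrupled, the spread of the possible sample means roughly halves: bigger samples pin the population mean down more tightly.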

Imagine a scenario where one researcher has a sample size of 20, and another one, 40, both drawn from the same population, and both happen to get a mean weight change of 3kg. How likely is it that a 3kg weight change will be statistically significant in these two scenarios? To help us here we'll show a distribution curve from each scenario.

What you see above are two distributions of possible sample means (see below) for 20 people (n=20) and 40 people (n=40), both drawn from the same population. On each we have superimposed a sample mean weight change of 3kg. The curves are both centred on zero to indicate a null hypothesis of "no difference" (i.e. that the diet has no effect).

The 3kg change is more likely to be significant when n=40 because that distribution curve is narrower, so 3kg sits further out towards its extreme than it does in the n=20 scenario - which points to how you can increase the power of your experiment. The reason the n=40 curve is spikier is something called the standard error of the mean: the larger the sample, the more accurately it reflects the population it was drawn from, so its possible means are distributed more closely around the population mean.

How reliably does the sample mean reflect the population mean?

Hopefully you will have an intuitive feeling that the larger your sample is, the more accurately it reflects the population: an exit poll at an election just asking two people how they voted is clearly less useful than one which asks 2,000 people. In Statistics this needs to be quantified and pinned down, and you want to make your sample as accurate as possible.

This reliability of the sample mean as a reflection of the population mean is quantified by something called the standard error of the mean (se), which is essentially the sd of the population of all the sample means that we would get if we took infinitely many random samples rather than just the one. The two curves above show the distributions for these for our two imaginary samples. (You can find out more about this in the section 'Numeric Data Description' in Statistics for the Terrified.) The standard error of the mean is calculated using two things: the standard deviation and the sample size. It is the standard deviation divided by the square root of n, so quadrupling the sample size halves the standard error.
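As a quick sketch of the calculation (se = sd / √n) for our two imaginary samples, assuming - purely for illustration - a population sd of 8 kg for the weight changes:

```python
import math

sd = 8.0  # assumed population sd of individual weight changes, kg (illustrative)

for n in (20, 40):
    se = sd / math.sqrt(n)  # standard error of the mean = sd / sqrt(n)
    print(f"n={n}: se = {se:.2f} kg")
# n=20: se = 1.79 kg
# n=40: se = 1.26 kg
```

Doubling the sample size from 20 to 40 shrinks the standard error by a factor of √2, which is exactly why the n=40 curve is narrower.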

In order to show that the weight change we have seen is significant and not just random weight fluctuations, our sample mean needs to appear at one edge of the curve. If our sample mean appears in the middle section of the curve, then the observed weight change could have happened by chance.

Notice that the curve showing the se of the sample with 20 people is much wider (covering a wider range of weight changes) than the curve of the se of the sample with 40 people. You can see that a change of 3kg is right up at the end of the n=40 curve (significant!), whereas it is more in the central region of the n=20 curve (not significant).

With a sample size of n=20 it is impossible to say whether the change of 3kg is down to chance or the diet. By increasing the sample size we increase the reliability of the sample means - making the curve narrower and spikier - and so any change we detect is more likely to be up at one extreme, and therefore statistically significant.
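Putting rough numbers on the two scenarios (the article gives no population sd, so a value of 8 kg is assumed here purely for illustration): the test statistic is the observed mean change divided by its standard error, and at a two-sided 5% level, values beyond about 1.96 count as significant.

```python
import math

sd = 8.0        # assumed population sd of weight change, kg (illustrative)
observed = 3.0  # observed mean weight change, kg
z_crit = 1.96   # two-sided 5% critical value (normal approximation)

for n in (20, 40):
    se = sd / math.sqrt(n)
    z = observed / se       # how many standard errors from "no difference"
    verdict = "significant" if abs(z) > z_crit else "not significant"
    print(f"n={n}: z = {z:.2f} -> {verdict}")
# n=20: z = 1.68 -> not significant
# n=40: z = 2.37 -> significant
```

The same 3kg observation crosses the significance threshold at n=40 but not at n=20, exactly as the two curves suggest.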

Calculating the optimum sample size

In reality of course you will have to decide on your sample size before you begin, and there is a formula for calculating n to best achieve a significant difference. This formula uses the specific difference and the sd of the population. As mentioned above, the specific difference is proposed by the researcher and the population sd has to be obtained from previously published research or from a pilot study.
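The article does not spell the formula out, but a common version for this kind of paired design - at a two-sided 5% significance level with 80% power, the conventional choices - is n = ((z_alpha + z_beta) x sd / difference)^2. A sketch, with the sd and target difference as assumed inputs:

```python
import math

def sample_size(diff, sd, z_alpha=1.96, z_beta=0.8416):
    """Size for a one-sample (paired) design: two-sided 5% level, 80% power.

    diff -- the specific difference the researcher wants to detect
    sd   -- the population sd (from prior research or a pilot study)
    """
    return math.ceil(((z_alpha + z_beta) * sd / diff) ** 2)

# Assumed inputs, purely for illustration: detect a 3 kg mean change, sd 8 kg.
print(sample_size(diff=3.0, sd=8.0))  # -> 56
```

Note how the target difference appears squared in the denominator: halving the difference you want to detect quadruples the sample size you need.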

Related issues

It is possible to get a statistically significant difference that is not relevant. Imagine you did a study of a new (but not very effective) fever control drug with so many people in the samples that you had a statistically significant finding with a mean drop in temperature of 0.1°C. It may be statistically significant, but it won't be very relevant if you have a high fever!
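To see how a trivially small effect can still come out statistically significant once n is large enough, compare the test statistic for that 0.1°C drop at two sample sizes (an sd of 0.8°C is assumed here purely for illustration):

```python
import math

sd = 0.8       # assumed sd of temperature change, deg C (illustrative)
effect = 0.1   # mean drop in temperature, deg C

for n in (50, 1000):
    z = effect / (sd / math.sqrt(n))
    verdict = "significant" if z > 1.96 else "not significant"
    print(f"n={n:>4}: z = {z:.2f} -> {verdict}")
# n=  50: z = 0.88 -> not significant
# n=1000: z = 3.95 -> significant
```

Statistical significance tells you the effect is probably real; it says nothing about whether the effect is big enough to matter.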

