

Power & Sample Size


Estimating Sample Size and Power from Parameters

Author

03-Power and Sample Size

Conventions: Two‑sided tests at α = 0.05 unless stated. Use R where helpful. Symbols: δ = effect size in the outcome scale; σ = SD; n = total sample unless noted. For two‑sample tests, assume equal allocation and equal variances unless stated.


1) One‑sample t‑test power & n (10 minutes)

Setup. You expect observations (\(Y\sim \mathcal{N}(\mu,\sigma^2)\)) with (\(\sigma=10\)). You will test (\(H_0: \mu=0\)) vs (\(H_1: \mu\neq 0\)). The scientifically relevant effect is (\(\delta = 3\)) (i.e., \(\mu=3\)).

1A. Compute power for (\(n=50\)).

Answer

Analytic normal approx: (\(Z = \bar Y/(\sigma/\sqrt n)\)), noncentrality (\(\lambda = \delta\sqrt n/\sigma\)).

\[ \lambda = 3\cdot\sqrt{50}/10 = 3\cdot 7.071/10 \approx 2.121. \]

Two‑sided power (z‑approx): \[ 1 - \Phi(z_{0.975}-\lambda) + \Phi(-z_{0.975}-\lambda) \approx 1-\Phi(1.96-2.121)+\Phi(-1.96-2.121) \approx 1-\Phi(-0.161)+\Phi(-4.081). \] This is (\(\approx 1-0.436+\text{tiny} \approx 0.564\)). Using exact t power:

Code
power.t.test(n = 50, delta = 3, sd = 10, sig.level = 0.05,
             type = "one.sample", alternative = "two.sided")$power

1B. Required (n) for 80% power.

Answer

Closed‑form (z‑approx): (\(n \approx (z_{0.975}+z_{0.8})^2 \frac{\sigma^2}{\delta^2}\)).

\[ n \approx (1.96+0.84)^2\frac{10^2}{3^2} = (2.80)^2 \cdot 100/9 \approx 7.84 \cdot 11.11 \approx 87.1. \]

Exact via R:

Code
power.t.test(power = 0.8, delta = 3, sd = 10, sig.level = 0.05,
             type = "one.sample", alternative = "two.sided")$n

2) Two‑sample t‑test (equal variance) (8 minutes)

Setup. Two arms, equal allocation, common (\(\sigma=12\)), difference of means (\(\delta = 5\)).

2A. Per‑group (n) for 90% power.

Answer

Z‑approx per‑group (n): (\(n \approx 2 (z_{0.975}+z_{0.90})^2 \frac{\sigma^2}{\delta^2}\) ).

\[ n \approx 2(1.96+1.282)^2(12^2)/5^2 = 2(3.242)^2\cdot144/25 \approx 2\cdot10.52\cdot5.76 \approx 121.1 \]

So about 122 per group. Exact:

Code
power.t.test(power = 0.9, delta = 5, sd = 12, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$n

2B. Power if (\(n=40\)) per group.

Answer
Code
power.t.test(n = 40, delta = 5, sd = 12, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$power

3) Paired t‑test (5 minutes)

Setup. Pre/post measurements with (\(\sigma_{\text{pre}}=\sigma_{\text{post}}=8\)), correlation (\(\rho=0.6\)). Mean change (\(\delta=2\)).

Question. Required number of pairs for 80% power.

Answer

SD of differences: (\(\sigma_D=\sqrt{\sigma^2+\sigma^2-2\rho\sigma^2}=\sqrt{2\sigma^2(1-\rho)} = 8\sqrt{2(1-0.6)}=8\sqrt{0.8}\approx 7.155\)). Use a one‑sample t‑test on the differences with sd = (\(\sigma_D\)) and effect (\(\delta\)):

Code
power.t.test(power = 0.8, delta = 2, sd = 7.155, sig.level = 0.05,
             type = "one.sample", alternative = "two.sided")$n

4) Simple linear regression: test a slope (10 minutes)

Setup. Model \(Y=\beta_0+\beta_1 X+\varepsilon\), \(\varepsilon \sim N(0,\sigma^2)\) with (\(\sigma=10\)) and (X) standardized: (\(\bar{X} \approx 0\)), (\(\operatorname{Var}(X)=1\)).

Suppose (\(\beta_1=3\)). Test (\(H_0:\beta_1=0\)).

4A. Power at (\(n=60\)) using the noncentral t formulation.

Answer

For centered/standardized (X), (\(S_{xx} = \sum (x_i-\bar x)^2 \approx n-1\)). The test statistic for (\(\beta_1\)) has noncentrality parameter \[ \lambda = \frac{\beta_1}{\sigma}\sqrt{S_{xx}} \approx \frac{3}{10}\sqrt{59} = 0.3 \cdot 7.681 \approx 2.304 \] Power (two‑sided) is (\(1-\beta\)) with df = (\(n-2=58\)):

Code
n <- 60
sigma <- 10
beta1 <- 3
lambda <- beta1 / sigma * sqrt(n - 1)  # noncentrality parameter
alpha <- 0.05
df <- n - 2
tcrit <- qt(1 - alpha/2, df)           # two-sided critical value
# power = P(|T_nc| > tcrit)
1 - (pt(tcrit, df, ncp = lambda) - pt(-tcrit, df, ncp = lambda))

4B. Alternative via correlation. Show equivalence.

Answer

For SLR, (\(r=\operatorname{Cor}(X,Y)= \frac{\beta_1 \sigma_X}{\sigma_Y}\)). With (\(\sigma_X=1\)), (\(\sigma_Y = \sqrt{\beta_1^2+\sigma^2} = \sqrt{9+100} = \sqrt{109} \approx 10.44\)), so (\(r\approx 3/10.44\approx 0.287\)).

Test of (\(r=0\)) uses the same df and noncentrality (\(\sqrt{n-2}\;\frac{r}{\sqrt{1-r^2}}\)), which approximately equals (\(\lambda\)) above (the two differ only through \(\sqrt{n-2}\) versus \(\sqrt{n-1}\)).

Code
r <- 3/sqrt(9 + 100)
lambda2 <- sqrt(n - 2) * r / sqrt(1 - r^2)
# nearly identical to lambda; they differ only via sqrt(n-2) vs sqrt(n-1)
all.equal(lambda, lambda2, tolerance = 0.01)

5) Multiple regression: overall (\(R^2\)) (7 minutes)

Setup. You plan (p=4) predictors and expect population (\(R^2=0.20\)). Test (\(H_0: R^2=0\)) (overall model).

Find power at (\(n=80\)) and required (n) for 80% power.

Answer

Use Cohen’s (\(f^2=\frac{R^2}{1-R^2}=0.25\)). The overall F test has df1=(p), df2=(n-p-1), and noncentrality (\(\lambda=f^2(n-p-1)\)).

Power at (n=80):

Code
p <- 4; R2 <- 0.20; f2 <- R2/(1-R2)
n <- 80; df1 <- p; df2 <- n - p - 1
lambda <- f2 * df2
alpha <- 0.05; Fc <- qf(1 - alpha, df1, df2)
# P(F_nc > Fc)
power <- 1 - pf(Fc, df1, df2, ncp=lambda)
power

Solve for (n) (80% power):

Code
f_target <- function(n){
  df2 <- n - p - 1
  if (df2 <= 0) return(-1)  # invalid
  Fc <- qf(0.95, df1=p, df2=df2)
  1 - pf(Fc, df1=p, df2=df2, ncp=f2*df2) - 0.80
}
uniroot(f_target, c(p+5, 1000))$root
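
A quick sanity check: plugging the rounded‑up (n) back into the noncentral F power calculation should give at least 80% power (a sketch reusing the objects defined above):

Code
n_star <- ceiling(uniroot(f_target, c(p + 5, 1000))$root)
df2_star <- n_star - p - 1
1 - pf(qf(0.95, p, df2_star), p, df2_star, ncp = f2 * df2_star)  # should be >= 0.80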

6) Sample size for prediction interval width in SLR (10 minutes)

Setup. In \(Y=\beta_0+\beta_1 X+\varepsilon\) with \(\varepsilon\sim N(0,\sigma^2)\), suppose (\(\sigma=10\)) and you plan (\(X\)) standardized (mean 0, var 1). You want a 95% prediction interval for a future observation at (\(x_0\)) to have a specified total width (\(W\)).

Consider two cases: (i) (\(x_0=0\)) (mean of (X)), (ii) (\(x_0=1\)) (1 SD from mean).

Facts. The half‑width (HW) at (\(x_0\)) is \[ \text{HW}(x_0) = t_{0.975,\,n-2}\, \sigma \sqrt{1+\frac{1}{n}+\frac{(x_0-\bar x)^2}{S_{xx}}},\quad S_{xx}=\sum (x_i-\bar x)^2\approx n-1. \]

The minimum achievable half‑width, attained as (\(n\to\infty\)), is (\(t_{\infty,0.975}\,\sigma\)), where \(t_{\infty,0.975}=z_{0.975}\approx 1.96\).

6A. Feasibility check. With (\(\sigma=10\)), what is the smallest possible total width (\(W_\text{min}\))?

Answer

(\(W_\text{min}=2\times 1.96\times 10 \approx 39.2\)). Any target width (\(W<39.2\)) is impossible regardless of (\(n\)).
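
The same check as a one‑liner in R:

Code
2 * qnorm(0.975) * 10  # ~39.2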

6B. Find (\(n\)) to achieve (\(W=48\)) (half‑width 24) at (\(x_0=0\)). You can brute force it, but also try using root-finding/optimization procedures.

Answer

Solve (\(t_{0.975,\,n-2}\cdot 10\sqrt{1+1/n} = 24\)). Use numeric root‑finding:

Code
sigma <- 10; HW <- 24
f <- function(n){
  df <- n - 2
  if (df <= 0) return(1e6)
  t <- qt(0.975, df)
  t * sigma * sqrt(1 + 1/n) - HW
}
ceiling(uniroot(f, c(5, 1e5))$root)
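
A quick check that the rounded‑up (n) meets the 24‑unit half‑width target (a sketch reusing sigma and f from above):

Code
n_star <- ceiling(uniroot(f, c(5, 1e5))$root)
qt(0.975, n_star - 2) * sigma * sqrt(1 + 1/n_star)  # should be <= 24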

6C. Repeat for (\(x_0=1\)). Use (\(S_{xx}\approx n-1\)), (\(\bar x\approx 0\)).

Answer

Half‑width equation: ( \(t \sigma\sqrt{1+1/n+1/(n-1)} = 24\) ).

Code
sigma <- 10; HW <- 24
f <- function(n){
  df <- n - 2
  if (df <= 2) return(1e6)
  t <- qt(0.975, df)
  Sxx <- n - 1
  t * sigma * sqrt(1 + 1/n + 1/Sxx) - HW
}
ceiling(uniroot(f, c(6, 1e6))$root)

Note the larger (n) due to being 1 SD away from the design mean.

6D. (Optional) Design‑sensitive planning. If you can spread (X) to increase (\(S_{xx}\)) (e.g., choose an equally spaced design over a wider range), how does that change (n) for fixed (W)?

Answer

Increasing (\(S_{xx}\)) shrinks the term \(\frac{(x_0- \bar{x})^2}{S_{xx}}\), reducing the half‑width for off‑mean predictions. For fixed (\(n\)), a wider X range lowers HW at (\(x_0\neq\bar x\)); conversely, for fixed HW, a larger (\(S_{xx}\)) allows a smaller (\(n\)), as the sketch below illustrates.
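
A minimal sketch of the design effect, using hypothetical planning values (n = 100, prediction at (\(x_0=1\))) and assuming the wider design doubles the SD of X, so (\(S_{xx}\approx 4(n-1)\)):

Code
sigma <- 10; n <- 100; x0 <- 1  # hypothetical planning values
hw <- function(Sxx) qt(0.975, n - 2) * sigma * sqrt(1 + 1/n + x0^2 / Sxx)
hw(n - 1)        # standardized design: Var(X) = 1
hw(4 * (n - 1))  # hypothetical wider design: SD of X doubled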


7) (Optional) Monte Carlo verification (5 minutes)

Simulate to verify analytic power for a two‑sample t‑test or slope test.

Answer
Code
set.seed(1)
B <- 5000
n <- 40; sd <- 12; delta <- 5; alpha <- 0.05
p <- replicate(B, {
  x1 <- rnorm(n, 0, sd); x2 <- rnorm(n, delta, sd)
  t.test(x2, x1, var.equal = TRUE)$p.value
})
mean(p < alpha)  # ~ power
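
For comparison, the analytic power from 2B (the Monte Carlo estimate should be close):

Code
power.t.test(n = 40, delta = 5, sd = 12, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$power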

Quick reference

  • One‑sample n (z‑approx): ( \(n = (z_{1-\alpha/2}+z_{1-\beta})^2 \frac{\sigma^2}{\delta^2}\) )
  • Two‑sample per‑group n: ( \(n = 2(z_{1-\alpha/2}+z_{1-\beta})^2 \frac{\sigma^2}{\delta^2}\) )
  • SLR slope power ncp: ( \(\lambda = (\beta_1/\sigma)\sqrt{S_{xx}} \approx (\beta_1/\sigma)\sqrt{n-1}\) )
  • Overall MR test: (\(f^2=R^2/(1-R^2)\), \(\lambda=f^2(n-p-1)\))
  • PI half‑width at (\(x_0\)): ( \(t\sigma\sqrt{1+1/n+(x_0-\bar x)^2/S_{xx}}\) )