What Is a p-value?
A p-value quantifies the probability of observing results as extreme as those obtained in a study, assuming the null hypothesis (H₀) is true. It answers: “If the null hypothesis is true, how likely is my data?”
Key Definitions
- Null Hypothesis (H₀): The default assumption (e.g., “no effect”).
- Alternative Hypothesis (H₁): The claim being tested (e.g., “an effect exists”).
- Test Statistic: A standardized value (e.g., Z-score, t-score) calculated from sample data.
Historical Context
The p-value was popularized by Ronald Fisher in the 1920s. Fisher suggested a threshold of 0.05 for statistical significance, a convention still debated today.
Formula
The p-value depends on the test statistic and the hypothesis test type:
General Formula
where is the test statistic and is its observed value.
Z-test
For a Z-test with Z-score :
- Left-tailed:
- Right-tailed:
- Two-tailed:
t-test
For a t-test with -score and :
- Left-tailed:
- Right-tailed:
- Two-tailed:
Chi-square (χ²) Test
For χ²-score with degrees of freedom:
- Left-tailed:
- Right-tailed:
F-test
For F-score with degrees of freedom:
- Left-tailed:
- Right-tailed:
Examples
Example 1: Z-test for Population Mean
Scenario: A factory claims lightbulbs last 1,200 hours. A sample of 50 bulbs has , . Test if the mean is less than claimed.
Solution:
- Left-tailed p-value: .
Conclusion: Fail to reject H₀ at .
Example 2: Chi-square Test for Independence
Scenario: A survey tests if gender (Male/Female) and preference (Yes/No) are independent. Observed χ² = 6.25, .
Solution:
- Right-tailed p-value: .
Conclusion: Reject H₀ at .
Interpretation Guide
- p-value < 0.01: Strong evidence against H₀.
- 0.01 ≤ p-value < 0.05: Moderate evidence against H₀.
- p-value ≥ 0.05: Insufficient evidence to reject H₀.
Common Misconceptions
- Myth: A high p-value “proves” H₀.
Truth: It only suggests insufficient evidence against H₀. - Myth: p-value = Probability H₀ is true.
Truth: p-value assumes H₀ is true; it does not measure H₀’s likelihood.
Frequently Asked Questions
Can a p-value be negative?
No. P-values represent probabilities and must be between 0 and 1.
How to interpret a p-value of 0.07?
At , you fail to reject H₀. However, this result is marginally significant and warrants further study.
Why is 0.05 a common significance level?
Popularized by Fisher, 0.05 balances Type I error (false positives) and sensitivity. However, it’s arbitrary and field-dependent (e.g., physics uses , ).
How does sample size affect p-values?
Larger samples increase test sensitivity, making it easier to detect small effects. Always report effect size (e.g., Cohen’s d) alongside p-values.
What’s the difference between one-tailed and two-tailed tests?
- One-tailed: Tests for an effect in one direction (e.g., “greater than”).
- Two-tailed: Tests for any directionless effect. Uses the tail probability.