Probability and Statistics Foundations
Analyze data and calculate chances with mathematical precision
You encounter probability and statistics every single day, even if you do not realize it. When you check the weather forecast, you are looking at probability. When you hear about “average” test scores, that is statistics. When a sports commentator mentions a player’s “batting average” or “shooting percentage,” they are using the same tools you are about to learn.
Here is the thing: you already have good intuitions about these concepts. You know that flipping a coin is a 50-50 chance. You know that if your five test scores are 70, 80, 80, 90, and 100, your “average” should be somewhere in the 80s. This lesson gives you precise mathematical language and techniques to describe what you already understand intuitively.
Statistics is about summarizing and understanding data. Probability is about predicting what might happen. Together, they give you the power to make sense of uncertainty - to see patterns in noise and calculate chances with precision.
Core Concepts
Measures of Central Tendency: Finding the “Middle”
When you have a collection of numbers, one of the first questions to ask is: “What is a typical value?” There are three common answers to this question, each measuring the “center” of your data in a different way.
Mean (Average)
The mean is what most people think of as “the average.” You add up all the values and divide by how many there are:
$$\bar{x} = \frac{\text{sum of all values}}{\text{number of values}} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
The symbol $\bar{x}$ (read “x-bar”) represents the sample mean.
For example, if your data set is 10, 15, 20, 25, 30:
$$\bar{x} = \frac{10 + 15 + 20 + 25 + 30}{5} = \frac{100}{5} = 20$$
The mean is useful because it uses every data point. But it has a weakness: extreme values (called outliers) can pull it away from where most of the data actually lives.
Median
The median is the middle value when you arrange your data in order. If you have an odd number of values, it is the one right in the center. If you have an even number, it is the average of the two middle values.
For 10, 15, 20, 25, 30 (five values), the median is 20 - the third value.
For 10, 15, 20, 25 (four values), the median is $\frac{15 + 20}{2} = 17.5$ - the average of the second and third values.
The median is resistant to outliers. If that 30 were actually 3000, the mean would skyrocket to $\frac{3070}{5} = 614$, but the median would still be 20.
Mode
The mode is simply the value that appears most frequently. In the data set 2, 3, 3, 3, 5, 7, the mode is 3 because it appears three times.
A data set can have:
- One mode (unimodal)
- Two modes (bimodal)
- More than two modes (multimodal)
- No mode at all (if every value appears the same number of times)
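The three measures above can be checked with Python's standard `statistics` module. This is a minimal sketch using the data sets from this section; the variable names are illustrative, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

scores = [10, 15, 20, 25, 30]
print(mean(scores))    # mean is 20
print(median(scores))  # median is 20 (the third of five ordered values)

# With an even number of values, the median averages the two middle values.
print(median([10, 15, 20, 25]))  # 17.5

# Replace 30 with an outlier: the mean moves a lot, the median does not.
with_outlier = [10, 15, 20, 25, 3000]
print(mean(with_outlier))    # 614
print(median(with_outlier))  # 20

# multimode returns every most-frequent value, so it also handles ties.
print(multimode([2, 3, 3, 3, 5, 7]))  # [3]
print(multimode([1, 1, 2, 2, 3]))     # [1, 2] (bimodal)
```

Note that when every value appears the same number of times, `multimode` returns all of them, which corresponds to the "no mode" case in the list above.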
Measures of Spread: How Scattered Is the Data?
Knowing the center is not enough. Two data sets can have the same mean but look completely different. Consider:
- Data Set A: 49, 50, 51
- Data Set B: 0, 50, 100
Both have a mean of 50, but Data Set A is tightly clustered while Data Set B is spread out wildly. We need measures of spread (also called dispersion) to capture this difference.
Range
The simplest measure of spread is the range: the difference between the largest and smallest values.
$$\text{Range} = \text{Maximum} - \text{Minimum}$$
For Data Set A: Range = 51 - 49 = 2
For Data Set B: Range = 100 - 0 = 100
The range is easy to calculate but uses only two data points, so it can be misleading if there are outliers.
Variance
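Variance measures spread as the (nearly) average squared distance of the data points from the mean. Many textbooks write the sample variance as $s^2$ and reserve $\sigma^2$ for an entire population; this lesson uses $\sigma^2$ for both. The formula for sample variance is: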
$$\sigma^2 = \frac{\sum(x_i - \bar{x})^2}{n-1}$$
Here is what that formula is really saying:
- Find the mean $\bar{x}$
- For each data point, calculate how far it is from the mean $(x_i - \bar{x})$
- Square each of those distances (to make them all positive)
- Add up all the squared distances
- Divide by $n-1$ (one less than the number of data points)
Why square the differences? Because some differences are negative (values below the mean) and some are positive (values above the mean). If we just added them up, they would cancel out. Squaring makes everything positive.
Why divide by $n-1$ instead of $n$? This is a statistical adjustment called Bessel’s correction that gives a better estimate when working with samples from a larger population. For now, just know that we use $n-1$ for sample variance.
Standard Deviation
The standard deviation is simply the square root of the variance:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum(x_i - \bar{x})^2}{n-1}}$$
Why take the square root? Because variance is in “squared units” - if your data is in inches, the variance is in square inches. The standard deviation brings us back to the original units, making it more interpretable.
Standard deviation tells you roughly how far a typical data point is from the mean. A small standard deviation means the data is tightly clustered; a large one means the data is spread out.
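To see range, variance, and standard deviation side by side, here is a short sketch for Data Set A and Data Set B. It relies on `statistics.variance` and `statistics.stdev`, which use the $n-1$ (sample) formulas; the helper name `spread_summary` is an assumption made for this illustration.

```python
from statistics import stdev, variance

data_a = [49, 50, 51]
data_b = [0, 50, 100]

def spread_summary(data):
    """Return (range, sample variance, sample standard deviation)."""
    data_range = max(data) - min(data)
    # variance() and stdev() both divide by n - 1, matching the formulas above.
    return data_range, variance(data), stdev(data)

print(spread_summary(data_a))  # range 2, variance 1, standard deviation 1
print(spread_summary(data_b))  # range 100, variance 2500, standard deviation 50
```

Both data sets have mean 50, yet the standard deviation of B is fifty times that of A, which is exactly the difference the mean alone cannot show.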
The Normal Distribution
Many real-world phenomena - heights, test scores, measurement errors, blood pressure - approximately follow a bell-shaped pattern called the normal distribution (or Gaussian distribution). The curve is:
- Symmetric around the mean
- Highest at the center (the mean)
- Tails off gradually in both directions, approaching but never touching zero
In a normal distribution, the mean, median, and mode are all the same value, right at the center of the bell.
The Empirical Rule (68-95-99.7 Rule)
For normally distributed data, we know quite precisely how the data spreads around the mean:
- About 68% of the data falls within 1 standard deviation of the mean
- About 95% of the data falls within 2 standard deviations of the mean
- About 99.7% of the data falls within 3 standard deviations of the mean
This is incredibly useful. If test scores are normally distributed with mean 75 and standard deviation 8:
- 68% of scores are between $75 - 8 = 67$ and $75 + 8 = 83$
- 95% of scores are between $75 - 16 = 59$ and $75 + 16 = 91$
- 99.7% of scores are between $75 - 24 = 51$ and $75 + 24 = 99$
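The 68-95-99.7 figures are rounded. As a rough check, the sketch below uses `statistics.NormalDist` (Python 3.8+) to compute the share of a normal distribution within 1, 2, and 3 standard deviations for the test-score example; the function name `share_within` is an assumption for illustration.

```python
from statistics import NormalDist

# Test scores modeled as normal with mean 75 and standard deviation 8.
scores = NormalDist(mu=75, sigma=8)

def share_within(k):
    """Fraction of scores within k standard deviations of the mean."""
    low = scores.mean - k * scores.stdev
    high = scores.mean + k * scores.stdev
    return scores.cdf(high) - scores.cdf(low)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
# Prints roughly 0.6827, 0.9545, and 0.9973
```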
Z-Scores: Standardizing Data
A Z-score tells you how many standard deviations a value is from the mean:
$$Z = \frac{x - \mu}{\sigma}$$
where $x$ is the data value, $\mu$ is the mean, and $\sigma$ is the standard deviation.
A Z-score of:
- 0 means the value equals the mean
- 1 means the value is 1 standard deviation above the mean
- -2 means the value is 2 standard deviations below the mean
Z-scores let you compare values from different distributions. If you scored 85 on a test with mean 75 and SD 8 (Z = 1.25), and 90 on a test with mean 80 and SD 5 (Z = 2.0), your second performance was actually more impressive relative to the class.
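The comparison of the two test results can be written as a one-line function. This is a minimal sketch; `z_score` is a name chosen here for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

first_test = z_score(85, mean=75, sd=8)   # 1.25
second_test = z_score(90, mean=80, sd=5)  # 2.0

print(first_test, second_test)
print("second test is the stronger result:", second_test > first_test)
```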
Basic Probability
Probability is the mathematical study of chance and uncertainty. When we assign a probability to an event, we are quantifying how likely it is to occur.
Sample Space and Events
The sample space (denoted $S$) is the set of all possible outcomes of an experiment. For example:
- Flipping a coin: $S = \{H, T\}$
- Rolling a die: $S = \{1, 2, 3, 4, 5, 6\}$
- Drawing a card: $S = \{\text{all 52 cards}\}$
An event is a subset of the sample space - any collection of outcomes we care about. For rolling a die:
- “Rolling an even number” is the event $\{2, 4, 6\}$
- “Rolling less than 3” is the event $\{1, 2\}$
Calculating Probability
For equally likely outcomes, the probability of an event $A$ is:
$$P(A) = \frac{\text{number of favorable outcomes}}{\text{total number of outcomes}}$$
Key facts about probability:
- $0 \leq P(A) \leq 1$ for any event $A$
- $P(\text{impossible event}) = 0$
- $P(\text{certain event}) = 1$
- $P(\text{not } A) = 1 - P(A)$ (the complement rule)
For rolling a die, $P(\text{even}) = \frac{3}{6} = \frac{1}{2}$ because 3 out of 6 outcomes are even.
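Because events are just subsets of the sample space, the counting formula maps directly onto Python sets. The sketch below assumes equally likely outcomes and uses `Fraction` so results stay exact; the function name `probability` is illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die

def probability(event, space=sample_space):
    """P(event) for equally likely outcomes: favorable / total."""
    return Fraction(len(event & space), len(space))

even = {2, 4, 6}
print(probability(even))          # 1/2
print(1 - probability(even))      # complement rule: P(not even) = 1/2
print(probability(set()))         # impossible event: 0
print(probability(sample_space))  # certain event: 1
```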
Compound Events: AND and OR
AND (Intersection)
$P(A \text{ and } B)$ is the probability that both events occur.
For independent events (where one does not affect the other): $$P(A \text{ and } B) = P(A) \times P(B)$$
For example, the probability of flipping heads AND rolling a 6: $$P(H \text{ and } 6) = \frac{1}{2} \times \frac{1}{6} = \frac{1}{12}$$
OR (Union)
$P(A \text{ or } B)$ is the probability that at least one of the events occurs.
$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$$
We subtract $P(A \text{ and } B)$ to avoid counting outcomes that satisfy both events twice.
For mutually exclusive events (events that cannot both happen): $$P(A \text{ or } B) = P(A) + P(B)$$
For example, when rolling a die, $P(\text{1 or 2}) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$ because you cannot roll both a 1 and a 2 on a single roll.
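The AND and OR rules can be verified by listing outcomes directly. The following sketch enumerates the coin-and-die sample space and a single die; all names are assumptions for illustration, and outcomes are assumed equally likely.

```python
from fractions import Fraction
from itertools import product

def probability(event, space):
    """P(event) for equally likely outcomes."""
    return Fraction(len(event & space), len(space))

# Coin and die together: 2 * 6 = 12 equally likely (coin, die) pairs.
coin_and_die = set(product("HT", range(1, 7)))
heads = {o for o in coin_and_die if o[0] == "H"}
six = {o for o in coin_and_die if o[1] == 6}
p_heads_and_six = probability(heads & six, coin_and_die)
print(p_heads_and_six)  # 1/12, equal to P(H) * P(6)

# Single die: overlapping events need the subtraction in the addition rule.
die = set(range(1, 7))
even, less_than_3 = {2, 4, 6}, {1, 2}
p_union = probability(even | less_than_3, die)
p_rule = (probability(even, die) + probability(less_than_3, die)
          - probability(even & less_than_3, die))
print(p_union, p_rule)  # both 2/3

# Mutually exclusive events: the overlap is empty, so probabilities just add.
print(probability({1} | {2}, die))  # 1/3
```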
Permutations and Combinations
Sometimes we need to count how many ways something can happen. This is where permutations and combinations come in.
Factorial
The factorial of $n$ (written $n!$) is the product of all positive integers from 1 to $n$:
$$n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$$
For example:
- $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$
- $3! = 3 \times 2 \times 1 = 6$
- $1! = 1$
- $0! = 1$ (by definition)
Permutations: Order Matters
A permutation counts the number of ways to arrange objects when order matters. The number of ways to arrange $r$ objects chosen from $n$ objects is:
$$_nP_r = \frac{n!}{(n-r)!}$$
For example, how many ways can 3 runners finish in 1st, 2nd, and 3rd place from a race of 10? $$_{10}P_3 = \frac{10!}{7!} = \frac{10 \times 9 \times 8 \times 7!}{7!} = 10 \times 9 \times 8 = 720$$
Combinations: Order Does Not Matter
A combination counts the number of ways to choose objects when order does not matter. The number of ways to choose $r$ objects from $n$ objects is:
$$_nC_r = \binom{n}{r} = \frac{n!}{r!(n-r)!}$$
For example, how many ways can you choose 3 people from a group of 10 for a committee (where roles are equal)? $$_{10}C_3 = \binom{10}{3} = \frac{10!}{3! \cdot 7!} = \frac{10 \times 9 \times 8}{3 \times 2 \times 1} = \frac{720}{6} = 120$$
The key difference: Choosing Alice, Bob, and Carol for a committee is the same as choosing Carol, Bob, and Alice (combination). But Alice winning gold, Bob winning silver, and Carol winning bronze is different from Carol winning gold, Bob winning silver, and Alice winning bronze (permutation).
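Python's `math` module (version 3.8 or later) provides these counting functions directly, which makes it easy to confirm the runner and committee examples. A minimal sketch:

```python
import math

print(math.factorial(5))  # 120
print(math.factorial(0))  # 1 (by definition)

# Order matters: 1st, 2nd, 3rd place among 10 runners.
print(math.perm(10, 3))   # 720

# Order does not matter: a 3-person committee from 10 people.
print(math.comb(10, 3))   # 120

# Each committee of 3 can be ordered in 3! = 6 ways,
# so the permutation count is 6 times the combination count.
print(math.perm(10, 3) == math.comb(10, 3) * math.factorial(3))  # True
```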
Binomial Probability
Binomial probability applies when you have:
- A fixed number of trials ($n$)
- Each trial has exactly two outcomes (success or failure)
- The probability of success ($p$) is the same for each trial
- Trials are independent
The probability of getting exactly $k$ successes in $n$ trials is:
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
This formula combines:
- $\binom{n}{k}$: the number of ways to arrange $k$ successes among $n$ trials
- $p^k$: the probability of $k$ successes
- $(1-p)^{n-k}$: the probability of the remaining $n-k$ failures
For example, if you flip a fair coin 6 times, what is the probability of getting exactly 4 heads?
- $n = 6$, $k = 4$, $p = 0.5$
$$P(X = 4) = \binom{6}{4} (0.5)^4 (0.5)^2 = 15 \times 0.0625 \times 0.25 = 0.234375$$
There is about a 23.4% chance of getting exactly 4 heads.
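The binomial formula translates almost symbol for symbol into code. This sketch assumes Python 3.8+ for `math.comb`; the function name `binomial_pmf` is chosen here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

print(binomial_pmf(4, 6, 0.5))  # 0.234375

# Sanity check: the probabilities for k = 0..6 cover every outcome, so they sum to 1.
total = sum(binomial_pmf(k, 6, 0.5) for k in range(7))
print(total)
```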
Notation and Terminology
| Term | Meaning | Example |
|---|---|---|
| $\bar{x}$ | Sample mean | $\bar{x} = \frac{\sum x_i}{n}$ |
| $\sigma$ | Standard deviation | Measures spread from the mean |
| $\sigma^2$ | Variance | Standard deviation squared |
| $P(A)$ | Probability of event A | $0 \leq P(A) \leq 1$ |
| $n!$ | n factorial | $5! = 120$ |
| $_nP_r$ | Permutations | Order matters |
| $_nC_r$ or $\binom{n}{r}$ | Combinations | Order does not matter |
| Z-score | How many SDs a value is from the mean | $Z = (x - \mu)/\sigma$ |
Examples
Find the mean, median, and mode of the data set: 12, 15, 15, 18, 20, 25
Solution:
Mean: $$\bar{x} = \frac{12 + 15 + 15 + 18 + 20 + 25}{6} = \frac{105}{6} = 17.5$$
Median: The data is already in order. With 6 values (even number), the median is the average of the 3rd and 4th values: $$\text{Median} = \frac{15 + 18}{2} = \frac{33}{2} = 16.5$$
Mode: The value 15 appears twice, more than any other value. $$\text{Mode} = 15$$
Interpretation: These three measures give different perspectives on the “center.” The mean (17.5) is pulled up slightly by the higher values. The median (16.5) tells us half the values are below 16.5 and half are above. The mode (15) is the most common value.
Calculate $5!$ and $_5P_2$.
Solution:
Factorial: $$5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$$
Permutation: $$_5P_2 = \frac{5!}{(5-2)!} = \frac{5!}{3!} = \frac{120}{6} = 20$$
Alternatively, you can think of it as: $5 \times 4 = 20$ (choosing 2 items from 5, where order matters: 5 choices for the first item, then 4 remaining choices for the second).
Real-world context: If 5 people are running for class president and vice president, there are 20 different ways the election could turn out, because president-Alice-VP-Bob is different from president-Bob-VP-Alice.
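For a count this small, the 20 outcomes can simply be listed. The sketch below uses `itertools.permutations`; Alice and Bob come from the example above, while the other three candidate names are hypothetical.

```python
from itertools import permutations

# Five candidates; names beyond Alice and Bob are made up for illustration.
candidates = ["Alice", "Bob", "Carol", "Dave", "Eve"]

# Each ordered pair is (president, vice president).
outcomes = list(permutations(candidates, 2))
print(len(outcomes))  # 20

# Order matters: these are two different election results.
print(("Alice", "Bob") in outcomes, ("Bob", "Alice") in outcomes)  # True True
```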
Calculate the standard deviation of the data set: 4, 8, 6, 5, 3, 2, 8
Solution:
Step 1: Find the mean. $$\bar{x} = \frac{4 + 8 + 6 + 5 + 3 + 2 + 8}{7} = \frac{36}{7} \approx 5.14$$
Step 2: Find each deviation from the mean and square it.
| $x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$ |
|---|---|---|
| 4 | 4 - 5.14 = -1.14 | 1.30 |
| 8 | 8 - 5.14 = 2.86 | 8.18 |
| 6 | 6 - 5.14 = 0.86 | 0.74 |
| 5 | 5 - 5.14 = -0.14 | 0.02 |
| 3 | 3 - 5.14 = -2.14 | 4.58 |
| 2 | 2 - 5.14 = -3.14 | 9.86 |
| 8 | 8 - 5.14 = 2.86 | 8.18 |
Step 3: Sum the squared deviations. $$\sum(x_i - \bar{x})^2 = 1.30 + 8.18 + 0.74 + 0.02 + 4.58 + 9.86 + 8.18 = 32.86$$
Step 4: Calculate variance (divide by $n - 1 = 6$). $$\sigma^2 = \frac{32.86}{6} \approx 5.48$$
Step 5: Take the square root for standard deviation. $$\sigma = \sqrt{5.48} \approx 2.34$$
Interpretation: A typical data point lies roughly 2.34 units away from the mean of about 5.14.
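The five steps above can be mirrored line by line in code. This is a sketch that works with the unrounded mean, so intermediate values may differ slightly from the hand-rounded table entries while the final results agree to two decimals.

```python
from math import sqrt

data = [4, 8, 6, 5, 3, 2, 8]
n = len(data)

mean = sum(data) / n                              # Step 1: mean
squared_devs = [(x - mean) ** 2 for x in data]    # Step 2: squared deviations
total = sum(squared_devs)                         # Step 3: sum them
sample_variance = total / (n - 1)                 # Step 4: divide by n - 1
sample_sd = sqrt(sample_variance)                 # Step 5: square root

print(round(mean, 2), round(total, 2),
      round(sample_variance, 2), round(sample_sd, 2))
# 5.14 32.86 5.48 2.34
```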
In how many ways can 3 people be chosen from a group of 10 to form a committee?
Solution:
Since the committee members all have equal roles (no president, secretary, etc.), order does not matter. This is a combination problem.
$$_{10}C_3 = \binom{10}{3} = \frac{10!}{3!(10-3)!} = \frac{10!}{3! \cdot 7!}$$
We can simplify by expanding only the parts we need: $$= \frac{10 \times 9 \times 8 \times 7!}{3! \times 7!} = \frac{10 \times 9 \times 8}{3 \times 2 \times 1} = \frac{720}{6} = 120$$
There are 120 different ways to choose a 3-person committee from 10 people.
Why combinations and not permutations? If we chose Alice, Bob, and Carol - that is the same committee whether we list them as “Alice, Bob, Carol” or “Carol, Alice, Bob.” Since order does not matter, we use combinations.
Test scores in a class are normally distributed with a mean of 75 and a standard deviation of 8. What percent of students score above 91?
Solution:
Step 1: Calculate the Z-score for 91. $$Z = \frac{x - \mu}{\sigma} = \frac{91 - 75}{8} = \frac{16}{8} = 2$$
A score of 91 is exactly 2 standard deviations above the mean.
Step 2: Use the empirical rule.
The empirical rule tells us that 95% of data falls within 2 standard deviations of the mean. This means:
- 95% of scores are between 59 and 91 (within 2 SDs)
- 5% of scores are outside this range
- Since the normal distribution is symmetric, 2.5% are below 59 and 2.5% are above 91
Answer: About 2.5% of students score above 91.
Alternative interpretation: Scoring 91 puts you in approximately the 97.5th percentile - better than about 97.5% of the class.
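The empirical rule's 95% is a rounded figure, so the 2.5% answer is an approximation. As a hedged cross-check, the sketch below computes the tail area from the normal model itself using `statistics.NormalDist` (Python 3.8+).

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=8)

# Share of students scoring above 91 (2 standard deviations above the mean).
above_91 = 1 - scores.cdf(91)
print(f"{above_91:.4f}")  # 0.0228
```

The exact tail is about 2.3%, close to the 2.5% obtained from the empirical rule; the small gap comes from rounding 95.45% down to 95%.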
A fair coin is flipped 6 times. What is the probability of getting exactly 4 heads?
Solution:
This is a binomial probability problem:
- $n = 6$ trials (flips)
- $k = 4$ successes (heads)
- $p = 0.5$ (probability of heads on each flip)
- $1 - p = 0.5$ (probability of tails)
Step 1: Calculate the number of ways to arrange 4 heads among 6 flips. $$\binom{6}{4} = \frac{6!}{4! \cdot 2!} = \frac{6 \times 5}{2 \times 1} = \frac{30}{2} = 15$$
Step 2: Apply the binomial probability formula. $$P(X = 4) = \binom{6}{4} \times p^4 \times (1-p)^{6-4}$$ $$= 15 \times (0.5)^4 \times (0.5)^2$$ $$= 15 \times 0.0625 \times 0.25$$ $$= 15 \times 0.015625$$ $$= 0.234375$$
Answer: The probability of getting exactly 4 heads is $\frac{15}{64} \approx 0.234$ or about 23.4%.
Understanding the formula: There are 15 different ways the 4 heads could be arranged among the 6 flips (HHHHTT, HHHTHT, HHTHHT, etc.). Each specific sequence of 4 heads and 2 tails has probability $(0.5)^6 = 0.015625$. Multiplying gives $15 \times 0.015625 = 0.234375$.
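Since there are only $2^6 = 64$ possible flip sequences, the count of 15 can be confirmed by brute force. A small sketch, with variable names chosen for illustration:

```python
from fractions import Fraction
from itertools import product

# Every possible sequence of 6 flips, e.g. "HHTHTT".
sequences = ["".join(flips) for flips in product("HT", repeat=6)]
four_heads = [s for s in sequences if s.count("H") == 4]

print(len(sequences), len(four_heads))            # 64 15
print(Fraction(len(four_heads), len(sequences)))  # 15/64
```

Because the coin is fair, all 64 sequences are equally likely, so the probability is simply the fraction of sequences with exactly 4 heads.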
Key Properties and Rules
Properties of Mean, Median, and Mode
- Mean is affected by every data point and sensitive to outliers
- Median is resistant to outliers and based only on position
- Mode can be used with non-numerical (categorical) data
- For symmetric distributions with a single peak, mean = median = mode
- For skewed distributions, the mean is pulled toward the tail
Standard Deviation Properties
- Standard deviation is always non-negative ($\sigma \geq 0$)
- $\sigma = 0$ only when all data points are identical
- Adding a constant to all data points does not change the standard deviation
- Multiplying all data points by a constant multiplies the standard deviation by the absolute value of that constant
Probability Rules
- $0 \leq P(A) \leq 1$ for any event
- $P(\text{not } A) = 1 - P(A)$ (Complement Rule)
- $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$ (Addition Rule)
- For independent events: $P(A \text{ and } B) = P(A) \times P(B)$ (Multiplication Rule)
Counting Formulas
| Situation | Formula | Order Matters? |
|---|---|---|
| Factorial | $n! = n \times (n-1) \times \cdots \times 1$ | N/A |
| Permutation | $_nP_r = \frac{n!}{(n-r)!}$ | Yes |
| Combination | $_nC_r = \frac{n!}{r!(n-r)!}$ | No |
Empirical Rule (68-95-99.7)
For normally distributed data:
- 68% within 1 standard deviation: $\mu \pm \sigma$
- 95% within 2 standard deviations: $\mu \pm 2\sigma$
- 99.7% within 3 standard deviations: $\mu \pm 3\sigma$
Real-World Applications
Quality Control
Manufacturing companies use statistics constantly. If a machine produces bolts that should be 10mm in diameter with a standard deviation of 0.1mm, quality control can use the empirical rule: bolts outside 3 standard deviations (below 9.7mm or above 10.3mm) are rejected - that is only about 0.3% of production under normal conditions.
Medical Testing
When doctors order blood tests, the results are compared to “normal ranges” - these are essentially intervals based on mean and standard deviation from healthy populations. A Z-score helps determine whether your result is unusual enough to warrant concern.
Weather Forecasting
“A 30% chance of rain” is probability in action. Meteorologists use statistical models based on historical data and current conditions to calculate these probabilities. They are essentially asking: “Given similar conditions in the past, what fraction of the time did it rain?”
Sports Analytics
Baseball’s batting average is just a probability: hits divided by at-bats. Basketball uses shooting percentages. Advanced analytics use more sophisticated statistics - like standard deviation to measure consistency. A player who averages 20 points with low standard deviation is more reliable than one who averages 20 points but swings between 5 and 35.
Lottery and Games
Casino games and lotteries are designed using probability theory. The house always has a mathematical edge. Understanding probability helps you see why: if you bet on a single number in American roulette (38 pockets), you have a 1/38 chance of winning, but the payout is only 35 to 1 - the gap between that payout and the fair 37 to 1 is the casino’s profit margin.
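One way to put a number on that edge is the average result of a one-unit bet, usually called the expected value (a term this lesson does not otherwise introduce). The sketch below assumes a 38-pocket wheel and a 35 to 1 payout, as in the paragraph above.

```python
from fractions import Fraction

p_win = Fraction(1, 38)   # one winning pocket out of 38
p_lose = 1 - p_win

# Win 35 units with probability 1/38, lose the 1-unit stake otherwise.
expected_value = p_win * 35 + p_lose * (-1)

print(expected_value)                   # -1/19
print(round(float(expected_value), 4))  # -0.0526
```

On average the player loses about 5.3% of each unit bet, which is the casino's long-run margin on this wager.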
Survey Analysis
When a poll reports “52% favor the candidate with a margin of error of 3%,” that margin of error comes from statistical theory. It tells you the true population percentage is likely between 49% and 55% - using concepts related to standard deviation and the normal distribution.
Self-Test Problems
Problem 1: Find the mean, median, and mode of: 5, 7, 7, 9, 11, 13
Answer:
Mean: $$\bar{x} = \frac{5 + 7 + 7 + 9 + 11 + 13}{6} = \frac{52}{6} \approx 8.67$$
Median: With 6 values, the median is the average of the 3rd and 4th values: $$\text{Median} = \frac{7 + 9}{2} = 8$$
Mode: 7 appears twice, more than any other value. $$\text{Mode} = 7$$
Problem 2: Calculate $6!$ and $_{6}C_2$.
Answer:
Factorial: $$6! = 6 \times 5 \times 4 \times 3 \times 2 \times 1 = 720$$
Combination: $$_6C_2 = \frac{6!}{2!(6-2)!} = \frac{6!}{2! \cdot 4!} = \frac{6 \times 5}{2 \times 1} = \frac{30}{2} = 15$$
Problem 3: A data set has mean 50 and standard deviation 10. What is the Z-score of a value of 35?
Answer:
$$Z = \frac{x - \mu}{\sigma} = \frac{35 - 50}{10} = \frac{-15}{10} = -1.5$$
A value of 35 is 1.5 standard deviations below the mean.
Problem 4: A bag contains 4 red marbles and 6 blue marbles. If you draw one marble, what is the probability of drawing a red marble? A blue marble?
Answer:
Total marbles: $4 + 6 = 10$
$$P(\text{red}) = \frac{4}{10} = \frac{2}{5} = 0.4 = 40\%$$
$$P(\text{blue}) = \frac{6}{10} = \frac{3}{5} = 0.6 = 60\%$$
Note: $P(\text{red}) + P(\text{blue}) = 0.4 + 0.6 = 1$, as it should since these are the only possibilities.
Problem 5: How many different 4-letter arrangements can be made from the letters A, B, C, D, E if each letter can only be used once?
Answer:
Since order matters (ABCD is different from DCBA), this is a permutation.
$$_5P_4 = \frac{5!}{(5-4)!} = \frac{5!}{1!} = 5! = 120$$
Alternatively: $5 \times 4 \times 3 \times 2 = 120$ (5 choices for first letter, 4 for second, 3 for third, 2 for fourth).
There are 120 different arrangements.
Problem 6: In a normally distributed data set with mean 100 and standard deviation 15, what percentage of values fall between 70 and 130?
Answer:
First, find how many standard deviations 70 and 130 are from the mean:
For 70: $Z = \frac{70 - 100}{15} = \frac{-30}{15} = -2$
For 130: $Z = \frac{130 - 100}{15} = \frac{30}{15} = 2$
Both boundaries are exactly 2 standard deviations from the mean.
By the empirical rule, about 95% of values fall within 2 standard deviations of the mean, so about 95% of values fall between 70 and 130.
Problem 7: A basketball player makes 70% of her free throws. If she takes 5 free throws, what is the probability she makes exactly 3?
Answer:
This is a binomial probability problem with $n = 5$, $k = 3$, $p = 0.7$.
$$P(X = 3) = \binom{5}{3} \times (0.7)^3 \times (0.3)^2$$
$$\binom{5}{3} = \frac{5!}{3! \cdot 2!} = \frac{5 \times 4}{2} = 10$$
$$P(X = 3) = 10 \times 0.343 \times 0.09 = 10 \times 0.03087 = 0.3087$$
The probability is approximately 30.9% or about $\frac{31}{100}$.
Summary
- Measures of central tendency describe the “middle” of a data set:
  - Mean ($\bar{x}$): the arithmetic average; sensitive to outliers
  - Median: the middle value when data is ordered; resistant to outliers
  - Mode: the most frequently occurring value
- Measures of spread describe how scattered the data is:
  - Range: maximum minus minimum
  - Variance ($\sigma^2$): average of squared deviations from the mean
  - Standard deviation ($\sigma$): square root of variance; in original units
- The normal distribution is a bell-shaped curve where about 68% of data falls within 1 SD, 95% within 2 SDs, and 99.7% within 3 SDs of the mean.
- A Z-score measures how many standard deviations a value is from the mean: $Z = \frac{x - \mu}{\sigma}$
- Probability quantifies likelihood, always between 0 and 1.
  - Complement rule: $P(\text{not } A) = 1 - P(A)$
  - For independent events: $P(A \text{ and } B) = P(A) \times P(B)$
  - Addition rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
- Permutations count arrangements where order matters: $_nP_r = \frac{n!}{(n-r)!}$
- Combinations count selections where order does not matter: $_nC_r = \frac{n!}{r!(n-r)!}$
- Binomial probability gives the chance of exactly $k$ successes in $n$ trials: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
These tools give you the power to summarize data meaningfully, understand variation, and calculate the likelihood of uncertain events. Whether you are analyzing test scores, evaluating risks, or just trying to figure out your chances in a game, probability and statistics provide the mathematical framework for reasoning about uncertainty.