30 Cards in this Set
- Front
- Back
First things you should ask yourself before solving a statistics problem... |
1. Who - the individuals, how many. 2. What - the variables, exact definitions of them, and the unit of measurement. 3. When 4. Where 5. Why - what purpose does the data have? |
|
Categorical variable |
Places an individual into one of several groups, or categories. |
|
Quantitative variable |
Takes numerical values for which arithmetic operations such as adding and averaging make sense (usually recorded with a unit of measurement). |
|
Distribution |
The distribution of a variable tells us what values it takes and how often it takes these values |
|
Distribution of a Categorical Variable |
Lists the categories and gives either the count or the percent of individuals who fall into each category. |
|
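As a rough illustration of the card above, the sketch below tabulates the count and percent for each category. The `majors` list and the function name `categorical_distribution` are made up for this example.

```python
from collections import Counter

# Hypothetical categorical data: one category label per individual.
majors = ["Biology", "Math", "Biology", "History", "Math", "Biology", "Art", "Math"]

def categorical_distribution(values):
    """Return {category: (count, percent)} for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}

dist = categorical_distribution(majors)
for category, (count, percent) in sorted(dist.items()):
    print(f"{category}: count={count}, percent={percent:.1f}%")
```

The percents add up to 100 because every individual falls into exactly one category.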
Variability |
Spread |
|
Outlier |
An individual value that falls outside the overall pattern of the distribution. |
|
One way to describe the center of a distribution is by its... |
Midpoint (cross off observations in pairs, the smallest and the largest, until the middle value is left) |
|
Skewed Right |
Look at notes |
|
Skewed Left |
Look at notes |
|
Time plot |
A time plot of a variable plots each observation against the time at which it was measured. *Always put time on the horizontal scale of your plot and the variable you are measuring on the vertical scale. |
|
Trend |
The overall pattern in a time plot: a long-term upward or downward movement over time. |
|
Cross-sectional data |
-A histogram displays |
|
Most common measure of center |
Mean |
|
Mean |
To find the mean of a set of observations, add their values and divide by the number of observations. |
|
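A minimal sketch of the rule on the card above (add the values, divide by the number of observations). The `scores` data and the name `sample_mean` are illustrative only.

```python
def sample_mean(observations):
    """Add the values and divide by the number of observations."""
    return sum(observations) / len(observations)

# Hypothetical observations used only for illustration.
scores = [4, 8, 15, 16, 23, 42]
print(sample_mean(scores))  # 108 / 6 = 18.0
```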
Median |
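The formal version of the midpoint: half the observations are smaller and the other half are larger. Arrange the observations from smallest to largest; if the count is odd, the median is the observation in the very middle, and if it is even, the median is the mean of the two middle observations. |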
|
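The median card can be sketched in a few lines. The function name `find_median` is hypothetical, and the even-count branch uses the usual convention of averaging the two middle observations.

```python
def find_median(observations):
    """Return the midpoint of the observations after sorting them."""
    ordered = sorted(observations)      # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd count: the very middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(find_median([5, 1, 3]))      # 3
print(find_median([4, 1, 3, 2]))   # 2.5
```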
Quartiles |
The quartiles Q1 and Q3 mark out the middle half of the data. |
|
First quartile |
Q1: about 25% of the observations fall at or below it. |
|
Second quartile |
The median: about 50% of the observations fall at or below it. |
|
Third quartile |
Q3: about 75% of the observations fall at or below it. |
|
To calculate the quartiles |
1. Arrange the observations in increasing order and locate the median, M, in the ordered list of observations. 2. The first quartile, Q1, is the median of the observations whose position in the ordered list is to the left of the location of the overall median. 3. The third quartile, Q3, is the median of the observations whose position in the ordered list is to the right of the location of the overall median. |
|
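The three steps above can be sketched as follows. This follows the card's rule (the overall median is left out of both halves when the count is odd), which can give slightly different answers from the quantile methods built into some software. The name `quartiles` is hypothetical, and at least two observations are assumed.

```python
from statistics import median

def quartiles(observations):
    """Return (Q1, M, Q3) using the 'median of each half' rule."""
    ordered = sorted(observations)       # step 1: increasing order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]               # positions left of the median location
    upper = ordered[half + n % 2:]       # positions right of the median location
    return median(lower), median(ordered), median(upper)

print(quartiles([1, 2, 3, 4, 5, 6, 7]))  # (2, 4, 6)
print(quartiles([1, 2, 3, 4, 5, 6]))     # (2, 3.5, 5)
```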
Five number summary |
Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest. |
|
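A small sketch of the five-number summary, assuming the quartiles are found as the medians of the lower and upper halves of the ordered list. The function name and the data are made up for illustration.

```python
from statistics import median

def five_number_summary(observations):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(observations)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])             # median of the lower half
    q3 = median(ordered[half + n % 2:])     # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]

# Hypothetical observations.
print(five_number_summary([13, 2, 7, 5, 11, 3, 17]))  # (2, 3, 7, 13, 17)
```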
Box Plot |
A box plot is a graph of the five-number summary. - A central box spans the quartiles Q1 and Q3. - A line in the box marks the median M. - Lines extend from the box out to the smallest and largest observations. (Best used for side-by-side comparison.) *Outliers are marked as dots. |
|
the Interquartile Range (IQR) |
IQR = Q3 - Q1 *Used in the rule of thumb for identifying outliers. |
|
To identify outliers... (the 1.5 x IQR rule) |
X = Q1 - (1.5 x IQR) and Y = Q3 + (1.5 x IQR). An observation is a suspected outlier if it falls more than 1.5 x IQR above the third quartile (above Y) or below the first quartile (below X). |
|
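The 1.5 x IQR rule can be sketched like this. Q1 and Q3 are passed in as already-computed values; the function name, the variable names, and the data are hypothetical.

```python
def suspected_outliers(observations, q1, q3):
    """Flag values more than 1.5 x IQR below Q1 or above Q3."""
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr   # X on the card
    upper_limit = q3 + 1.5 * iqr   # Y on the card
    return [x for x in observations if x < lower_limit or x > upper_limit]

# Hypothetical data with Q1 = 3 and Q3 = 13, so IQR = 10 and the limits are -12 and 28.
data = [2, 3, 5, 7, 11, 13, 40]
print(suspected_outliers(data, q1=3, q3=13))  # [40]
```

A flagged value is only a suspected outlier; the rule is a rule of thumb, not proof that the value is an error.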
Standard deviation (s) and Variance (s^2) |
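s^2 = [(x1 - x-bar)^2 + (x2 - x-bar)^2 + ... + (xn - x-bar)^2] / (n - 1), or more compactly, s^2 = [1 / (n - 1)] x (sum of (xi - x-bar)^2) |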
|
Standard Deviation (s) is the square root of the variance (s^2) |
s = sqrt(s^2) = sqrt([1 / (n - 1)] x (sum of (xi - x-bar)^2)) |
|
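A minimal sketch of the variance and standard deviation, assuming the usual sample definitions that divide the sum of squared deviations by n - 1. The function names and the data are illustrative, and at least two observations are assumed.

```python
import math

def sample_variance(observations):
    """s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(observations)
    x_bar = sum(observations) / n
    return sum((x - x_bar) ** 2 for x in observations) / (n - 1)

def sample_std(observations):
    """s: the square root of the variance."""
    return math.sqrt(sample_variance(observations))

# Hypothetical data: mean is 3, squared deviations sum to 10, so s^2 = 10 / 4.
data = [1, 2, 3, 4, 5]
print(sample_variance(data))       # 2.5
print(sample_std(data))
print(sample_variance([3, 3, 3]))  # 0.0 when every value is the same
```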
Degrees of freedom |
The number n - 1 is called the degrees of freedom of the variance or standard deviation. More generally, it is the number of values in the final calculation of a statistic that are free to vary. (In mechanics, the term means the number of independent ways by which a dynamic system can move without violating any constraint imposed on it.) |
|
s = 0 |
No variability. This happens only when all observations have the same value. |
|
So far, we have a choice between two descriptions of the center and variability of a distribution.... |
The five-number summary, or the mean (x-bar) and the standard deviation (s). *The mean and standard deviation are both sensitive to extreme observations; the median and quartiles in the five-number summary are resistant. |