68 Cards in this Set
- Front
- Back
Values of one variable tend to occur with certain values of another variable; detected when the conditional distributions differ from the marginal distribution and from each other.
|
association
|
|
A condition where the mean of the statistic values differs from the parameter that the statistic estimates.
|
bias
|
|
The name of the statement telling us that the sampling distribution of x-bar is approximately normal whenever the sample is large and random.
|
Central Limit Theorem
|
|
The distribution of the values in a single row (or column) of a two-way table.
|
conditional distribution
|
|
a statistical tool for monitoring the input or output of a process
|
control chart
|
|
mu +/- 3(sigma/root(n))
|
control limits
|
|
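The control-limit formula can be sketched in a few lines of Python. This is a minimal illustration, assuming mu and sigma are known; the function names and the process values (mean 10, standard deviation 2, samples of size 4) are made up for the example.

```python
import math


def control_limits(mu, sigma, n):
    """Return (lower, upper) control limits: mu +/- 3 * sigma / sqrt(n)."""
    half_width = 3 * sigma / math.sqrt(n)
    return mu - half_width, mu + half_width


def out_of_control_points(sample_means, mu, sigma, n):
    """Return the indices of sample means that fall outside the control limits."""
    lower, upper = control_limits(mu, sigma, n)
    return [i for i, xbar in enumerate(sample_means) if xbar < lower or xbar > upper]


# Hypothetical process: mu = 10, sigma = 2, samples of size n = 4.
lower, upper = control_limits(mu=10.0, sigma=2.0, n=4)
flagged = out_of_control_points([9.5, 10.8, 13.4, 10.1], mu=10.0, sigma=2.0, n=4)
print(lower, upper, flagged)
```

With these made-up numbers the limits are 7 and 13, so only the third sample mean (13.4) is flagged.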
a measure of the strength of the linear relationship between two quantitative variables
|
correlation coefficient
|
|
events that cannot occur simultaneously
|
disjoint events
|
|
A list of the possible values of a variable together with the frequency of each value (probabilities can be substituted for frequencies).
|
distribution of a variable
|
|
A single outcome or a combination of outcomes from a random phenomenon.
|
event
|
|
Predicting a Y value using a value of X that is outside the range of X values used to obtain the regression equation. This prediction could be very far off.
|
extrapolation
|
|
Results from statistical analyses performed on non-random samples.
|
garbage
|
|
Using the value of a sample statistic to draw conclusions about the population parameter.
|
inference
|
|
An observation that substantially alters the values of the slope and y-intercept in the regression equation when it is included in the computations.
|
influential observation
|
|
The fact that the average of observed values in a sample will get closer and closer to mu as the sample size increases.
|
law of large numbers
|
|
The basis for hypothesis testing and confidence interval estimation
|
laws of probability
|
|
A method for finding the equation of a line that minimizes the sum of squared residuals
|
least squares
|
|
The line with the smallest sum of squared residuals
|
least squares regression line
|
|
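A minimal sketch of the least squares idea, using the standard closed-form slope and intercept for a single explanatory variable. The function names and the five data points are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Square each residual (observed y minus predicted y) and add them up."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

slope, intercept = least_squares_line(xs, ys)
sse = sum_squared_residuals(xs, ys, slope, intercept)

# Any other line should give a larger sum of squared residuals.
sse_other = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
print(slope, intercept, sse, sse_other)
```

For this data the fitted line has slope of about 0.6 and intercept of about 2.2, and nudging the slope away from that value increases the sum of squared residuals.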
a variable that is not measured but explains the association between two variables that are measured.
|
lurking variable
|
|
The distribution of the values in the "total" row (or the "total" column) of a two-way table.
|
marginal distribution
|
|
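To make the marginal and conditional distributions concrete, here is a small sketch that computes both from a two-way table of counts. The table, its labels, and the function names are hypothetical.

```python
# Hypothetical two-way table of counts: rows are groups, columns are outcomes.
table = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}


def conditional_distributions(table):
    """Distribution of the column variable within each single row."""
    result = {}
    for row, counts in table.items():
        row_total = sum(counts.values())
        result[row] = {col: count / row_total for col, count in counts.items()}
    return result


def marginal_distribution(table):
    """Distribution of the column variable in the 'total' row."""
    totals = {}
    for counts in table.values():
        for col, count in counts.items():
            totals[col] = totals.get(col, 0) + count
    grand_total = sum(totals.values())
    return {col: total / grand_total for col, total in totals.items()}


conditionals = conditional_distributions(table)
marginal = marginal_distribution(table)
print(conditionals)
print(marginal)
```

Here the conditional proportions of "yes" (0.6 and 0.2) differ from the marginal proportion (0.4) and from each other, which is the pattern that signals an association.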
the mean of all the sample means from all possible samples of size n from a population; equals mu
|
mean of the sampling distribution of x bar
|
|
the mean of the population
|
mu
|
|
a condition where values of one variable occur independently of values of another variable; detected when the conditional distributions of a two-way table equal the marginal distribution (and each other)
|
no association
|
|
the difference between the observed statistic and the claimed parameter value
|
observed effect
|
|
a process with at least one sample mean outside the control limits (more than 3 standard deviations of x-bar from mu)
|
out-of-control process
|
|
An observation that falls outside the overall pattern of the data set.
|
outlier
|
|
A characteristic of a population that is usually unknown; this could be the mean, median, proportion, or standard deviation computed on all the data from the population. A parameter does not have variability.
|
parameter
|
|
parameter symbols
|
mu, sigma, p (population mean, population standard deviation, and population proportion, respectively)
|
|
High values of one variable tend to associate with high values of another variable.
|
positive association
|
|
A measure of the proportion of times an outcome occurs in a very long series of repetitions that gives us an indication of the likelihood of the outcome.
|
probability of an outcome
|
|
Sequence of operations used in production, manufacturing, etc
|
process
|
|
A process whose inputs and outputs exhibit natural variation when observed over time
|
process in statistical control
|
|
A chart plotting the means x-bar of regular samples of size n against time; this chart is used to assess whether the process is in control
|
quality control chart
|
|
The type of data required for regression analysis
|
quantitative bivariate
|
|
the symbol for correlation coefficient
|
r
|
|
the percentage of total variation in the response variable, Y, that is explained by the regression equation; in other words, the percentage of total variation in the response variable, Y, that is explained by the explanatory variable, X
|
r^2
|
|
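For the least squares regression line, this percentage can be written in terms of the total variation in Y and the sum of squared residuals. The symbol SST for total variation is an assumption introduced here; SSE is the sum of squared residuals.

```latex
r^2 = \frac{\text{SST} - \text{SSE}}{\text{SST}} = 1 - \frac{\text{SSE}}{\text{SST}},
\qquad
\text{SST} = \sum_i (y_i - \bar{y})^2,
\qquad
\text{SSE} = \sum_i (y_i - \hat{y}_i)^2
```

SST minus SSE is the explained variation, so r^2 close to 1 means the residuals account for very little of the variation in Y.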
A phenomenon that describes the uncertainty of individual outcomes but gives a regular distribution of the outcomes in the long run
|
random
|
|
A formula for a line that models a linear relationship between two quantitative variables.
|
regression equation
|
|
The observed y minus the predicted y; denoted y - y-hat
|
residual
|
|
A diagnostic plot of the residuals versus the explanatory variable, used to assess how well the regression line fits the data. Complete scatter in a shoe-box pattern is good; a megaphone pattern denotes unequal variance in the Y's across levels of X; curvature in the form of a smile or a frown denotes that a linear model is not the best choice for the data.
|
residual plot
|
|
The random variable of the sampling distribution of x bar
|
sample mean
|
|
The list of all possible outcomes of a random phenomenon
|
sample space
|
|
The distribution of a statistic; a list of all the possible values of a statistic together with the frequency (or probability) of each value.
|
sampling distribution
|
|
A list of all possible values for x-bar together with the frequency (or probability) of each value; in other words, the distribution of all x-bar's from all possible samples.
|
sampling distribution of x-bar
|
|
The variability of sample results from one sample to the next; something we must measure in order to effectively do inference.
|
sampling variability
|
|
A two dimensional plot used to examine strength of relationship between two variables as well as direction and type of relationship.
|
scatterplot
|
|
a condition where the percentages reverse when a third (lurking) variable is ignored; in other words, a condition leading to misinterpretation of the direction of association between two variables caused by ignoring a third variable that is associated with both of the reported variables.
|
Simpson's paradox
|
|
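A small numerical sketch of the reversal. The counts are made up purely to show the effect, and the option and subgroup names are hypothetical: option A has the higher success rate inside each subgroup, yet option B looks better once the subgroups are combined.

```python
# Made-up counts: (successes, trials) for two options within two subgroups.
data = {
    "subgroup 1": {"A": (8, 10), "B": (70, 100)},
    "subgroup 2": {"A": (20, 100), "B": (1, 10)},
}


def within_group_rates(data):
    """Success rate of each option inside each subgroup."""
    return {
        group: {name: s / t for name, (s, t) in options.items()}
        for group, options in data.items()
    }


def combined_rates(data):
    """Success rate of each option when the subgroup variable is ignored."""
    totals = {}
    for options in data.values():
        for name, (s, t) in options.items():
            prev_s, prev_t = totals.get(name, (0, 0))
            totals[name] = (prev_s + s, prev_t + t)
    return {name: s / t for name, (s, t) in totals.items()}


by_group = within_group_rates(data)
overall = combined_rates(data)
print(by_group)
print(overall)
```

The reversal happens because the subgroup variable is associated with both the option chosen and the outcome: most of A's trials fall in the subgroup with low success rates.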
Using random numbers to imitate chance behavior
|
simulation
|
|
A measure of the average change in the response variable for every one unit increase in the explanatory or independent variable.
|
slope
|
|
A measure of the variability of the data
|
standard deviation
|
|
standard deviation of x-bar
|
a measure of the variability of the values of the statistic x-bar about mu; a measure of the variability of the sampling distribution of x-bar; in other words, the average amount that the statistic, x-bar, deviates from its associated parameter. Computed as sigma/root(n).
|
|
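The sigma/root(n) formula can be checked with a short simulation that uses random numbers to imitate repeated sampling. This is a sketch under stated assumptions: the population is taken to be normal with mu = 50 and sigma = 10, and the function name, sample size, repetition count, and seed are all arbitrary choices for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, repetitions, seed=0):
    """Draw many random samples of size n and return the list of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sum(sample) / n)
    return means


# Assumed population values, chosen only for the example.
mu, sigma, n = 50.0, 10.0, 25

sample_means = simulate_sample_means(mu, sigma, n, repetitions=5000)
simulated_sd = statistics.stdev(sample_means)
theoretical_sd = sigma / math.sqrt(n)
mean_of_means = statistics.fmean(sample_means)
print(mean_of_means, simulated_sd, theoretical_sd)
```

The mean of the simulated x-bar values should land near mu, and their standard deviation should land near sigma/root(n), which is 2 for these values.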
A number computed from sample data (without any knowledge of the value of a parameter) used to estimate the value of the parameter.
|
statistic
|
|
x-bar, s, p-hat
|
statistic symbols
|
|
A procedure used to check a process at regular intervals to detect problems and correct them before they become serious
|
statistical process control
|
|
Results of a study that differ too much from what we would expect under randomization to be attributed to chance variation; the difference between the observed statistic and the claimed parameter value is too large to be due to chance. Results are classified as statistically significant when P-value < alpha.
|
statistically significant
|
|
the residuals are squared and added; denoted SSE
|
sum of squared residuals (or error)
|
|
Procedure used to assess the evidence against a claim (hypothesis) about the value of a parameter
|
test of significance
|
|
A number that summarizes the data for a test of significance, usually used to obtain P-value.
|
test statistic
|
|
The sum of the squared deviations of the Y observations about their mean, Y-bar.
|
total variation in Y
|
|
A table containing counts for two categorical variables. It has r rows and c columns.
|
two-way table
|
|
unbiased
|
a condition where the mean of the statistic values equals the parameter that the statistic estimates.
|
|
The sum of squared residuals
|
unexplained variation
|
|
the symbol for the explanatory variable
|
X
|
|
A plot of sample means over time used to assess whether a process is in control
|
x-bar chart
|
|
The symbol for response variable
|
Y
|
|
the symbol for predicted y
|
y-hat
|
|
The z-value obtained from table C corresponding to a desired level of confidence; used in computing a confidence interval
|
z
|
|
z-score
|
a measure of the number of standard deviations of a value or observation from the mean
|
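As a quick worked example, a z-score is the distance from the mean divided by the standard deviation. The function name and the numbers below are hypothetical.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    if standard_deviation <= 0:
        raise ValueError("standard_deviation must be positive")
    return (value - mean) / standard_deviation


# Hypothetical values: an observation of 130 when the mean is 100 and the sd is 15.
z = z_score(value=130.0, mean=100.0, standard_deviation=15.0)
print(z)
```

Here the observation is 30 units above the mean, which is 2 standard deviations, so z is 2.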