74 Cards in this Set
- Front
- Back
If two independent large samples are taken from two populations, the sampling distribution of the difference between the two sample means
|
can be approximated by a normal distribution
|
|
An estimate of the variance of a population based on the combination of two sample results is known as the
|
pooled variance estimate
|
|
The pooled variance is appropriate whenever the two populations
|
are normally distributed and have equal variances
|
|
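A minimal sketch of the pooled variance estimate, assuming two independent samples drawn from populations with equal variances. The function name and the sample data are illustrative only.

```python
from statistics import variance  # sample variance (divides by n - 1)

def pooled_variance(sample1, sample2):
    """Combine two sample variances, weighting each by its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)
    s2_sq = variance(sample2)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
a = [2.0, 4.0, 6.0]            # sample variance 4.0
b = [1.0, 2.0, 3.0, 4.0, 5.0]  # sample variance 2.5
print(pooled_variance(a, b))   # (2*4.0 + 4*2.5) / 6
```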
When each data value in one sample is matched with a corresponding data value in another sample, the samples are known as
|
matched samples
|
|
In an analysis of variance, one estimate of σ^2 is based upon the differences between the treatment means and the
|
overall sample mean
|
|
The F ratio in a completely randomized ANOVA is the ratio of
|
MSTR/MSE
|
|
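The F ratio can be sketched as the between-treatments mean square divided by the within-treatments (error) mean square. The code below is a rough illustration for a completely randomized design; the function name and data are made up for the example.

```python
def anova_f(groups):
    """Return the F ratio: between-treatments mean square over error mean square."""
    k = len(groups)                                   # number of treatments
    n_total = sum(len(g) for g in groups)             # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Sum of squares due to treatments and sum of squares due to error.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    # Each mean square is a sum of squares divided by its degrees of freedom.
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return mstr / mse

# Hypothetical data: three treatments, three observations each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(anova_f(groups))
```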
The variable of interest in an ANOVA procedure is called
|
a factor
|
|
In the ANOVA, treatment refers to
|
different levels of a factor
|
|
An experimental design where the experimental units are randomly assigned to the treatments is known as
|
completely randomized design
|
|
The EPA takes a random sample of 6 readings for each city and finds the mean and variance. The populations are assumed to be normal with unknown but equal variances. The EPA should use a
|
t test for difference in two means with independent data (pooled t test)
|
|
the mean square is the sum of squares divided by
|
its corresponding degrees of freedom
|
|
in factorial designs, the response produced when the treatments of one factor interact with the treatments of another in influencing the response variable is known as
|
interaction
|
|
the number of times each experimental condition is observed in a factorial design is known as
|
replication
|
|
assumptions of analysis of variance do not include
|
equality of means
|
|
when variances of two independent samples are combined and S^2 is computed, the S^2 is referred to as
|
the pooled estimator of σ^2
|
|
the ANOVA procedure is a statistical approach for determining whether or not the means of
|
two or more populations are equal
|
|
in testing the difference between the means of two normally distributed populations using independent random samples, the pooled estimate of the variance is used if
|
population variances are assumed to be equal
|
|
Regression analysis is a statistical procedure for developing a mathematical equation that describes how
|
one dependent and one or more independent variables are related
|
|
A procedure used for finding the equation of a straight line which provides the best approximation for the relationship between the independent and dependent variables is the
|
least squares method
|
|
Application of the least squares method results in values of the y intercept and the slope which minimizes the sum of the squared deviations between the
|
observed values of the dependent variable and the estimated values of the dependent variable
|
|
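As an illustration of the least squares method for a single independent variable, the sketch below computes the slope and intercept from the usual closed-form expressions. The function name and data are assumptions made for the example.

```python
def least_squares(x, y):
    """Return (intercept, slope) minimizing the sum of squared deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b0, b1 = least_squares(x, y)
print(b0, b1)
```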
Larger values of r^2 imply that the observations are more closely grouped about the
|
least squares line
|
|
In a regression model involving more than one independent variable, which of the following tests must be used in order to determine if the relationship between the dependent variable and the set of independent variables is significant?
|
F test
|
|
If the coefficient of determination is equal to 1, then the coefficient of correlation
|
can be -1 or +1
|
|
If the coefficient of determination is a positive value, then the regression equation
|
could have either a positive or a negative slope
|
|
If the coefficient of correlation is 0.8, the percentage of variation in the dependent variable explained by the variation in the independent variable is
|
64%
|
|
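The 64% figure follows from squaring the correlation coefficient to get the coefficient of determination:

```latex
r^2 = (0.8)^2 = 0.64 = 64\%
```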
In regression analysis, if the dependent variable is measured in dollars, the independent variable
|
can be any units
|
|
If the coefficient of correlation is a positive value, then the slope of the regression line
|
must also be positive
|
|
If the coefficient of correlation is a negative value, then the coefficient of determination
|
must be positive
|
|
If all the points of a scatter diagram lie on the least squares regression line, then the coefficient of determination for these variables based on this data is
|
1
|
|
A multiple regression model has
|
more than one independent variable
|
|
Compared to the confidence interval estimate for a particular value of y (in a linear regression model), the interval estimate for an average value of y will be
|
narrower
|
|
A variable that cannot be measured in terms of how much or how many but instead is assigned values to represent categories is called
|
a qualitative variable
|
|
the error term is the difference between an individual value of the dependent variable and the corresponding mean value of the dependent variable...t/f
|
false
|
|
the point estimate of the variance in a regression model is
|
MSE
|
|
what measures the strength of the linear relationship between the dependent and the independent variable
|
correlation coefficient
|
|
what is a violation of one of the major assumptions of the simple regression model?
|
as the value of x increases, the value of the error term also increases
|
|
the relationship between the dependent variable and the independent variable is stronger when the r^2 is __ and the s (standard error) is __
|
higher, lower
|
|
in a regression model, a value of the error term depends upon other values of the error term...t/f
|
false
|
|
when using a multiple regression model, we assume that error terms (residuals) are distributed according to
|
normal distribution
|
|
the multiple coefficient of determination is the __ divided by the total variation
|
explained variation
|
|
the residual is the difference between the observed value of the dependent variable and the predicted value of the dependent variable...t/f
|
true
|
|
in a simple linear regression model, the coefficient of determination not only indicates the strength of the relationship between the independent and dependent variables, but also shows whether the relationship is positive or negative...t/f
|
false
|
|
when using simple linear regression, we would like to use confidence intervals for the __ and prediction intervals for the __ at a given value of x
|
mean y-value, individual y-value
|
|
in a simple linear regression analysis, the correlation coefficient and the slope always have the same sign...t/f
|
true
|
|
what are assumptions of the error terms in a simple linear regression model
|
errors are normally distributed, error terms have a mean of zero and have a constant variance
|
|
if all data points fell on a straight line, SSE would equal __ and r would equal __
|
0, -1 or 1
|
|
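A small check of this card, assuming made-up points that fall exactly on a downward-sloping line: the sum of squared errors comes out as 0 and the correlation coefficient as -1. The function name is illustrative.

```python
from math import sqrt

def sse_and_r(x, y):
    """Fit the least squares line, then return (SSE, correlation coefficient)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed y minus predicted y.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    return sse, r

# Hypothetical points on the line y = 8 - 2x.
sse, r = sse_and_r([1, 2, 3], [6, 4, 2])
print(sse, r)
```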
a t-test is used when testing the significance of an individual independent variable...t/f
|
true
|
|
if it is desired to include marital status in a multiple regression model by using the categories: single, married, separated, divorced, widowed, what will be the effect on the model?
|
four more independent variables will be included
|
|
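One way to see why five categories add four variables: each category except a baseline gets its own 0/1 indicator. The sketch below assumes the first category ("single") is the baseline; that choice and the function name are illustrative.

```python
def dummy_encode(value, categories):
    """Return k - 1 indicator variables; the first category is the baseline."""
    return [1 if value == c else 0 for c in categories[1:]]

categories = ["single", "married", "separated", "divorced", "widowed"]

print(dummy_encode("married", categories))  # one indicator set to 1
print(dummy_encode("single", categories))   # baseline: all indicators 0
```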
A nonparametric method for determining the differences between two populations based on two matched samples where only preference data is required is the
|
sign test
|
|
Statistical methods that require assumptions about the population are known as
|
parametric
|
|
The Spearman rank-correlation coefficient is
|
a correlation measure based on rank-ordered data for two variables
|
|
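A rough sketch of the Spearman rank-correlation coefficient, assuming no tied values so that the shortcut formula 1 - 6Σd²/(n(n² - 1)) applies. The helper names and data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: the two middle items swap rank order.
x = [10, 20, 30, 40]
y = [1, 3, 2, 4]
print(spearman(x, y))
```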
A parameter of the exponential smoothing model which provides the weight given to the most recent time series value in the calculation of the forecast value is known as the
|
smoothing constant
|
|
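The role of the smoothing constant can be sketched as follows, where each new forecast is alpha times the most recent observation plus (1 - alpha) times the previous forecast. Starting the forecasts from the first observation is an assumed convention, and the function name and data are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for periods 2, 3, ..., len(series) + 1."""
    forecasts = [series[0]]  # assumed start: forecast for period 2 is the first value
    for y in series[1:]:
        # alpha weights the most recent observation; the rest goes to the old forecast.
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical series with alpha = 0.5.
print(exponential_smoothing([10, 20, 30], 0.5))
```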
A goodness of fit test is always conducted as an
|
upper tail test
|
|
A statistical test conducted to determine whether to reject or not reject a hypothesized probability distribution for a population is known as a
|
goodness of fit test
|
|
The time series component, which reflects a regular, multi-year pattern of being above and below the trend line is
|
cyclical
|
|
The time series component that reflects variability due to natural disasters is called
|
irregular
|
|
the smoothing constant is a number that determines how much weight is attached to each observation...t/f
|
true
|
|
the sampling distribution for a goodness of fit test is
|
the chi-square distribution
|
|
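A minimal sketch of the goodness of fit test statistic, which sums (observed - expected)² / expected over the categories. The function name, counts, and hypothesized probabilities are made up for illustration; comparing the statistic with a chi-square critical value is left out.

```python
def chi_square_statistic(observed, probabilities):
    """Chi-square statistic for observed counts against hypothesized probabilities."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for three categories, tested against 25% / 50% / 25%.
observed = [30, 50, 20]
probabilities = [0.25, 0.5, 0.25]
print(chi_square_statistic(observed, probabilities))
```

Large values of the statistic indicate a poor fit, which is why the rejection region lies in the upper tail.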
a restaurant has been experiencing higher sales during the weekends compared to the weekdays. Daily restaurant sales patterns for this restaurant over a week are an example of what component of a time series
|
seasonal
|
|
the time series component that reflects variability during a single year is called
|
seasonal
|
|
one use of the chi-square goodness of fit test is to determine if specified probabilities in the null hypothesis are correct...t/f
|
true
|
|
in a contingency table, when all the expected frequencies equal the observed frequencies the calculated chi-squared statistic equals 0...t/f
|
true
|
|
a group of observations measured at successive time intervals is known as
|
a time series
|
|
which nonparametric method requires that we carry out a paired difference experiment
|
Wilcoxon signed rank test
|
|
one measure of the accuracy of a forecasting model is
|
the mean square error
|
|
when we carry out a chi-square test of independence, the expected frequencies are based on the null hypothesis...t/f
|
true
|
|
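Under the null hypothesis of independence, each expected frequency is (row total × column total) / grand total. The sketch below illustrates this for a small table of made-up counts; the function name is hypothetical.

```python
def expected_frequencies(table):
    """Expected counts for a contingency table under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 table of observed counts.
table = [[10, 20],
         [30, 40]]
for row in expected_frequencies(table):
    print(row)
```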
exponential smoothing is a forecasting method that applies equal weights to the time series observations...t/f
|
false
|
|
when deseasonalizing a time series observation the actual time series observation is divided by its seasonal factor...t/f
|
true
|
|
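Deseasonalizing can be sketched as a simple division, assuming a multiplicative model in which a seasonal factor above 1 marks a period that runs above average. The function name and numbers are illustrative only.

```python
def deseasonalize(observations, seasonal_factors):
    """Divide each observation by the seasonal factor for its period."""
    return [y / s for y, s in zip(observations, seasonal_factors)]

# Hypothetical quarterly sales and seasonal factors.
sales = [120.0, 90.0, 80.0, 110.0]
factors = [1.2, 0.9, 0.8, 1.1]
print(deseasonalize(sales, factors))
```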
statistical methods that generally require very few, if any, assumptions about the population distribution are known as
|
nonparametric
|
|
the time series component that reflects gradual variability over a long time period is called
|
a trend
|
|
if data for a time series analysis is collected on an annual basis only, which component may be ignored
|
seasonal
|
|
when using the chi-square goodness of fit test, if the value of the chi-square statistic is large enough, we reject the null hypothesis...t/f
|
true
|
|
the level of measurement that is simply a label for the purpose of identifying an item is
|
nominal measurement
|
|
statistical methods that generally require the assumptions that population distributions are normal are
|
parametric
|