60 Cards in this Set
- Front
- Back
What is the central limit theorem?
|
The central limit theorem: 1. describes the relationship between the mean of the sampling distribution and the population mean (the two are equal); 2. shows that the standard deviation of the sampling distribution (the standard error) decreases in proportion to the square root of the sample size; 3. states that the sampling distribution tends toward a normal distribution if the sample is of sufficient size.
|
|
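The three points of the central limit theorem can be checked with a small simulation. This is only a sketch: the exponential population, the sample sizes, the number of samples, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(n, n_samples=4000, seed=0):
    """Draw n_samples samples of size n from a skewed (exponential) population
    and return the mean of each sample (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]

# The exponential(1) population has mean 1 and SD 1, and is clearly not normal.
for n in (4, 25, 100):
    means = sample_means(n)
    observed_se = statistics.stdev(means)   # SD of the sampling distribution
    predicted_se = 1.0 / n ** 0.5           # population SD / sqrt(n)
    print(n, round(statistics.fmean(means), 3),
          round(observed_se, 3), round(predicted_se, 3))
```

The mean of the sample means should stay close to the population mean, while their spread shrinks roughly as 1 over the square root of n.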
What is sampling distribution?
|
A sampling distribution is the distribution of a sample statistic that would be obtained if all possible samples of the same size were drawn from a given population.
|
|
Confidence interval?
|
A range around a sample statistic within which the population mean (or other parameter) is likely to fall. With a 95% confidence interval, about 95% of the intervals constructed this way across repeated samples will contain the population parameter.
|
|
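A confidence interval for a mean can be sketched as the sample mean plus or minus a critical value times the standard error. The example below assumes a normal (z) critical value, which is only an approximation for small samples where a t value would normally be used. The data and the function name `mean_confidence_interval` are made up for illustration.

```python
import statistics

def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the mean using a z critical value."""
    n = len(data)
    mean = statistics.fmean(data)
    standard_error = statistics.stdev(data) / n ** 0.5
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
low, high = mean_confidence_interval(scores)
print(round(low, 2), round(high, 2))
```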
o Sample Size Issues/Power
|
Statistical power is the ability of a statistical test to detect relationships between variables.
Small sample sizes effectively reduce power and may make it very difficult to achieve statistical significance at even the .05 level |
|
Power is a result of 4 factors:
|
Power is a direct function of four variables:
a. alpha level (e.g., .05)
b. sample size
c. effect size (the strength of the relationship between two or more variables)
d. type of statistical test |
|
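How alpha, sample size, and effect size combine can be seen in a rough power calculation. This sketch fixes the fourth factor, the type of test, to a two-sided one-sample z-test with a known SD, which is an assumption made to keep the formula simple. The function name `z_test_power` is hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true mean difference in SD units.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # how far the test statistic is shifted
    # Probability that the statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

for n in (10, 32, 100):
    print(n, round(z_test_power(0.5, n), 3))
```

With the effect size held constant, power rises as the sample grows. With an effect size of zero, the function returns alpha itself.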
o “Statistical Significance”, p-values
|
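The p-value is the probability of obtaining a result at least as extreme as the one observed if chance alone were operating (that is, if the null hypothesis were true). A result is called statistically significant when p falls below the chosen alpha level, which suggests the difference reflects the ‘treatment’ rather than chance. This reasoning assumes the sample chosen before treatment is representative of the total population.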
|
|
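One way to make the role of chance concrete is an exact permutation test. It asks how often a difference at least as large as the observed one would arise if group labels were assigned at random. This is a sketch of one possible approach, not the method the cards assume. The data and the name `exact_permutation_p` are hypothetical.

```python
from itertools import combinations
from statistics import fmean

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(fmean(group_a) - fmean(group_b))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling n_a of the pooled scores as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(fmean(a) - fmean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

treated = [6, 7, 8, 9, 10]  # hypothetical scores
control = [1, 2, 3, 4, 5]
print(exact_permutation_p(treated, control))
```

Full enumeration is only practical for small groups, since the number of relabellings grows very quickly.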
o Parametric vs. Non-Parametric
|
Parametric: tests that assume normally distributed populations.
Non-parametric: tests used when the samples do not meet the requirements for parametric tests. |
|
o Assumptions (HOV, NORM, independence) & related tests
|
HOV = homogeneity of variance, or homoscedasticity: the variance of the groups is the same.
NORM = normality of distribution: the standard deviation and the PSD (pseudo standard deviation) are about the same in a normal distribution; check the skew and kurtosis of the distribution.
Independence = independence of observations: the selection of one item has nothing to do with the selection of another.
**If these assumptions are violated, the level of significance and the F test will be seriously compromised. |
|
Intraclass correlation coefficient
|
Lack of independence is tested by the intraclass correlation coefficient.
**Stevens (1992) says this is one of the most important assumptions to meet, as violating it can seriously affect power and lead to false results. |
|
Effect Size
|
Effect size is a function of the actual size of the relationship: the effect that one variable has on one or more others.
Cohen suggested that effect size (Cohen's d) be based on the difference between the experimental and control group means, expressed in SD units: d = (M(exp) – M(control)) / SD |
|
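Cohen's d can be sketched in a few lines. The card does not say which SD to divide by, so this example assumes the pooled within-group SD, which is one common choice. The scores and the function name `cohens_d` are made up for illustration.

```python
import statistics

def cohens_d(experimental, control):
    """Mean difference in pooled-SD units (pooled SD is an assumption here)."""
    n_e, n_c = len(experimental), len(control)
    var_e = statistics.variance(experimental)  # sample variance, n - 1 in the denominator
    var_c = statistics.variance(control)
    pooled_sd = (((n_e - 1) * var_e + (n_c - 1) * var_c) / (n_e + n_c - 2)) ** 0.5
    return (statistics.fmean(experimental) - statistics.fmean(control)) / pooled_sd

exp_scores = [6, 7, 8, 9, 10]   # hypothetical experimental group
ctrl_scores = [4, 5, 6, 7, 8]   # hypothetical control group
print(round(cohens_d(exp_scores, ctrl_scores), 3))
```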
Variance Partitioning
|
Dividing up the variance into the different sources of variance
|
|
F-Ratio
|
F = between-group variability divided by within-group variability
|
|
Sums of Squares
|
Total SS= within group SS + between group SS
|
|
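The partition of the total sum of squares and the resulting F-ratio can be worked through on a tiny data set. This is a minimal sketch for a one-way design. The three groups and the function name `partition_variance` are invented for illustration.

```python
from statistics import fmean

def partition_variance(groups):
    """Split total SS into between- and within-group SS and compute F."""
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between: each group mean's squared distance from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within (error): each score's squared distance from its own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    # Mean squares are SS divided by df; F is their ratio.
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "f_ratio": f_ratio}

result = partition_variance([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(result)
```

In the output, the between and within sums of squares add up to the total.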
o Types of regression (sequential, standard, stepwise)
|
1. Standard regression (uniqueness regression): looks at each variable as if it were entered last, so no variable takes the shared variance.
2. Hierarchical/sequential regression: you specify the order of entry. If you know your theory, you would use hierarchical or standard regression.
3. Stepwise regression (data-driven): the order of entry is determined by the data. Not recommended; can be used to snoop data. |
|
o Multicollinearity
|
High correlation between independent variables; this can be a problem
|
|
discuss how score and category differ?
|
score has been produced using an interval scale of measurement
classification results in nominal (category) scaling |
|
what is meant by deviation scores?
|
how the scores vary from the mean
|
|
Sum of squares
|
The sum of the squared deviation scores; it measures the total variability of the scores/data (dividing it by the df gives the variance)
|
|
degrees of freedom
|
The number of scores free to take on any value in the range; one score always has no freedom
|
|
standard deviation
|
Take the square root of the sum of squares divided by the df.
Example: if SS = 10 and df = 4, then 10/4 = 2.5, and the square root of 2.5 is 1.58 = SD |
|
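The chain from deviation scores to standard deviation can be traced step by step. The scores below were picked so that the sum of squares is 10 with 4 degrees of freedom, matching the card's arithmetic. The function name `describe_spread` is hypothetical.

```python
def describe_spread(scores):
    """Return deviation scores, sum of squares, df, variance, and SD."""
    mean = sum(scores) / len(scores)
    deviations = [x - mean for x in scores]          # how each score varies from the mean
    sum_of_squares = sum(d ** 2 for d in deviations)
    df = len(scores) - 1                             # one score has no freedom
    variance = sum_of_squares / df
    sd = variance ** 0.5
    return deviations, sum_of_squares, df, variance, sd

deviations, ss, df, variance, sd = describe_spread([1, 2, 3, 4, 5])
print(deviations, ss, df, variance, round(sd, 2))
```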
normal distribution
|
Most scores occur in the middle with decreasing frequencies for high and low
|
|
discuss differences in within-group and between group differences
|
Within-group: variance among individuals within a group. Between-group: variance between two or more groups.
|
|
what is the other term for within group
|
error
|
|
variances are also called
|
mean squares
|
|
partitioning refers to?
|
dividing up the total variability into its component
NOTE: Sum of squares add up to the total SS |
|
F-ratio
|
Dividing the between variance by the within (error) variance. The more it rises above 1, the more we can say there are between-group differences.
|
|
eta(squared)
|
dividing the between SS by the total SS, then multiplying by 100 to give the % of total variability
|
|
Wilks's lambda
|
ratio of within SS to total SS: the proportion of unexplained variance
eta squared and Wilks's lambda are mirror images of each other (in this univariate case they sum to 1) |
|
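The mirror-image relationship between eta squared and Wilks's lambda can be shown with two one-line functions. This sketch covers only the univariate, one-way case described on the cards. The sums of squares are invented numbers and the function names are hypothetical.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variability explained by group membership."""
    return ss_between / ss_total

def wilks_lambda(ss_within, ss_total):
    """Proportion of total variability left unexplained (univariate case)."""
    return ss_within / ss_total

# Hypothetical sums of squares: total SS = between SS + within SS.
ss_between, ss_within = 54.0, 6.0
ss_total = ss_between + ss_within

eta2 = eta_squared(ss_between, ss_total)
lam = wilks_lambda(ss_within, ss_total)
print(round(eta2 * 100, 1), round(lam * 100, 1))  # as percentages of total variability
```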
simple regression is used for what?
|
to analyze whether score differences on a dependent variable can be accounted for by scores on an independent variable
|
|
bivariate scatter plot
|
summarizes the relationship between variables
dependent on Y axis and indep. on X axis |
|
Y intercept? on a regression line
|
The point where the regression line intercepts the Y axis
|
|
regression coefficients
|
The slope and the Y intercept are the regression coefficients; together they define the regression line
|
|
regression equation
|
summarizes the relationship between any X and Y variable
Y = (slope × X) + Y intercept |
|
r (squared) or coefficient of determination
|
divide the regression SS by total SS
a direct parallel to eta squared; can be multiplied by 100 to give the % of variability explained |
|
square root of r(squared)
|
Pearson's correlation coefficient or r
can be positive or negative (it takes the sign of the slope) |
|
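The regression equation, r squared, and r can all be computed from the same few sums. This is a minimal least-squares sketch for one independent variable. The data points and the function name `simple_regression` are made up for illustration.

```python
from statistics import fmean

def simple_regression(x, y):
    """Least-squares slope and intercept, plus r squared and Pearson's r."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [slope * a + intercept for a in x]      # Y = (slope * X) + intercept
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    r_squared = ss_regression / ss_total
    # r is the square root of r squared, carrying the sign of the slope.
    r = r_squared ** 0.5 if slope >= 0 else -(r_squared ** 0.5)
    return slope, intercept, r_squared, r

x = [1, 2, 3, 4, 5]   # hypothetical independent variable
y = [2, 4, 5, 4, 5]   # hypothetical dependent variable
print(simple_regression(x, y))
```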
discuss difference in types of information provided by regression and correlation coefficients
|
Regression coefficient (the slope): indicates how the independent variable accounts for group differences in the scores.
Correlation coefficient: indicates how far individual differences can be accounted for. |
|
difference between regression/correlation (in terms of scatterplot)
|
Regression is the slope of the line: the higher the value, the steeper the line.
Correlation is how tightly the data points cluster around the line: the higher the value, the tighter the cluster of points. |
|
categorical scale
|
category- male female
|
|
ordinal scale
|
survey 1-5 least to most
|
|
interval scale
|
the distance between each number is equal, with continuous scoring
|
|
Reliability of measurement refers to
|
how far the data are contaminated with random errors that make them inconsistent
|
|
Validity of measurement refers to
|
how far the data are subject to systematic errors or bias that make them inaccurate
|
|
internal consistency
|
how well elements of the measure operate in concert
|
|
Cronbach's alpha
|
Computed by correlating the item scores for a sample of respondents; a value of .7 is the minimum reassurance of internal consistency, and higher is better
|
|
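Cronbach's alpha is often computed from item variances rather than from the individual correlations. The sketch below assumes that standard variance-based formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The item scores and the function name `cronbach_alpha` are invented for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses from five people to three items.
items = [
    [3, 4, 5, 2, 4],
    [2, 4, 5, 3, 4],
    [3, 5, 4, 2, 5],
]
print(round(cronbach_alpha(items), 2))
```

When every item ranks respondents identically, alpha reaches its maximum of 1. The less the items move together, the lower it falls.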
test-retest reliability values
|
same respondents tested twice- look for consistency of response
|
|
content validity
|
does it test the important facets of the content
|
|
Characteristics of the population can be described with summary numbers that are called___________
|
parameters
|
|
Null hypothesis is either __________ or ______________
|
rejected or retained (often loosely called "accepted")
|
|
The ____states that the mean difference of the groups being tested is zero (refers to parameters)
|
null hypothesis
|
|
the other side of the null is
|
the alternative or research hypothesis
|
|
what is meant by the two-tailed form of the alternative hypothesis?
|
that the direction of the difference is not specified as higher or lower; the mean difference is just NOT zero
|
|
what is meant by the one-tailed form of the alternative hypothesis?
|
we hypothesize not only difference but the direction of the difference
|
|
all of the different possible values of the mean differences and their associated probabilities are referred to as a
|
sampling distribution
|
|
test statistic probabilities only map accurately onto sampling distributions if certain ____________are met
|
assumptions
|
|
p value is called
|
the observed significance level; it is compared against alpha, the significance level chosen in advance
e.g., p < .05 |
|
Risk of rejecting null hypothesis when it is true is called
|
Type I error
|
|
Risk of accepting the null hypothesis when it is false is called
|
Type II error
|
|
the degree of Type II error in an analysis is referred to as
|
beta
|
|
the power of an analysis refers to its capability
|
to avoid Type II error, that is, to detect a relationship or difference that is actually present
|
|
Increasing the sample size reduces
|
Type II error (power increases); the Type I error rate is set by the chosen alpha level and does not depend on sample size
|