76 Cards in this Set

  • Front
  • Back

Statistics

Is a branch of mathematics used to summarize, analyze, and interpret what we observe, so that we can make sense of our observations. It is commonly applied to evaluate scientific observations.

Descriptive statistics

Scientists organize and summarize information so that it is meaningful to those who read about the observations made in a study. In short: using statistics to describe.

Inferential statistics

Scientists use information to answer a question or make an actionable decision. In short: using statistics to infer.

What were the words of Mark Twain about statistics and what did he mean by them?

He said "There are lies, damned lies, and statistics," meaning that statistics can be deceiving, and so can the way they are interpreted.

Descriptive statistics

Applying statistics to organize and summarize information.

Inferential statistics

Applying statistics to interpret the meaning of information.

Data

Are a set of scores, measurements, or observations that are typically numeric. A datum (singular) is a single measurement or observation usually referred to as a score or raw score.

What is the general structure for making an observation?

Ask a question.


Set up a research study.


Measure behaviour.


Evaluate findings.

What is the purpose and advantage of tables and graphs?

Tables and graphs serve a similar purpose to descriptive statistics, which is to summarize large and small sets of data. Their advantage is that they can clarify findings in a research study.

The sample

A portion of all members of a group.

The population

All members of a group.

Why do scientists choose to select a sample?

Most scientists have limited access to the phenomena they study, especially behavioural phenomena. Hence, researchers select a portion of all members of a group, mostly because they do not have access to all members of that group.

Why do researchers require statistical procedures like inferential statistics?

To infer that observations made with a sample are also likely to be observed in the larger population from which the sample was selected.

Population parameter

A characteristic (usually numeric) that describes a population.

Sample statistic

A characteristic that describes a sample. A sample statistic is measured to estimate the population parameter.
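
A minimal Python sketch of a sample statistic estimating a population parameter. The population is simulated with made-up scores (an assumed mean of 100 and standard deviation of 15), and the sample size of 50 is arbitrary:

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 scores (made-up data).
population = [random.gauss(100, 15) for _ in range(1000)]
population_mean = statistics.mean(population)  # population parameter

# A sample of n = 50 members drawn without replacement.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)  # sample statistic

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will usually land near, but not exactly on, the population mean, which is why it serves only as an estimate.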

Scales of measurement

These identify how the properties of numbers can change with different uses. The four scales of measurement are: Nominal, Ordinal, Interval and Ratio.



S. S. Stevens coined this term. These rules imply that the extent to which a number is informative depends on how it was used or measured.

Scales of measurement can be characterized by three properties, each of which can be described by answering a question. Name the properties and their questions.

Order: Does a larger number indicate a greater value than a smaller number?



Difference: Does subtracting two numbers represent some meaningful value?



Ratio: Does dividing (or taking the ratio of) two numbers represent some meaningful value?

Nominal scales

Nominal scales are measurements in which a number is assigned to represent something or someone. They provide no additional information. Examples: ZIP codes, license plate numbers, credit card numbers.

Coding

Is the procedure of converting a nominal or categorical variable into a numerical value.

Why is coding words into numerical values useful?

This is useful when entering names of groups for a research study into statistical programs such as SPSS, because it can be easier to enter and analyze data when group names are entered as numbers rather than words.
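
A minimal Python sketch of coding, assuming made-up group names. The numbers assigned are arbitrary labels and carry no order or amount:

```python
# Hypothetical group labels for a study; the names are illustrative only.
groups = ["control", "treatment", "control", "placebo", "treatment"]

# Assign an arbitrary numeric code to each distinct label.
# Sorting only makes the assignment repeatable; the order carries no meaning.
codes = {label: i + 1 for i, label in enumerate(sorted(set(groups)))}

# Replace every label with its code.
coded = [codes[g] for g in groups]

print(codes)
print(coded)
```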

Ordinal scale

Are measurements that convey order or rank alone: they indicate only that one value is greater or less than another. Examples: finishing order in a competition, education level. Differences between ranks have no meaning, because ranks say only that one thing is greater than another.

An interval scale of measurement can easily be understood by two defining principles. What are these principles?

Equidistant scales.


No true zero.

Interval scales

Are measurements that have no true zero and are distributed in equal units (equidistant scales). That is, the values or intervals are equally distributed, and the zero is not necessarily the end of the scale.

Rating scale

Is a numeric response scale used to indicate a participant's level of agreement with or opinion of some statement.

Equidistant scale

Is a scale with intervals or values distributed in equal units. Many behavioural scientists assume that scores on a rating scale are distributed in equal units.

A true zero

Is when the value 0 truly indicates nothing on a scale of measurement. Interval scales do not have a true zero.

Ratio scales

Are measurements that have a true zero and are distributed in equal units. Examples: length, weight, height.

The variables for which researchers measure data fall into two broad categories. Can you name them?

Continuous and discrete variables.


Quantitative and qualitative variables.

Continuous variable

Is a variable that is measured along a continuum. Continuous variables can be measured at any place beyond the decimal point, and thus in fractional units.

Discrete variable

This is measured in whole units or categories that are not distributed along a continuum.

Quantitative variable

Varies by amount. This variable is measured in numeric units and is often collected by measuring or counting.

Qualitative variable

These variables vary by class. They are often labels for the behaviours we observe, so only discrete variables can fall into this category. This variable is often represented as a label and describes nonnumeric aspects of phenomena.

Measures of central tendency

Are statistical measures for locating a single score that is most representative or descriptive of all scores in a distribution: single values that have a 'tendency' to be near the 'center' of a distribution. They ensure that the single score meaningfully represents a set of data.

Notation used for population size

N

Notation used for sample size

n

N

Population size

n

Sample size.

Population size

Is the number of individuals who constitute an entire group or population. The population size is represented by the capital N.

The mean

Is also called the arithmetic mean or average; it is the most commonly reported measure of central tendency. It is the sum of a set of scores divided by the number of scores summed, in either a sample or a population.

Formula for the population mean

Is the sum of N scores (x) divided by N, that is, Σx / N. Write it down and check the answer in Notion.

Formula for the sample mean

The sum of n scores (x) divided by n, that is, Σx / n. Write it down and check the answer in Notion.
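
A minimal Python sketch of the two mean formulas, assuming made-up scores. The arithmetic is identical in both cases; only the divisor differs (N for a population, n for a sample):

```python
def mean(scores):
    """Sum of the scores divided by the number of scores summed."""
    return sum(scores) / len(scores)

# Hypothetical scores, treated here as an entire population (N = 4).
population_scores = [2, 4, 6, 8]

# Hypothetical sample of n = 3 scores taken from that population.
sample_scores = [2, 4, 8]

print(mean(population_scores))  # 20 divided by N = 4
print(mean(sample_scores))      # 14 divided by n = 3
```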

The mean is also often referred to as something else. What is this?

Balance point, because it is the value that balances an entire distribution of numbers. This means it might not always be the center of a distribution or a middle value.

The median

Is the middle value in a distribution of data listed in numeric order. It represents the midpoint of a distribution of scores: half the scores fall above its value and half fall below.

What is the reason for the median?

The mean can be misleading when a data set has an outlier because the mean will shift toward the value of that outlier. For that reason an alternative measure of central tendency is necessary.

How do you find the median position?

You list a set of scores in numeric order and compute the median position, (n + 1) / 2. Write it down and check the answer in Notion.

When is the median the most informative?

In a distribution with one or more outliers because outliers have little influence over the median.
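
A minimal Python sketch of finding the median through its position, assuming the usual (n + 1) / 2 rule and a non-empty list of made-up scores. It also shows how a single outlier pulls the mean but not the median:

```python
import math
import statistics

def median(scores):
    """Median of a non-empty list via the 1-based position (n + 1) / 2."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2
    # For odd n the position is a whole number; for even n it falls halfway
    # between two scores, so the two neighbours are averaged.
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return (lower + upper) / 2

scores_with_outlier = [1, 2, 3, 4, 100]  # hypothetical scores; 100 is the outlier

print(median(scores_with_outlier))           # middle score, unaffected by the outlier
print(statistics.mean(scores_with_outlier))  # pulled toward the outlier
print(median([4, 1, 3, 2]))                  # even n: average of the two middle scores
```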

The mode

Is the score that occurs most often in a data set. An advantage is that finding it only requires counting; no calculations are necessary.

How do you find the mode?

You list a set of scores in numeric order and count the score that occurs most often.
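
A minimal Python sketch of finding the mode by counting, assuming made-up scores with a single most frequent value. With ties, `most_common(1)` reports only one of the tied scores:

```python
from collections import Counter

scores = [2, 3, 3, 5, 7, 3, 5]  # hypothetical scores

# List the scores in numeric order and tally how often each one occurs.
counts = Counter(sorted(scores))

# The mode is the score with the highest tally.
mode, frequency = counts.most_common(1)[0]
print(mode, frequency)
```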

The mean is the most reported statistic in behavioural science. It has five characteristics that follow from its reflecting every score in a distribution. Name them and explain why.

Changing an existing score: Changing an existing score will change the mean. In essence, every score in a distribution affects the mean. Therefore, changing any existing score in a distribution will change the value of the mean.



Adding a new score or removing an existing score: Adding a new score or removing an existing score will change the mean, unless that value equals the mean.



Adding, subtracting, multiplying, or dividing each score by a constant: Adding, subtracting, multiplying, or dividing each score in a distribution by a constant will cause the mean to change by that constant.



Summing the differences of scores from their mean: Think of this as balancing weights on a scale. Only when the difference between the weights on each side of the scale is the same will the weights on each side of the scale be balanced. Only when the mean is subtracted from each score in a distribution is the sum of the differences equal to zero.



Summing the squared differences of scores from their mean: The sum of the squared differences of scores from their mean is minimal.
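
A minimal Python sketch that checks several of these characteristics on made-up scores. The comparison values 2 and 4 are arbitrary alternatives to the mean:

```python
import statistics

scores = [1, 2, 3, 4, 5]     # hypothetical scores
m = statistics.mean(scores)  # the mean of these scores is 3

# Adding a constant to every score shifts the mean by that constant.
shifted_mean = statistics.mean([x + 10 for x in scores])

# Multiplying every score by a constant multiplies the mean by that constant.
scaled_mean = statistics.mean([x * 2 for x in scores])

# Differences of scores from their mean sum to zero.
sum_of_differences = sum(x - m for x in scores)

# The sum of squared differences is smallest when taken about the mean.
def sum_of_squares(center):
    return sum((x - center) ** 2 for x in scores)

print(shifted_mean, scaled_mean, sum_of_differences)
print(sum_of_squares(m), sum_of_squares(2), sum_of_squares(4))
```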

The normal distribution

Also called the symmetrical, Gaussian, or bell-shaped distribution, is a theoretical distribution in which scores are symmetrically distributed above and below the mean, the median, and the mode at the center of the distribution. The mean is usually used to summarize data that is distributed like this, mostly because all scores are included in its calculation.

Skewed distribution

Is a distribution of scores that includes scores that fall substantially above or below most other scores in a data set. Some data can have scores that are unusually high or low that skew a data set.

A modal distribution

Is a distribution of scores in which one or more scores occur most often or most frequently.

A unimodal distribution

Is a distribution of scores in which one score occurs most often or most frequently. A unimodal distribution has one mode.

Bimodal distributions

Have two modes; the mean and the median are typically located between the two modes. So a bimodal distribution is a distribution of scores in which two scores occur most often or most frequently.

A multimodal distribution

Is a distribution of scores in which more than two scores occur most often or most frequently. A multimodal distribution has more than two modes.

A nonmodal distribution

Also called a rectangular distribution, is a distribution of scores in which all scores occur at the same frequency. A nonmodal distribution has no mode.
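
A minimal Python sketch that labels a set of scores using the definitions above. The function name and the made-up scores are illustrative, and treating a set with only one distinct score as unimodal is an assumption of this sketch:

```python
from collections import Counter

def classify_modality(scores):
    """Label a non-empty list of scores by how many modes it has."""
    counts = Counter(scores)
    highest = max(counts.values())
    modes = [score for score, count in counts.items() if count == highest]

    # Every distinct score occurs equally often: no mode stands out.
    if len(counts) > 1 and len(modes) == len(counts):
        return "nonmodal"
    if len(modes) == 1:
        return "unimodal"
    if len(modes) == 2:
        return "bimodal"
    return "multimodal"

print(classify_modality([1, 2, 2, 3]))
print(classify_modality([1, 1, 2, 2, 3]))
print(classify_modality([1, 1, 2, 2, 3, 3, 4]))
print(classify_modality([1, 2, 3, 4]))
```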

Unit of analysis

The what or who that is being studied.

Variable

Is a measured property of the unit of analysis.

There are 3 forms of statistics talked about in this course. What are they?

Univariate.


Bivariate.


Multivariate.

Univariate

We are doing some kind of analysis or description of one variable. An example would be: what was the average grade of something? Here we have only one variable.

Bivariate

Here you deal with two variables, and we start talking about relations between variables. Example: do females and males differ in their grades? There are two variables, and we look at how they relate to each other.

Multivariate

Here we deal with more than two variables and the relationships among them.

What is the main purpose/problem of statistics?

To think about how our sample is representative of a larger population.

Descriptive statistics

Are any kind of description of a variable: the description of data that you have measured or found.

Inferential statistics

Is the science of what we can say, from the sample descriptives, about what is happening in the population. In other words, making an inference about a larger population based on a sample.

Coding

Is a way to conveniently store data: we very often code our categories as numbers. The coding is completely arbitrary; there is no logical order in the codes.

Nominal

A group of classifications, which means people answer in categories. There is no meaningful ranking possible, so there is no order (i.e. 3 is not necessarily more than 2). Numerical coding is arbitrary, but necessary in SPSS.

Ordinal

These are variables in which the categories have an order. The ranking is meaningful: here 3 is more than 2, so it would not make sense to change the order of the rankings. The distance between the categories is not known; for example, the difference between "never" and "once a week" need not equal the difference between "once a week" and "several times a week".

Interval

It is the trickiest level of measurement. It means that there is an order to the answers and that the differences between scores are equal, but that the zero is a social construct. The standard example is temperature in degrees Celsius, where the zero is a social construct.

Ratio

This is not decided by humans and is a much more objective property that we measure, for example age measured in years. The zero here is meaningful: age can be a count of how many years you have lived, because you cannot go below zero.
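
A minimal Python sketch of why ratios are not meaningful on an interval scale, using the Celsius example. The two temperatures are made up, and the Kelvin scale is brought in here only as an added illustration of a temperature scale with a true zero:

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (a scale with a true zero)."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0  # hypothetical temperatures in degrees Celsius

# Differences are meaningful on an interval scale: the gap is the same on both scales.
difference_c = warm_c - cold_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cold_c)

# Ratios are not: 20 degrees Celsius looks "twice" 10 degrees Celsius,
# but the same two temperatures give a very different ratio in kelvins.
ratio_c = warm_c / cold_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(difference_c, round(difference_k, 2))
print(ratio_c, round(ratio_k, 3))
```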

There is a hierarchy in the levels of measurement, explain this.

We often say that nominal is the lowest level of measurement, moving up to ratio, which is the highest level of measurement. From a statistical point of view we prefer the higher-level variables because they are more meaningful, although sometimes there are good reasons for preferring lower-level variables.

Continuous variable

One that you can measure along a continuum. Take height as an example: the person is the unit of analysis, and height is a continuous variable because you can zoom in infinitely to get a more precise answer.

Discrete variable

You measure this in whole units or categories. This happens, for example, when you ask about the number of children in a country: you have to count in whole units, because you cannot count half a child.

The mean

This is the average: what you calculate for an interval- or ratio-level variable. It is useful for normal distributions. It can only be used for interval- and ratio-level variables because the distances between their values are equal; this is not the case for ordinal-level variables, so you cannot calculate a mean for them.

The median

You can use this for ordinal-, interval-, and ratio-level variables, because there is an order in which you can put people from low to high. The median is the middle score of a distribution when all the answers are ordered from low to high. It is also called the 50th percentile, meaning that it sits at the 50% point of the group.

The mode

The answer that appears the most. You can use this for all four levels of measurement.
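
A minimal Python sketch that summarizes which measure of central tendency can be used at each level of measurement, as stated in the last three cards. The names `LEVELS`, `MINIMUM_LEVEL`, and `usable_measures` are made up for this illustration:

```python
# Levels of measurement, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Lowest level at which each measure can be used, per the cards above.
MINIMUM_LEVEL = {"mode": "nominal", "median": "ordinal", "mean": "interval"}

def usable_measures(level):
    """Return the measures of central tendency that fit a level of measurement."""
    rank = LEVELS.index(level)
    return [
        measure
        for measure, lowest in MINIMUM_LEVEL.items()
        if LEVELS.index(lowest) <= rank
    ]

for level in LEVELS:
    print(level, usable_measures(level))
```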