78 Cards in this Set

  • Front
  • Back
What is science?
Objective, systematic pursuit of truth
What is essential for establishing a valid scientific theory?
Reproducibility
Why does the science of statistics exist?
Because there are variations.
What is sampling bias?
One group in a population is overrepresented compared to another.
What problems in drawing conclusions lead to distortion?
1. Reporting invalid conclusions
2. Giving incorrect interpretations
What problems in data collection lead to distortion?
1. sampling and measurement biases
2. Ignoring influential variables
What problems in data summarization lead to distortion?
1. Graphically misrepresenting data
2. Choosing misleading statistics.
What is a sample?
A subset of a population
Why should we do experiments?
To establish a cause-and-effect relationship
What is an observational study?
A study where researchers observe individuals and record info about variables of interest; no treatment is imposed.
What is the explanatory variable?
A set of treatments imposed on the subjects that may affect the outcome of the study
What is the response variable?
The outcome measured on each subject to reveal the effects of the treatment
What is a lurking variable?
A variable that affects the relationship between the response variable and the explanatory variable but is not included among the variables studied
What is a confounding variable?
A condition where the effects of two different variables on the response variable can't be distinguished from each other. These two variables are the explanatory variable and the lurking variable.
Data Collection
Collect relevant data that can be used to answer the research question
Data Analysis
Compute values necessary for statistical inference
Question Formulation
Articulate the research question. Determine what needs to be measured on each unit.
Statistical Problem Solving 4 Steps
1. State
2. Plan
3. Solve
4. Conclude
What is the population?
The entire group of individuals about which the researcher wants information.
What is a variable?
A characteristic measured or recorded on individuals in the sample or population.
What is a sample?
A subset; a part of the population that is selected and measured
What is statistical inference?
Conclusions are made about a population based on sample data.
What is data summarization?
Data are graphed and means are computed
What is a convenience sample?
Sample chosen for ease of selection
What is a voluntary response sample?
Individuals select themselves, for example by calling in to a survey number.
What is a mall-intercept sample?
Mall shoppers are interviewed.
What is a quota sample?
Interviewers try to fill quotas. This is EVIL: it doesn't allow you to calculate the probability of selection.
What are the 4 bad sampling methods people use?
1. Convenience Sample
2. Voluntary Response Sample
3. Mall-Intercept Sample
4. Quota Sample.
What is bias?
A study is biased if it systematically favors certain outcomes.
What is a Simple Random Sample?
Every set of n individuals has an EQUAL chance to be the sample actually selected.
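A minimal sketch of drawing a simple random sample, assuming the population is available as a list. The `students` list and the `simple_random_sample` helper are hypothetical names used only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every set of size n is equally likely."""
    rng = random.Random(seed)  # the seed is only there to make demos repeatable
    return rng.sample(population, n)

# Hypothetical population of 20 students
students = [f"student_{i}" for i in range(1, 21)]
chosen = simple_random_sample(students, 5, seed=1)
print(chosen)
```

`random.sample` draws without replacement, so no individual can appear twice in the sample.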
What are the three probability samples?
SRS, Stratified sample, Multistage sample
What is a probability sample?
"Everyone is a winner" rule: a randomization device is used to choose the sample.
Each individual has a chance of being selected.
We can compute the probability of getting each possible sample.
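As a small worked example of "we can compute the probability", the sketch below enumerates every possible simple random sample of size 2 from a hypothetical population of 5 labeled individuals. The labels and the sample size are assumptions chosen to keep the enumeration tiny.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical individuals
n = 2

# Every possible sample of size n
samples = list(combinations(population, n))

# Under an SRS each possible sample is equally likely
p_each_sample = 1 / comb(len(population), n)

# Chance that individual "A" ends up in the sample
p_A_included = sum("A" in s for s in samples) * p_each_sample

print(len(samples), p_each_sample, p_A_included)
```

There are 10 possible samples, each with probability 1/10, and each individual is included with probability n/N = 2/5.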
What is a stratified sample?
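Classify the population into groups (strata), choose a probability sample from each group, and then combine the results. (ex: classify students by major, sample within each major, then combine to get a complete sample of students representing all majors)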
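The classify, sample, combine steps can be sketched as follows. This assumes an SRS is taken within each stratum and that the same number is drawn from each; the roster and the `stratified_sample` helper are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, take an SRS in each stratum, combine."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):  # sorted only to make the order repeatable
        members = strata[name]
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical roster of (student, major) pairs
roster = [(f"s{i}", major)
          for i, major in enumerate(["Math", "Bio", "Art"] * 6)]
picked = stratified_sample(roster, lambda unit: unit[1], 2, seed=0)
print(picked)
```

Every major is guaranteed to appear in the combined sample, which a single SRS of the whole roster does not guarantee.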
What is a multistage sample?
1. Select a sample of groups.
2. Take a sample of subgroups within each selected group.
3. Sample units from within each selected subgroup.

(ex: farms in the US: choose six states, then two counties from each state, then two farms from each county)
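The farm example can be sketched with a nested sampling frame. The frame below (10 states, 4 counties per state, 5 farms per county) is made up for illustration, and an SRS is assumed at every stage.

```python
import random

def multistage_sample(frame, n_states, n_counties, n_farms, seed=None):
    """Sample states, then counties within each state, then farms within each county."""
    rng = random.Random(seed)
    chosen = []
    for state in rng.sample(sorted(frame), n_states):
        for county in rng.sample(sorted(frame[state]), n_counties):
            for farm in rng.sample(frame[state][county], n_farms):
                chosen.append((state, county, farm))
    return chosen

# Hypothetical frame: state -> county -> list of farms
frame = {
    f"state_{s}": {
        f"county_{s}_{c}": [f"farm_{s}_{c}_{f}" for f in range(5)]
        for c in range(4)
    }
    for s in range(10)
}

farms = multistage_sample(frame, n_states=6, n_counties=2, n_farms=2, seed=0)
print(len(farms))  # 6 states * 2 counties * 2 farms = 24
```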
What are the four sources of bias in surveys?
1. Under-coverage
2. Non-response
3. Response
4. Question Wording
What is undercoverage bias?
Some groups in a population are left out when the sample is chosen (ex. household surveys exclude homeless, busy people, those in hospitals)
What is bias due to non-response?
Occurs when an individual chosen for the sample refuses to provide answers or can't be contacted.
What is bias due to response: respondent bias?
When a respondent gives responses that influence results in a systematic way. (ex: students sampled at a university are asked if they cheat on exams)
What is bias due to response: Interviewer bias?
When an interviewer influences the response in a systematic way.

(ex: Black interviewer vs. white interviewer)
What is bias due to question wording?
When questions have leading phrases, loaded words, or ambiguities that influence the response
In experimental design, how can we implement randomization?
1. Completely randomized design
2. Randomized block
3. Matched Pairs
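A rough sketch of the first two designs, assuming subjects are identified by simple labels and treatments are dealt out in turn after shuffling. Function names, subject IDs, and treatment labels are all hypothetical.

```python
import random

def completely_randomized(subjects, treatments, seed=None):
    """Shuffle all subjects, then deal them to treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Dealing in turn keeps group sizes within one of each other
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def randomized_block(blocks, treatments, seed=None):
    """Randomize separately inside each block of similar subjects."""
    rng = random.Random(seed)
    assignment = {}
    for block_subjects in blocks.values():
        shuffled = list(block_subjects)
        rng.shuffle(shuffled)
        for i, s in enumerate(shuffled):
            assignment[s] = treatments[i % len(treatments)]
    return assignment

crd = completely_randomized(range(12), ["drug", "placebo"], seed=3)
blocks = {"men": ["m1", "m2", "m3", "m4"], "women": ["w1", "w2", "w3", "w4"]}
rbd = randomized_block(blocks, ["drug", "placebo"], seed=3)
print(crd)
print(rbd)
```

A matched pairs design can be seen as the special case where each block holds two similar subjects (or one subject measured twice) and chance decides which gets which treatment.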
What is a subject?
Individual, particularly a person, upon which an experiment is performed
What is a Treatment?
A specific experimental condition imposed on subjects
What is a control?
Treatment without active ingredient (none or just a placebo)
What is a placebo?
A dummy treatment that outwardly resembles the active treatment
What is an interaction?
A condition where the effect of one variable on the response variable changes depending on the level of another variable.
What is a randomized comparative experiment?
An experiment that uses BOTH comparison of two or more treatments and chance assignment of subjects to treatments
What are the two Randomized Comparative Designs?
1. Completely randomized design
2. Randomized block design
What three principles does a randomized comparative experiment rely on?
1. Have a control or comparison
2. Have Randomization
3. Have Replication
What is Replication?
Apply each treatment to more than one subject, so that each treatment group has several subjects.
What are three principles of Good Experimental Design?
1. Control or Comparison
(Neutralize effect of lurking variable and measure treatment differences)
2. Randomization
(Eliminate bias and invoke assumptions for statistical inference)
3. Replication
(To measure and reduce chance variation in results)
What is an observed effect?
The difference between what we see in the data and what we expect to see in the data.
What is statistically significant?
An observed effect that is too large to attribute plausibly to chance variation.
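One way to see what "too large to attribute plausibly to chance" means is to re-deal the subjects to groups in every possible way and count how often chance alone gives a difference at least as large as the observed one. The sketch below does this for made-up responses. The data, the helper name, and the choice of a one-sided comparison of means are all assumptions for illustration, not a method stated on these cards.

```python
from itertools import combinations
from statistics import mean

def randomization_p_value(treated, control):
    """Share of all possible regroupings whose mean difference is at least the observed one."""
    pooled = list(treated) + list(control)
    k = len(treated)
    observed = mean(treated) - mean(control)
    count = total = 0
    for idx in combinations(range(len(pooled)), k):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        # Small tolerance guards against floating-point round-off
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            count += 1
    return observed, count / total

# Hypothetical responses
treated = [12, 14, 15, 16, 18]
control = [8, 9, 10, 11, 13]
effect, p = randomization_p_value(treated, control)
print(effect, p)
```

Here only 2 of the 252 possible regroupings give a difference as large as the observed one, so chance alone is an implausible explanation.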
What are some cautions in experiments?
1. hidden bias
2. placebo effect
3. lack of realism
What is hidden bias?
Bias that is introduced by not treating all individuals identically after treatments are applied
What is lack of realism?
Subjects, treatments, or setting do not realistically duplicate the conditions we want to study.
What is a distribution?
Every possible outcome and how often it occurs
What is data?
A set of measurements made on a group of individuals.
What are three things to describe distributions with?
Shape
Center
Spread
When should we use histograms vs. dot plots?
Histogram: use for large data sets
Dot plot or stemplot: use for small data sets
What are the three measures of spread?
Range, IQR, and SD.
How much of the data lies between Q1 and Q3?
50%
What is an outlier?
An observation is an outlier if
observation < Q1 - (1.5*IQR)
or
observation > Q3 + (1.5*IQR)
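The 1.5*IQR rule can be sketched as below. Quartiles here come from Python's `statistics.quantiles` with its default method; textbooks compute Q1 and Q3 in slightly different ways, so borderline points may be classified differently by hand. The data values are made up.

```python
from statistics import quantiles

def iqr_outliers(data):
    """Return observations below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    q1, _median, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]  # hypothetical data
print(iqr_outliers(values))  # [50]
```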
What is SD?
Standard deviation: roughly the average deviation from the mean (the square root of the average squared deviation)
What are properties of SD?
1. Measures spread of data about the mean
2. Is either 0 or positive
3. Has same unit of measurement as original observation
4. Inflated by outliers.
When calculating SD from a sample, always use n-1 in the denominator.
Yup
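A short sketch of the n-1 calculation, checked against the standard library's `statistics.stdev`, which also divides by n-1. The scores are made up.

```python
import math
import statistics

def sample_sd(data):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    squared_devs = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_devs / (n - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5
print(sample_sd(scores))          # divides by n - 1
print(statistics.stdev(scores))   # library version, same divisor
print(statistics.pstdev(scores))  # divides by n instead, so it is smaller
```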
What are advantages of SD?
1. Most commonly used
2. Uses deviations from every data point, unlike range and IQR
3. Has well-established theoretical properties

BUT inflated by outliers or strong skewness
What is the relationship between SD and risk?
Lower SD = smaller risk.
What is a density curve?
smooth curve that describes the overall pattern of a distribution
What are properties of a density curve?
always above or on the x-axis
total area under curve equals ONE
Area under curve between two values=proportion of population expected in that interval.
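These properties can be checked numerically for one particular density curve. The sketch below uses the standard bell-shaped curve from `statistics.NormalDist` and a simple trapezoid rule; the integration limits of -8 to 8 and the step count are arbitrary choices that stand in for "the whole curve".

```python
from statistics import NormalDist

def area_under(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width

bell = NormalDist(mu=0, sigma=1).pdf

total_area = area_under(bell, -8, 8)    # very close to 1
middle_area = area_under(bell, -1, 1)   # proportion expected between -1 and 1
print(total_area, middle_area)
```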
Why should we model data with density curves?
1. easier to investigate population properties
2. Can estimate probabilities of various outcomes
What is the most common density curve?
Bell-shaped.
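What is the 68-95-99.7 rule?
For a bell-shaped curve, about 68% of the area lies within 1 SD of the mean, 95% within 2 SDs, and 99.7% within 3 SDs. Split at each SD, the areas (in %) are 0.15-2.35-13.5-34-34-13.5-2.35-0.15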
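The rule and its band-by-band areas can be reproduced with a few lines. This sketch assumes a Normal (bell-shaped) curve and uses `statistics.NormalDist` for the exact areas. The second half only does the arithmetic that turns 68, 95, and 99.7 into the per-band percentages.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, SD 1

def within(k):
    """Area within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within(k), 1))  # roughly 68, 95, 99.7

# Bands on one side of the mean, using the rounded rule itself
bands = [
    round(68 / 2, 2),            # mean to 1 SD
    round((95 - 68) / 2, 2),     # 1 SD to 2 SD
    round((99.7 - 95) / 2, 2),   # 2 SD to 3 SD
    round((100 - 99.7) / 2, 2),  # beyond 3 SD
]
print(bands)  # [34.0, 13.5, 2.35, 0.15]
```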
Why do we randomize?
To decrease bias
Why do we replicate?
To measure and reduce chance variation
Probability samples are such that
Each member of the population has a chance of being selected and that chance can be computed
Why do we compare different treatment groups in experiments?
To neutralize the effects of lurking variables and measure treatment differences
All voluntary response samples are samples of convenience
Yup