65 Cards in this Set

  • Front
  • Back
Definition:

the science of collecting, organizing, and interpreting data
statistics
Enumerative or Analytical?

we have a finite population
enumerative study
Enumerative or Analytical?

we have an infinite/conceptual population
analytical study
Parameters or statistics?

numerical summary of a population
parameters
Parameters or statistics?

numerical summary of a sample
statistics
Definition:

drawing conclusions from the information based on a sample
statistical inference
Classification of variables:
categorical/qualitative or quantitative?

places an individual into one of several groups or categories
categorical/qualitative variable
Classification of variables:
categorical/qualitative or quantitative?

a number
quantitative variable
Discrete and continuous variables are a subcategory of _________ variables.
quantitative
Discrete or continuous variables?

has either a finite number of possible values or a countable number of values
discrete variable
Discrete or continuous variables?

has an uncountably infinite number of possible values (takes values in an interval or a continuum)
continuous variable
Observational or experimental study?

investigator's role is basically passive - no attempt is made to manipulate or influence the variables of interest
observational study
Observational or experimental study?

investigator's role is active - variables are manipulated, the study environment is regulated
experimental study
Observational or experimental study?

treatments are applied to experimental units to try to determine the effects of the treatment on the response variable
experimental study
Definition:

- the goal is to obtain individuals in such a way that accurate information may be obtained about the population
sampling
Name the 5 possible sampling biases.
Selection Bias
Measurement Bias
Response Bias
Non-response Bias
Question Bias
How do we avoid selection bias?
Simple random sample (SRS)
Definition:

a sample of size n drawn from a population so that every possible set of n individuals is equally likely to be selected
Simple random sample
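A minimal sketch of drawing a simple random sample, assuming a hypothetical population of 100 numbered individuals and a sample size of 10 (both values are illustrative, not from the card set). Python's `random.sample` selects without replacement, so each set of n individuals is equally likely.

```python
import random

# Hypothetical population: individuals labeled 1..100 (assumption for illustration).
population = list(range(1, 101))
n = 10  # hypothetical sample size

# A seeded generator keeps the example reproducible.
rng = random.Random(42)

# Sampling without replacement: no individual can be chosen twice.
sample = rng.sample(population, n)
print(sorted(sample))
```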
Definition:

individuals that are easily obtained; the most common example is when individuals are self-selected
Convenience sample
These are requirements of ________.

intervals must be non-overlapping
intervals must be contiguous
intervals must be equal width
Histograms
Matching:

- easy to calculate
- easy to work with algebraically
- highly affected by outliers
- not resistant to extreme observations
Mean
Matching:

- more resistant to a few extreme observations
- robust
Median
Matching:

- where is the peak
- the most frequent value in the data
- possible to have more than one
- important for categorical data
Mode
Given a right-skewed histogram, list the mean, median, and mode in order from smallest to largest.
mode < median < mean
Given a left-skewed histogram, list the mean, median, and mode in order from smallest to largest.
mean < median < mode
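The orderings on the two cards above are a rule of thumb rather than a guarantee. A small sketch, using made-up data sets (assumptions for illustration only), shows a case where each ordering holds:

```python
import statistics

# Hypothetical right-skewed data: a long tail of large values.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirror image of the same data (15 - x), which is left-skewed.
left_skewed = [15 - x for x in right_skewed]

def center_measures(values):
    """Return (mode, median, mean) for a list of numbers."""
    return (statistics.mode(values),
            statistics.median(values),
            statistics.mean(values))

r_mode, r_median, r_mean = center_measures(right_skewed)
l_mode, l_median, l_mean = center_measures(left_skewed)

print("right-skewed:", r_mode, r_median, r_mean)
print("left-skewed: ", l_mode, l_median, l_mean)
```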
Name the effects of adding a constant to all data points on the mean, median, variance, and standard deviation.
- add the same constant to the mean
- add the same constant to the median
- variance and standard deviation remain the same
Name the effects of multiplying all data points by a constant on the mean, median, variance, and standard deviation.
- multiply the mean, median, and mode by the same constant
- variance will be the constant squared x the original variance (s^2*const^2)
- standard deviation will be the absolute value of the constant x the original st. dev. (|const|*s)
Name the effects of adding to the maximum data point on the mean, median, variance, and standard deviation.
- increases mean
- median stays the same
- variance and std. dev. will increase
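The three cards above can be checked numerically. This is a sketch on a made-up five-point data set with a made-up constant (both are assumptions for illustration); a negative multiplier is used to show that the standard deviation scales by the absolute value of the constant.

```python
import statistics

data = [2, 4, 4, 5, 10]  # hypothetical sample
c = 3                    # hypothetical constant

def summary(values):
    """Return (mean, median, sample variance, sample standard deviation)."""
    return (statistics.mean(values), statistics.median(values),
            statistics.variance(values), statistics.stdev(values))

base = summary(data)

# Add a constant to every point: center shifts, spread is unchanged.
shifted = summary([x + c for x in data])

# Multiply every point by -c: center scales by -c,
# variance by c**2, standard deviation by |c|.
scaled = summary([x * -c for x in data])

# Raise only the maximum: mean and spread grow, median stays put.
raised = summary(sorted(data)[:-1] + [max(data) + 10])

for name, s in [("base", base), ("shifted", shifted),
                ("scaled", scaled), ("raised max", raised)]:
    print(name, s)
```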
The Empirical Rule only applies to what type of histogram?
symmetric, unimodal, bell-shaped
According to the Empirical Rule, roughly _____ % of data will fall between x-s and x+s (x = mean, s = st. dev)
68
According to the Empirical Rule, roughly _____ % of data will fall between x-2s and x+2s (x = mean, s = st. dev)
95
According to the Empirical Rule, roughly _____ % of data will fall between x-3s and x+3s (x = mean, s = st. dev)
99.7
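The 68 / 95 / 99.7 figures are rounded probabilities for a normal (bell-shaped) distribution. As a sketch, the share of a normal distribution within k standard deviations of the mean can be computed from the error function; the helper name `normal_within` is a hypothetical name chosen for this example.

```python
import math

def normal_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} s of the mean: {100 * normal_within(k):.2f}%")
```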
Definition:

observations that lie outside the overall pattern of a distribution
outliers
Definition:

- median of the observations that are less than the overall median
Q1
Definition:

- median of the observations that are greater than the overall median
Q3
The inter-quartile range (IQR) is given by:
Q3-Q1
The upper fence is given by:
Q3 + 1.5*IQR
The lower fence is given by:
Q1 - 1.5*IQR
Anything above the upper fence or below the lower fence is considered a(n):
outlier
What points are included in the 5 number summary?
min, Q1, Q2, Q3, max
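The quartile, IQR, fence, and 5 number summary cards above can be combined into one small calculation. This sketch uses a made-up data set and splits the sorted data by position into a lower and an upper half (excluding the middle value when n is odd), which is one common reading of the Q1 and Q3 definitions on the cards; statistical software often uses other quartile conventions.

```python
import statistics

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical observations

values = sorted(data)
n = len(values)

# Lower and upper halves by position; the middle value is left out when n is odd.
lower_half = values[: n // 2]
upper_half = values[(n + 1) // 2:]

q1 = statistics.median(lower_half)
q2 = statistics.median(values)
q3 = statistics.median(upper_half)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged as an outlier.
outliers = [x for x in values if x < lower_fence or x > upper_fence]

five_number_summary = (values[0], q1, q2, q3, values[-1])
print("5 number summary:", five_number_summary)
print("IQR:", iqr, "fences:", (lower_fence, upper_fence))
print("outliers:", outliers)
```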
Definition:

- variable that is monitored as characterizing system performance/behavior
response variable
Definition:

- variable over which an investigator exercises power, choosing a setting(s) for use in the study
supervised (managed) variable
Name the 2 kinds of supervised variables.
Controlled, experimental variables
Definition:

- supervised variable with one setting (held constant)
controlled variable
Definition:

- supervised variable with more than one setting
experimental variable
Definition:

- categorical variables whose effect on the response variable we want to investigate (ex: brand)
factors
Definition:

- the values of a factor (ex: brand A, brand B)
levels
Definition:

- a combination of the values (levels) of each factor
treatment
Name the 4 ways to deal with experimental error.
controlled variables
randomization
blocking
replication
Definition:

- variables kept constant across experimental units
controlled variables
Definition:

- chance determines the assignment of treatments
- does not reduce experimental error, but averages out effects of lurking variables over treatments
randomization
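A minimal sketch of randomization, assuming 12 hypothetical experimental units and 3 hypothetical treatments (names and counts are illustrative only). The units are shuffled by chance and then dealt out to treatments in turn, so each treatment also receives several units (replication).

```python
import random
from collections import Counter

# Hypothetical experimental units and treatments (assumptions for illustration).
units = [f"unit_{i}" for i in range(1, 13)]
treatments = ["A", "B", "C"]

# A seeded generator keeps the example reproducible.
rng = random.Random(0)
shuffled = units[:]
rng.shuffle(shuffled)

# Deal the shuffled units to treatments in turn: chance decides who gets what.
assignment = {unit: treatments[i % len(treatments)]
              for i, unit in enumerate(shuffled)}

counts = Counter(assignment.values())
print(counts)
```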
Definition:

- blocks are groups of experimental units chosen to be fairly homogeneous
- controls for differences between blocks because all the outcomes within a block are affected similarly
blocking
Definition:

- multiple experimental units per treatment (not just multiple measurements of experimental units, which only captures measurement error)
- to be able to see trends vs. "flukes"
- to quantify the amount of experimental error
replication
What are the goals of using controlled variables?
- to keep the effects of a controlled variable from affecting conclusions about treatment effects
- to reduce experimental error
What are the goals of using replication?
- to generalize to a larger population
- to allow us to quantify experimental error
Definition:

- experimental design with 2 or more categorical experimental variables (factors)
Factorial Design
3x4

Give the number of factors and each factor's corresponding number of levels.
Give the number of treatments.
2 factors.
Factor 1 has 3 levels, factor 2 has 4 levels.
3x4 = 12 treatments
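The treatment count for a factorial design can be sketched by listing every combination of levels. The factor names and levels below are hypothetical (chosen only to give a 3x4 layout); `itertools.product` produces one tuple per treatment.

```python
from itertools import product

# Hypothetical factors: 3 brands and 4 temperature settings (assumptions for illustration).
factors = {
    "brand": ["A", "B", "C"],
    "temperature": [20, 40, 60, 80],
}

# Each treatment is one combination of a level from every factor.
treatments = list(product(*factors.values()))

print(len(factors), "factors")
print(len(treatments), "treatments")
for treatment in treatments:
    print(dict(zip(factors.keys(), treatment)))
```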
Definition:

- an effect attributable to combinations of variables above and beyond what can be predicted from the variables considered individually
Interaction Effects
True or False?

The standard deviation can be negative.
False
True or False?

The median is less sensitive than the mean to outliers.
True
A television station is interested in predicting whether or not voters in its listening area are in favor of federal funding for abortions. It asks its viewers to phone in and indicate whether they are in favor of or opposed to this. Of the 2241 viewers who phoned in, 1574 (70.24%) were opposed to federal funding for abortions. The number 70.24% is
A. a statistic
B. a parameter
C. a sample
D. a population
A. a statistic
A survey records many variables of interest to the researchers conducting the survey. Below are some of the variables from a survey conducted by the U.S. Postal Service. Which of the following variables is categorical?
A. country of residence
B. number of people, both adults and children, living in the household
C. total household income, before taxes, in 1993
D. age of respondent
A. country of residence
In order to determine if smoking causes cancer, researchers surveyed a large sample of adults. For each adult they recorded whether the person had smoked regularly at any period in their life and whether the person had cancer. They then compared the proportion of cancer cases in those who had smoked regularly at some time in their lives with the proportion of cases in those who had never smoked regularly at any point in their lives. The researchers found there was a higher proportion of cancer cases among those who had smoked regularly than among those who had never smoked regularly. This is
a. an observational study
b. a designed experiment
c. a block design
d. a controlled study
a. an observational study
Sickle-cell disease is a painful disorder of the red blood cells that affects mostly African Americans in the United States. To investigate whether the drug hydroxyurea can reduce the pain associated with sickle-cell disease, a study by the National Institutes of Health gave the drug to 150 sickle-cell sufferers and a placebo to another 150. The researchers then counted the number of episodes of pain reported by each subject. The response variable is
A. the drug hydroxyurea
B. the number of episodes of pain
C. the presence of sickle-cell disease
D. the number of red blood cells
B. the number of episodes of pain
What are the goals of using randomization in an experiment?
To protect conclusions about treatment effects from the systematic effects of extraneous/lurking variables