48 Cards in this Set
- Front
- Back
Correlation
|
measures the strength of the linear relationship between two measurement variables
|
|
regression
|
a numerical method for trying to predict the value of one measurement variable from knowing the value of another one
|
|
Statistical relationship
|
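natural variability exists, so knowing the value of one variable does not determine the other exactly; differs from a deterministic relationship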
|
|
Deterministic relationship
|
if we know the value of one variable, we can determine the value of the other exactly
|
|
Statistically significant
|
To determine, we ask what the chances are that a relationship that strong or stronger would have been observed in the sample if there really were nothing going on in the population
|
|
Correlation of +1, -1, zero
|
+1: perfect linear relationship; as one increases, the other increases
-1: perfect linear relationship; as one increases, the other decreases
Zero: no linear relationship; the best straight line through the data on a scatterplot is exactly horizontal
|
|
Positive correlation
|
indicates variables increase together
|
|
Negative correlation
|
as one variable increases, the other decreases
|
|
Equation for line
|
y = a + bx
|
|
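As a rough illustration of the correlation and line-equation cards above, the sketch below computes the correlation r and the least-squares line y = a + bx for paired data. The function name `correlation_and_line` and the sample data are hypothetical, and the least-squares formulas are the standard ones rather than something stated on the cards.

```python
from statistics import mean

def correlation_and_line(xs, ys):
    """Return (r, a, b) for paired data; the fitted line is y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / (sxx * syy) ** 0.5   # correlation, always between -1 and +1
    b = sxy / sxx                  # slope
    a = y_bar - b * x_bar          # intercept
    return r, a, b

# Hypothetical data lying exactly on the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
r, a, b = correlation_and_line(xs, ys)
print(r, a, b)
```

Because the made-up points fall exactly on a rising line, r comes out as +1, matching the "perfect linear relationship" card.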
If no real relationships exist in the population, about ___ of the relationships tested will still be statistically significant by chance
|
5%
|
|
Even minor relationships can achieve statistical significance if the ___ ____ is ____
|
sample size, large
|
|
The regression equation is the ___
|
line equation
|
|
Problems that can affect correlations
|
outliers, groups combined inappropriately
|
|
_% of all data points are corrupted
|
5%
|
|
7 Reasons two variables could be related
|
1) The explanatory variable is the direct cause of the response variable
2) The response variable is causing a change in the explanatory variable
3) The explanatory variable is a contributing but not sole cause of the response variable
4) Confounding variables may exist
5) Both variables may result from a common cause
6) Both variables are changing over time
7) The association may be nothing more than coincidence
|
|
Evidence of a possible causal connection exists when
|
1) There is a reasonable explanation of cause and effect
2) The connection happens under varying conditions
3) Potential confounding variables are ruled out
|
|
Percentage with the trait
|
(number with trait/total) x 100%
|
|
Proportion with trait
|
number with trait/total
|
|
Probability of having trait
|
number with trait/total
|
|
Risk of having trait
|
number with trait/total
|
|
Odds of having the trait
|
(number with trait)/(number without trait), expressed as a ratio to 1
|
|
Baseline risk
|
risk without treatment or behavior
|
|
Relative risk of an outcome for two categories of an explanatory variable is
|
the ratio of the risks for each category
|
|
Increased risk
|
(change in risk/baseline risk) x 100%
|
|
Increased risk
|
(relative risk - 1) x 100%
|
|
To compute odds ratio
|
Compute the odds for group A and the odds for group B, then divide the first by the second
|
|
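The risk, odds, relative risk, increased risk, and odds ratio cards above can be tied together in one small sketch. All function names and counts here are hypothetical, chosen only to show how the formulas relate; the unexposed group plays the role of the baseline.

```python
def risk(with_trait, total):
    """Risk (also proportion or probability): number with trait / total."""
    return with_trait / total

def odds(with_trait, total):
    """Odds: number with trait / number without trait (read as 'to 1')."""
    return with_trait / (total - with_trait)

def relative_risk(risk_group, risk_baseline):
    """Ratio of the risks for the two categories."""
    return risk_group / risk_baseline

def increased_risk_pct(risk_group, risk_baseline):
    """(change in risk / baseline risk) x 100%."""
    return (risk_group - risk_baseline) / risk_baseline * 100

def odds_ratio(odds_a, odds_b):
    """Odds for group A divided by odds for group B."""
    return odds_a / odds_b

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed have the trait.
risk_exposed = risk(30, 100)
risk_baseline = risk(10, 100)
rr = relative_risk(risk_exposed, risk_baseline)             # about 3
increase = increased_risk_pct(risk_exposed, risk_baseline)  # about 200%
alt_increase = (rr - 1) * 100                               # same value via relative risk
ratio = odds_ratio(odds(30, 100), odds(10, 100))
print(rr, increase, alt_increase, ratio)
```

The two "Increased risk" cards give the same number, which the sketch checks by computing it both ways.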
Common ways the media misrepresent stats about risk
|
Baseline risk is missing, time period of risk not identified, reported risk is not necessarily your risk
|
|
Simpson's Paradox
|
Relationship appears to go in one direction if third variable is not considered, and another direction if it is
|
|
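A small made-up table can show how Simpson's Paradox arises. In the sketch below, the hospital labels, patient conditions, and counts are all hypothetical: hospital A has the lower death rate within each patient condition, yet the higher death rate once the conditions are combined, because it treats far more patients in poor condition.

```python
# counts[hospital][condition] = (deaths, patients); all numbers are hypothetical.
counts = {
    "A": {"good": (6, 600), "poor": (57, 1500)},
    "B": {"good": (8, 600), "poor": (8, 200)},
}

def rate(deaths, patients):
    """Death rate within one group."""
    return deaths / patients

def overall_rate(by_condition):
    """Death rate after combining all conditions (third variable ignored)."""
    deaths = sum(d for d, _ in by_condition.values())
    patients = sum(n for _, n in by_condition.values())
    return deaths / patients

for condition in ("good", "poor"):
    a = rate(*counts["A"][condition])
    b = rate(*counts["B"][condition])
    print(condition, "A lower than B:", a < b)

overall_a = overall_rate(counts["A"])
overall_b = overall_rate(counts["B"])
print("overall, A lower than B:", overall_a < overall_b)
```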
Selection ratio
|
ratio of the proportion of success from one group compared to another
|
|
Statistically significant
|
If a relationship as strong as the one observed in the sample (or stronger) would be unlikely without a real relationship in the population
|
|
Basic steps for hypothesis testing
|
1) Determine the null hypothesis and the alternative hypothesis
2) Collect the data and summarize them with a single number called a test statistic
3) Determine how unlikely the test statistic would be if the null hypothesis were true
4) Make a decision
|
|
Null hypothesis
|
there is no relationship between the two variables in the population
|
|
Alternative hypothesis
|
There is a relationship between the two variables in the population
|
|
Sacred rule
|
it is not acceptable to use the same data both to determine and to test hypotheses
|
|
Chi-square test
|
steps two to four of hypothesis testing
|
|
P-value
|
probability of observing a test statistic as extreme as the one observed, or more so, if the null hypothesis is really true. If the p-value is .05 or less, the result is statistically significant
|
|
How to compute a chi-square statistic
|
Compute the expected counts, assuming the null hypothesis is true. Compare the observed and expected counts. Compute the chi-square statistic.
|
|
If chi-square statistic (for a 2x2 table) is at least 3.84, the p-value is .05 or less
|
Conclude that the relationship in the population is real. Relationship statistically significant, reject null hypothesis (no relationship in population), accept alternative hypothesis (there is a relationship).
|
|
If chi-square statistic is less than 3.84, the p-value is more than .05
|
There isn't enough evidence to conclude that the relationship in the population is real. Relationship not statistically significant, do not reject null hypothesis (no relationship in population), relationship in sample could have occurred by chance
|
|
Expected count =
|
(row total)(column total)/(table total)
|
|
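The chi-square cards above can be sketched for a 2x2 table as follows. The function names and the observed counts are hypothetical. The statistic is computed with the usual formula, the sum over cells of (observed - expected)^2 / expected, which the cards do not spell out, and 3.84 is the cutoff the cards give for a p-value of .05.

```python
def expected_counts(table):
    """Expected count per cell: (row total)(column total)/(table total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Hypothetical 2x2 table: rows are groups, columns are (with trait, without trait).
observed = [[30, 70], [10, 90]]
expected = expected_counts(observed)
stat = chi_square(observed)
significant = stat >= 3.84  # cutoff from the cards; applies to a 2x2 table
print(expected, stat, significant)
```

With these made-up counts the statistic is well above 3.84, so the null hypothesis of no relationship would be rejected.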
Statistically significant doesn't always mean
|
the two variables have a relationship
|
|
Relative-frequency interpretation
|
Simply the relative frequency, over the long run, with which the coin lands heads up
|
|
Probability
|
the proportion of time it occurs over the long run
|
|
Determining probability of an outcome, two methods
|
1) Make an assumption about the physical world (know how coins are made)
2) Observe the relative frequency over many, many repetitions of the situation (flip the coin)
|
|
Personal probability
|
the degree to which a given individual believes the event will happen
|
|
Probability rules
|
1) If there are only two possible outcomes in an uncertain situation, then their probabilities must add to 1
2) If two outcomes cannot happen simultaneously, they are said to be mutually exclusive; the probability of one or the other of two mutually exclusive outcomes happening is the sum of their individual probabilities
3) If two events don't influence each other (independent), then the probability that they both happen is found by multiplying their individual probabilities
4) If event B is a subset of event A, then the probability of B cannot be higher than the probability of A
|
|
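The four probability rules above can be checked on simple cases. The sketch below assumes a fair coin and a fair six-sided die (an assumption about the physical world, in the sense of the earlier card) and uses exact fractions so the arithmetic is easy to follow; the variable names are illustrative.

```python
from fractions import Fraction

# Rule 1: two possible outcomes, so the probabilities add to 1.
p_heads = Fraction(1, 2)
p_tails = 1 - p_heads

# Rule 2: rolling a 1 and rolling a 2 on one die are mutually exclusive,
# so the probability of one or the other is the sum.
p_one = Fraction(1, 6)
p_two = Fraction(1, 6)
p_one_or_two = p_one + p_two

# Rule 3: two separate coin flips are independent,
# so the probability of two heads is the product.
p_two_heads = p_heads * p_heads

# Rule 4: "roll a 2" is a subset of "roll an even number",
# so its probability cannot be higher.
p_even = Fraction(3, 6)
subset_rule_holds = p_two <= p_even

print(p_tails, p_one_or_two, p_two_heads, subset_rule_holds)
```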
Expected value
|
the average value of a measurement over the long run
|
|
Computing expected value
|
multiply the possible amounts by their associated probabilities, then add them all together
|
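As a minimal sketch of the expected value computation above, the example below uses a hypothetical raffle: a ticket costs 1, and there is a 1 in 100 chance of winning a prize of 50, for a net gain of 49. The function name `expected_value` and all amounts are illustrative.

```python
from fractions import Fraction

def expected_value(outcomes):
    """Multiply each amount by its probability, then add the products."""
    return sum(amount * prob for amount, prob in outcomes)

# Hypothetical raffle: net gain of 49 with probability 1/100,
# net loss of 1 with probability 99/100.
outcomes = [(49, Fraction(1, 100)), (-1, Fraction(99, 100))]
ev = expected_value(outcomes)
print(ev)  # -1/2: an average loss of half a unit per ticket over the long run
```

No single ticket ever loses exactly half a unit; the expected value describes the long-run average, not any one outcome.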