83 Cards in this Set

  • Front
  • Back
Confounded variables
Two variables such that their effects on the response variable cannot be distinguished from each other
P-Value
The probability of the observed value or something more extreme under the assumption that the null hypothesis is true
Power of a Test
The probability of correctly detecting a true alternative hypothesis
Individual
The objects described by a set of data: people (or animals), places, and things. In a medical trial, the people in the study are called Subjects
Residuals
The errors: the differences between the actual measured response y and the estimated response y hat, taken collectively
Level of significance
Probability of rejecting a true null hypothesis
Inferential
Statistics involve methods of using information from a sample to draw conclusions regarding the population
Descriptive
Statistics involve methods of organizing, picturing, and summarizing the information from samples or populations
Type 1 error
Rejecting a true null hypothesis
Placebo effect
Occurs when a subject receives no treatment, but (incorrectly) believes he or she is in fact receiving treatment and responds favorably
Type II error
Failing to reject a false null hypothesis
Lurking variable
A variable that has an important effect on the response variable and the relationship among the variables in a study but is not one of the explanatory variables studied, either because it is unknown or because it is not measured
Quantitative variable
A variable that has a value or numerical measurement for which operations such as addition or averaging make sense.
Qualitative variable
A variable that describes an individual by placing the individual into a category or group
Simple random sample
A sample of size n selected in such a way that each individual is equally likely to be selected and every group of size n is equally likely to be selected
Outlier
A data value that falls outside the overall pattern of the graph
Sample space
The collection of all possible outcomes in an experiment
Parameter
A numerical measure that describes an aspect of the population
Statistic
A numerical measure that describes an aspect of a sample
Correlation coefficient
A numerical measure that assesses the strength of a linear relationship between two variables
Standard error
The standard deviation of a sampling distribution
Data is skewed to the left when the mean is to the __ of the median
left
The probability of the sample space is __
1
The probability of any event is between __ and __
0 and 1
Given P(E) = 0.38, then P(E^c) = __
1 - .38 = .62
Let n(E) be the number of outcomes in an event E. Given n(E) = 21 and P(E) = 0.7, then n(E^c) = __
9, because P(E) = n(E) / (n(E) + n(E^c)), so n(E) + n(E^c) = 21 / 0.7 = 30 and n(E^c) = 30 - 21 = 9
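The arithmetic on this card can be checked with a short sketch. The function name `complement_count` is made up for illustration, and the code assumes equally likely outcomes, so that P(E) = n(E) / (n(E) + n(E^c)).

```python
def complement_count(n_event: int, p_event: float) -> int:
    """Return n(E^c) given n(E) and P(E), assuming equally likely outcomes."""
    # Total number of outcomes: n(E) + n(E^c) = n(E) / P(E)
    total = round(n_event / p_event)
    return total - n_event


print(complement_count(21, 0.7))  # 30 total outcomes, so 30 - 21 = 9
```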
Given A and B are dependent, then P(A and B) = ___
P(A)P(B|A) or P(B)P(A|B)
Given A and B are independent then P(A|B) = __
P(A)
Given A and B are dependent, then P(A or B) = __
P(A) + P(B) - P(A and B) = P(A) + P(B) - P(A)P(B|A)
Given A and B are mutually exclusive, then P(A and B) = __
0
Given that A and B are mutually exclusive, then P(A|B) = __
0, because P(A|B) = P(A and B) / P(B) = 0 / P(B)
In a normal distribution, the empirical rule states approximately __ of the observed data fall within one standard deviation of the mean
68 %
In a normal distribution, the empirical rule states approximately ___ of the observed data fall within two standard deviations of the mean
95%
In a normal distribution, the empirical rule states approximately ___ of the observed data fall within three standard deviations of the mean
99.7%
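The three empirical-rule percentages can be reproduced from the standard normal distribution. This is a minimal sketch using the identity P(|Z| < k) = erf(k / sqrt(2)); the helper name `prob_within` is an assumption made for illustration.

```python
import math


def prob_within(k: float) -> float:
    """P(|Z| < k) for a standard normal Z, computed with the error function."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545, and 0.9973
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```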
To test the normality of a data set, a ___ plot can be constructed using the theoretical quantiles of a normal distribution and the empirical quantiles of the data set. If the points lie close to a ___, then the data come from a distribution that is approximately normal.
normal quantile (or Q-Q), straight line
If the data have a normal distribution with mean (mu) and standard deviation (sigma), then the transformed z-score z = (x - mu) / sigma has a ___ with mean of __ and standard deviation of __
standard normal distribution, 0, 1
Suppose that x-distribution has a normal distribution. The sample mean follows a ___
normal distribution
Suppose that the x-distribution has a normal distribution with mean (mu) and standard deviation (sigma). The statistic t = (x bar - mu) / (s / square root of n) follows a ___ distribution with ___ degrees of freedom
Student's t, n -1 = sample size - 1
If the x-distribution has a normal distribution with mean (mu) and standard deviation (sigma), then z^2 = ((x - mu) / sigma)^2 follows a ___ distribution with __ degrees of freedom
Chi-squared, 1
As the sample size increases, the maximal margin of error ___
decreases
List three basic principles of experimental design:
Randomization, Blocking, and Replication
What is a discrete variable?
A variable that can only take separate, countable values, such as whole numbers (1, 2, 3, 4), and not values in between such as 1.5 or 2.37
What is a continuous variable?
A variable whose values can be anything within an interval, including everything between 1 and 2, 2 and 3, and so forth (values could be 2, 3.47, 6...)
Nominal measurement
Categorical data and numbers that are simply used as identifiers or names represent a nominal scale of measurement. Numbers on the back of a baseball jersey and your social security number are examples of nominal data. If I conduct a study and I'm including gender as a variable, I may code Female as 1 and Male as 2 or vice versa when I enter my data into the computer. Thus, I am using the numbers 1 and 2 to represent categories of data. Indicates difference.
Ordinal measurement
An ordinal scale of measurement represents an ordered series of relationships or rank order. Individuals competing in a contest may be fortunate to achieve first, second, or third place. First, second, and third place represent ordinal data. If Roscoe takes first and Wilbur takes second, we do not know if the competition was close; we only know that Roscoe outperformed Wilbur. Likert-type scales (such as "On a scale of 1 to 10, with one being no pain and ten being high pain, how much pain are you in today?") also represent ordinal data. Fundamentally, these scales do not represent a measurable quantity. An individual may respond 8 to this question and be in less pain than someone else who responded 5. A person may not be in exactly half as much pain if they responded 4 as if they responded 8. All we know from this data is that an individual who responds 6 is in less pain than if they responded 8 and in more pain than if they responded 4. Therefore, Likert-type scales only represent a rank ordering.
Indicates difference and direction of difference.
Interval
A scale that represents quantity and has equal units but for which zero represents simply an additional point of measurement is an interval scale. The Fahrenheit scale is a clear example of the interval scale of measurement. Thus, 60 degrees Fahrenheit or -10 degrees Fahrenheit represent interval data. Measurement of Sea Level is another example of an interval scale. With each of these scales there are direct, measurable quantities with equality of units. In addition, zero does not represent the absolute lowest value. Rather, it is a point on the scale with numbers both above and below it (for example, -10 degrees Fahrenheit). Indicates difference, direction of difference, and amount of difference.
Ratio
The ratio scale of measurement is similar to the interval scale in that it also represents quantity and has equality of units. However, this scale also has an absolute zero (no numbers exist below zero). Very often, physical measures will represent ratio data (for example, height and weight). If one is measuring the length of a piece of wood in centimeters, there is quantity, equal units, and that measure cannot go below zero centimeters. A negative length is not possible. Indicates difference, direction of difference, amount of difference, and an absolute zero.
What graphs are used for qualitative data?
Pie Charts, Bar Charts
What graphs are used for measured/quantitative/continuous data?
XY Scatter Charts (2 variables), Line Graphs, Box plots, Stem and Leaf plots, Histograms
Calculate the total number of combinations
nCr or n! / (r! x (n - r)!)
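As a quick check of the combinations formula, the sketch below compares the factorial form with Python's built-in `math.comb`. The helper name `combinations_by_formula` is illustrative only.

```python
import math


def combinations_by_formula(n: int, r: int) -> int:
    """nCr computed directly as n! / (r! * (n - r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


# Choosing 2 items from 5: both approaches give 10
print(combinations_by_formula(5, 2), math.comb(5, 2))
```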
What are some kinds of discrete distributions?
Geometric, Binomial, Poisson
What are some kinds of continuous distributions?
Standard Normal, Chi Squared, Student's t
What determines how well a binomial distribution approximates to the normal?
How large the number of trials n is and how close the probability of success p is to .5. A common rule of thumb is that the approximation works if np > 5 and nq > 5; it works best the farther away the probability of success is from 0 or 1
What is true about a normal distribution?
It is symmetric about its mean (mu), it's unimodal (has one mode), and it is bell shaped (mound); for the standard normal distribution, the mean (mu) equals 0 and the standard deviation equals 1
What is true about a t-distribution?
It is symmetric about 0, it has heavier tails than a normal distribution (thus it is more prone to producing values that lie far from the mean), as its degrees of freedom increase it gets closer to N(0,1) (the standard normal distribution), and it is unimodal
How do you calculate a critical value for a chi squared distribution?
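Use the degrees of freedom and the tail area. For a two-tailed test or a confidence interval, put a / 2 in each tail, where a = level of significance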
What is interpolation and extrapolation within the context of linear regression?
Interpolation is using x values that are between known x values to predict y hat.

Extrapolation is using x values that are beyond known x values to predict y hat.

Example points: (0,1)(2,7)(5,13)

using an x of 4 to predict y is interpolation because it is between 0 to 5.

using an x of 7 is extrapolation because it is beyond 0 to 5.
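The card's example can be sketched in code. This is a minimal illustration, assuming an ordinary least-squares line fitted to the three example points; the function names `fit_line` and `kind_of_prediction` are made up for this sketch.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) points."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def kind_of_prediction(x, points):
    """Label a prediction by whether x lies inside the observed x range."""
    xs = [px for px, _ in points]
    return "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"


points = [(0, 1), (2, 7), (5, 13)]
slope, intercept = fit_line(points)
for x in (4, 7):
    y_hat = intercept + slope * x
    print(x, round(y_hat, 2), kind_of_prediction(x, points))
```

Predictions labeled as extrapolation rely on the fitted line holding outside the observed range, which the data cannot confirm.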
Calculate class width
(Max data value - Min data value) / number of classes
Calculate class midpoint
(Min value in class + Max value in class) / 2
Calculate class boundaries
Subtract .5 from the min value in class, add .5 to the max value in class
Calculate relative frequency
Class frequency / Total Frequency
Calculate a five number summary for data
Min = lowest number in data set

Q2 = median

Q1 = median of the lower half

Q3 = median of the upper half

Max = Largest data value

Also, IQR (Interquartile range) is Q3 - Q1
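The five-number summary can be sketched as below. The quartile convention is an assumption taken from the card: Q1 and Q3 are the medians of the lower and upper halves, and when the number of values is odd the overall median is left out of both halves. The function name `five_number_summary` is illustrative.

```python
from statistics import median


def five_number_summary(data):
    """Min, Q1, median, Q3, max using the median-of-halves convention."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]           # lower half (median excluded when n is odd)
    upper = values[half + n % 2:]   # upper half (median excluded when n is odd)
    return {
        "min": values[0],
        "Q1": median(lower),
        "median": median(values),
        "Q3": median(upper),
        "max": values[-1],
    }


summary = five_number_summary([14, 2, 10, 6, 4, 12, 8])
print(summary, "IQR =", summary["Q3"] - summary["Q1"])
```

Other software may use different quartile conventions, so results can differ slightly from this sketch.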
What is the formula for E in a test where mu is unknown (standard deviation known)? What is the formula for n derived from it?
E = z(critical value) x standard deviation / square root of n

n = (z(critical value) x standard deviation / E)^2
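Both formulas on this card can be tried out with a short sketch. The function names and the sample numbers (z = 1.96, standard deviation 15) are assumptions chosen for illustration; the required sample size is rounded up to the next whole number, which is the usual convention.

```python
import math


def margin_of_error(z_critical, std_dev, n):
    """E = z * sigma / sqrt(n)."""
    return z_critical * std_dev / math.sqrt(n)


def sample_size(z_critical, std_dev, margin):
    """n = (z * sigma / E)^2, rounded up to a whole number of observations."""
    return math.ceil((z_critical * std_dev / margin) ** 2)


print(margin_of_error(1.96, 15, 36))  # about 4.9
print(sample_size(1.96, 15, 3))       # 96.04 rounds up to 97
```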
What is the formula for E in a test for proportion (p and q)? What is the formula for n derived from it?
E = z(critical) x square root of (pq/n)

n = pq x (z(critical) / E)^2

If there is no estimate of p and q given, use pq = 1/4
Calculate a point estimate (the value at the center of a confidence interval) for a proportion
p hat = r/n
Find the maximal margin of error for a test of proportion
E = z(critical value) x square root of (p hat x q hat / n)
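A minimal sketch of the proportion formulas follows: the point estimate p hat = r/n, the margin of error, and the resulting interval. The function name `proportion_interval` and the sample counts (60 successes in 100 trials) are hypothetical.

```python
import math


def proportion_interval(r, n, z_critical):
    """Return (p_hat, E, (lower, upper)) for r successes in n trials."""
    p_hat = r / n
    q_hat = 1 - p_hat
    margin = z_critical * math.sqrt(p_hat * q_hat / n)
    return p_hat, margin, (p_hat - margin, p_hat + margin)


p_hat, margin, (low, high) = proportion_interval(60, 100, 1.96)
print(f"p hat = {p_hat:.2f}, E = {margin:.3f}, interval = ({low:.3f}, {high:.3f})")
```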
Calculate a confidence interval for everything
point estimate - E < parameter being estimated < point estimate + E
What do confidence intervals like 95% give you?
Critical values. For z, just read the chart (1.96 for 95%).

For t, go down the left side to the closest degrees of freedom and then across to the matching confidence level. That gives you the critical value for t.

For Chi-Squared, use the degrees of freedom to go down the side, and then go over to the column that is the complement of the level of significance (5% level, go to .95)
Find a point estimate for the difference of two means (mu 1 - mu 2) from two samples, whether or not the standard deviations are known.
x bar 1 - x bar 2
What is the formula for E in a test where both mu and standard deviation are unknown for two samples of sizes n1 and n2?
E = t(critical value) x square root of (s1^2/n1 + s2^2/n2)
Calculate a point estimate when mu and standard deviation are unknown
It's x bar (the sample mean); it is given in the problem.
Calculate a point estimate when mu is unknown
It's x bar (the sample mean); it is given in the problem.
What is the maximal margin of error when mu is unknown for two sample means?
E = z(critical) x square root of (standard deviation1^2/n1 + standard deviation2^2/n2)
What is the formula for a test statistic where mu is unknown?
z = (x bar - mu (stated in H0)) / (standard deviation / square root of n)
What is the formula for a test statistic when mu and standard deviation are unknown?
t = (x bar - mu (stated in H0)) / (s / square root of n)
What is the formula for a test statistic of proportion?
z = (p hat - p) / square root of (pq/n), where p is the value stated in H0 and q = 1 - p
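The proportion test statistic can be sketched as follows. The function name `proportion_z` and the sample numbers are assumptions for illustration; `p_null` stands for the proportion stated in H0.

```python
import math


def proportion_z(r, n, p_null):
    """z = (p_hat - p) / sqrt(p * q / n), with p and q taken from H0."""
    p_hat = r / n
    q_null = 1 - p_null
    return (p_hat - p_null) / math.sqrt(p_null * q_null / n)


# 60 successes in 100 trials, testing H0: p = 0.5
print(round(proportion_z(60, 100, 0.5), 2))
```

Note that the standard error uses the hypothesized p, not p hat, because the statistic is computed under the assumption that H0 is true.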
For linear regression, the degree of freedom is
n - 2
What is the formula/formulas for a binomial distribution?
P(r) = nCr x p^r x q^(n-r)

mean (mu) = np

standard deviation = the square root of npq

when approximating to the normal use

z = (x - mu) / standard deviation
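The binomial formulas can be checked with a short sketch. The function names and the example values (n = 10, p = 0.5) are chosen for illustration only.

```python
import math


def binomial_pmf(r, n, p):
    """P(r) = nCr * p^r * q^(n - r) with q = 1 - p."""
    q = 1 - p
    return math.comb(n, r) * p**r * q ** (n - r)


def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(npq)."""
    return n * p, math.sqrt(n * p * (1 - p))


print(binomial_pmf(5, 10, 0.5))   # 252 / 1024 = 0.24609375
print(binomial_mean_sd(10, 0.5))  # mean 5.0, sd about 1.58
```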
What is the formula/formulas for a geometric distribution?
P(n) = p x q^(n-1)

mean (mu) = 1/p

standard deviation = (square root of q) / p
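A minimal sketch of the geometric formulas, where n is the trial on which the first success occurs. The function names and the value p = 0.2 are illustrative.

```python
import math


def geometric_pmf(n, p):
    """P(n) = p * q^(n - 1): first success on trial n."""
    return p * (1 - p) ** (n - 1)


def geometric_mean_sd(p):
    """Mean 1/p and standard deviation sqrt(q)/p."""
    return 1 / p, math.sqrt(1 - p) / p


print(geometric_pmf(3, 0.2))   # 0.2 * 0.8^2, about 0.128
print(geometric_mean_sd(0.2))  # mean 5.0, sd about 4.47
```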
What is the formula/formulas for a poisson distribution?
P(r) = e^(-lambda) x lambda^r / r!

mean (mu) = lambda

standard deviation = the square root of lambda

For approximating the binomial with the Poisson use

lambda = np
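The Poisson probability formula can be sketched as below, with the mean number of occurrences passed in as `rate`. The function name and the example values are assumptions for illustration. The second print compares a binomial probability with its Poisson approximation using rate = np.

```python
import math


def poisson_pmf(r, rate):
    """P(r) = e^(-rate) * rate^r / r!, where rate is the mean count."""
    return math.exp(-rate) * rate**r / math.factorial(r)


print(round(poisson_pmf(3, 2), 4))  # about 0.1804

# Poisson approximation to a binomial with n = 100 trials and p = 0.02
n, p, r = 100, 0.02, 3
exact = math.comb(n, r) * p**r * (1 - p) ** (n - r)
print(round(exact, 4), round(poisson_pmf(r, n * p), 4))
```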
What are the formulas for degree of freedom for Chi-Squared based tests?
(C-1)(R-1)

C = Number of columns

R = Number of Rows

or

k -1 = df

k = number of categories

(depends on the situation)
When do we reject H0 (null hypothesis)
When the P-value is less than or equal to a (the level of significance)
What is the test statistic for two proportions? How do you calculate p bar?
z = (p hat 1 - p hat 2) / square root of (p bar x q bar/n1 + p bar x q bar/n2)

p bar = total r / total n = (r1 + r2) / (n1 + n2), and q bar = 1 - p bar
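The two-proportion statistic can be sketched as follows, using the pooled estimate p bar in the standard error. The function name `two_proportion_z` and the sample counts are hypothetical.

```python
import math


def two_proportion_z(r1, n1, r2, n2):
    """z for H0: p1 = p2, using the pooled proportion p bar."""
    p_hat1, p_hat2 = r1 / n1, r2 / n2
    p_bar = (r1 + r2) / (n1 + n2)   # total successes / total trials
    q_bar = 1 - p_bar
    std_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (p_hat1 - p_hat2) / std_error


# 45 of 100 successes in sample 1, 30 of 100 in sample 2
print(round(two_proportion_z(45, 100, 30, 100), 2))
```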