145 Cards in this Set

  • Front
  • Back
Population
- Target group for inference
- Parameters are Numerical Characteristics
- Large, unobtainable, often hypothetical
Sample
- Sub group of population
- Statistics are Numerical Characteristics
- Obtainable, Small
μ (Mu)
Population Mean
x bar
Sample Mean
Regression Equation:
x
score on predictor variable
Simple Random Sampling
Independent Selection
REDUCES bias in generalizations
Research
Structured, scientific problem solving
Regression Equation:
a
y-intercept, value of y' when x=0
7 Topics that all Inferential Statistics have in Common
1) Use of Descriptive Statistics
2) Use of probability
3) Potential for estimation
4) Sampling Variability
5) Sampling Distributions
6) Use of a Theoretical Distribution
7) 2 Hypotheses, 2 Decisions, 2 Types of Errors
Steps of the Scientific Method
1) Encounter and identify problem
2) Formulate Hypotheses and Define Variables
3) Think through consequences of hypotheses
4) Design study, run it, collect data, compute statistic, test hypothesis
5) Draw Conclusions
Independent Variable (IV)
- Manipulated by researcher
- Researcher CHANGES values of the variable
- Comes first in time
Characteristics of Regression
1) linear only
2) generalize only for x values in your sample
3) y is different from y'; y=y'+e
4) error is e=y-y'
Dependent Variable (DV)
- Measured by researcher
- Follows IV
Extraneous Variable
- Should be controlled by researcher
- Competitors to IV
- Influence DV
Best Fitting Line
Stats b and a are computed so as to minimize the sum of squared errors, Σe^2 (Least Squares Principle)
Random Assignment
- Purpose: Control EV
- When: After random sampling; form groups out of entire sample
Variable
Entity that is free to take on different values
Partition Total Spread
- Total = Explained+NotExplained
- for both proportion of spread and amount of spread
Ways to control Extraneous Variables
1) Randomization of subjects to groups
2) Keep constant for all subjects
3) Include in design
Predictor Variable
Comes first in time, but not manipulated
Probability
Relative Frequency
Criterion Variable
Follows Predictor Variable
Sample Space
all possible outcomes of a research project
Operational Definition
The type of a variable is assigned depending on how it is used in the study
Types of Relationships
Causal, Predictive
Causal Relationships
IV causes DV
Keys:
a) manipulation of IV
b) randomization of subjects to groups
c) replication - N>1 for each group
Elementary Event
any one data point in sample space
Predictive Relationships
PV predicts CV
Keys:
a) no manipulation
b) no randomization of subjects to groups
c) have replication
Event
any collection of Elementary Events
Types of Research
True Experiment
Observational Research
True Experiment
a) Manipulation of Variable
b) Randomization of subjects to groups
c) Replication
Observational Research
a) No manipulation of Variable
b) No randomization of subjects
c) Replication
P(Elementary Event)
1 / (total # in sample space)
Quantitative Data
Data has numeric value
P(Event)
(# in event)/(total #)
Qualitative Data
Data has numeric label
Aspects of Data
Middle - central tendency, location, center
Spread - variability, dispersion
Skewness - departure from symmetry
Kurtosis - peakedness relative to normal curve
Measures of Middle
mean, median, mode, T20
Conditional Probability
P(A|B)=(# in A and B together) / (# in B)
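A minimal sketch of the counting rule above in Python (the counts are made up for illustration):

```python
# Hypothetical counts, chosen only to illustrate the formula
n_B = 8          # number of elementary events in B
n_A_and_B = 3    # number of elementary events in both A and B

# P(A|B) = (# in A and B together) / (# in B)
p_A_given_B = n_A_and_B / n_B
print(p_A_given_B)  # 0.375
```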
Measures of Spread
range, midrange, s*^2, s*, s^2, s
Standard/Unit Normal Distribution
Mu = 0
Sigma^2 = 1
Characteristics of a good measure of spread
1) Stat = 0 if spread is zero
2) As spread increases, stat increases
3) Stat measures just spread, not middle
Midrange (MR)
Upper Hinge - Lower Hinge
UH - LH
Median Position (MP)
(N + 1) / 2
Sampling Distributions Purpose
To get probabilities of a statistic in order to make inferences, and to get the information necessary to estimate parameters
Hinge Position (HP)
([MP] + 1) / 2
Whiskers
Lines drawn from a hinge to an adjacent value
Sampling Distributions Definition
The distribution of a statistic that could be formed by drawing all possible samples of a given size N from some population, computing the stat for each sample, and arranging these stats in a distribution
s*^2
Sample Variance
{Σ(x-xbar)^2} / N
3 things to know about sampling distributions of x bar
1) Mu of x bar = Mu
2) Sigma squared of x bar = sigma squared/N
3) Shape is normal IF
a) Population is normal, OR
b) N is large
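A quick simulation sketch of facts 1 and 2 (population values and sample size are made up; assumes NumPy):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2, N = 50.0, 100.0, 25  # hypothetical population mean, variance, sample size

# Draw many samples of size N and compute x-bar for each
xbars = rng.normal(mu, np.sqrt(sigma2), size=(100_000, N)).mean(axis=1)

print(xbars.mean())  # ~ mu = 50        (Mu of x bar = Mu)
print(xbars.var())   # ~ sigma2/N = 4   (Sigma squared of x bar = sigma squared / N)
```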
s*
Sample Standard Deviation
√ s*^2
s^2
Unbiased Variance Estimate
{Σ(x-xbar)^2} / (N – 1)
Central Limit Theorem
The shape of the sampling distribution of x bar is approximately normal if N is large, regardless of the population's shape
s
√(s^2)
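The four spread statistics above (s*^2, s*, s^2, s) differ only in the divisor, N versus N-1. A small sketch with toy data:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # toy data
N = len(x)

s_star_sq = ((x - x.mean()) ** 2).sum() / N        # s*^2: divides by N
s_sq      = ((x - x.mean()) ** 2).sum() / (N - 1)  # s^2: divides by N-1 (unbiased)

print(s_star_sq, np.sqrt(s_star_sq))  # s*^2 and s* (same as np.var(x), np.std(x))
print(s_sq, np.sqrt(s_sq))            # s^2 and s (np.var(x, ddof=1), np.std(x, ddof=1))
```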
Outliers
Any real data values outside whiskers
Unbiased
Mu of stat = desired parameter
z-score
aspect of data = relative position/standing

"something minus it mean divided by its standard deviation"
characteristics of z-scores
1) mean of a set = 0
2) variance of a set = 1
3) shape is the same as the shape of the distribution of the somethings
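A sketch of the z-score recipe and its first two characteristics (toy numbers):

```python
import numpy as np

x = np.array([3.0, 7.0, 7.0, 19.0])   # the "somethings"
z = (x - x.mean()) / x.std()          # something minus its mean, over its SD

print(z.mean())  # 0.0 (characteristic 1: mean of a set of z-scores = 0)
print(z.var())   # 1.0 (characteristic 2: variance of a set of z-scores = 1)
```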
Characteristics of Normal Distributions
1) symmetric, continuous, theoretical, unimodal
2) bell-shaped
3) scores range from -infinity to +infinity
4) has 2 parameters (mu and sigma squared)
Hypothesis testing
the process of testing tentative guesses about relationships between variables and populations
2 Keys for probability in N(0,1)
1) distribution symmetric
2) total area/probability = 1
test statistic
a statistic used only for the purpose of testing hypotheses
Correlation and Regression have in common . . .
1) x,y pairs of scores
2) linear relationships
Correlation
stat = r
Purpose is to measure the degree of linear relationship
Regression
Purpose: to measure the form (function) of the linear relationship
Prediction Equation: y'=bx+a
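A minimal least-squares sketch of y' = bx + a (toy x,y pairs; the slope formula is the standard least-squares solution):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # predictor scores
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])  # criterion scores

# b and a are chosen to minimize the sum of e^2 (Least Squares Principle)
b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
a = y.mean() - b * x.mean()              # y-intercept: value of y' when x = 0

y_pred = b * x + a                       # prediction equation y' = bx + a
e = y - y_pred                           # e = y - y'
print(b, a, (e ** 2).sum())
```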
Assumptions
Conditions placed on a test statistic necessary for its valid use in hyp. testing
Characteristics of r
1) works with 2 variables, x&y
2) -1.00 <= r <= 1.00
3) measures only linear relationships
4) r(squared) = proportion of variability in y that is explained by x
5) r undefined if x or y has zero spread
6) r is dimensionless
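A sketch of r computed as the mean product of z-scores, illustrating characteristics 2 and 4 (toy data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

zx = (x - x.mean()) / x.std()   # z-scores of x (undefined if x has zero spread)
zy = (y - y.mean()) / y.std()
r = (zx * zy).mean()            # equivalent to np.corrcoef(x, y)[0, 1]

print(r)        # always between -1.00 and 1.00
print(r ** 2)   # proportion of variability in y explained by x
```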
Assumptions for z of x bar
1) pop. of obs. normal
2) obs. are independent
Factors that distort r (as an estimate of the population correlation coefficient)
- restriction of range
- combining data
- outliers
Correlation does NOT imply . . .
causation
Regression Equation:
y'
predicted score on y (criterion variable)
Null Hypothesis
H sub naught: The hypothesis we test (decision to reject or retain)
Regression Equation:
b
slope
Alternative Hypothesis
H sub one: where we put what we believe
Significance level
the standard for what we mean by a small probability in hypothesis testing, alpha = .05
Directional hypothesis
any hypothesis with <, >, <=, or >=
Nondirectional hypotheses
do not specify direction, e.g., "not equal"
One-tailed Test
uses only one tail of the sampling distribution of the test statistic
Critical values
values of the test statistic that cut off alpha in the tail(s) of the sampling distribution
Rejection Values
values of the test stat for which we would reject H sub naught.
Critical Value Decision Rules
Reject null Hypothesis if the test stat is more extreme than a critical value
p-value Decision Rule
Reject Null if both are true:
1) .5(SAS p-value) <= alpha
2) the result (test statistic) agrees with the alternative
Type I Error
Reject Null Hypothesis given Null is true
Type II Error
Retain Null Hypothesis given Alternative is true
p(Type I Error)=
p(rej. Null|Null true) = alpha
p(Type II Error)=
p(ret. Null|Alternative true) = Beta
Effect size relationship to power
as Effect Size increases, power increases
N (sample size) relationship to power
as N increases, power increases
Sigma squared relationship to power
as sigma squared decreases, power increases
alpha's relationship to power
as alpha increases, power increases
directional hypothesis influence on power
gives best power if correct in predicting direction, but zero power if you are wrong
non-directional hypothesis influence on power
gives good power in both directions
properties of t-distribution
-A family of distributions
-Have one parameter = df
-Mu of t = 0
-sigma^2 of t = df/(df-2) > 1
-Symmetric, sort of bell-shaped
t=
(x bar-mu)/ sqroot(s^2/N)
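A sketch of the one-sample t computed by hand and checked against SciPy (the sample values and null mean are made up):

```python
import numpy as np
from scipy import stats

x = np.array([52.0, 48.0, 55.0, 51.0, 50.0, 49.0, 53.0])  # hypothetical sample
mu0 = 50.0                                                 # hypothesized mu

N = len(x)
t = (x.mean() - mu0) / np.sqrt(x.var(ddof=1) / N)  # (x bar - mu) / sqroot(s^2/N)

print(t, N - 1)                   # test statistic and df = N - 1
print(stats.ttest_1samp(x, mu0))  # SciPy computes the same t
```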
degrees of freedom
-parameter of t-distribution
-(# of independent components - #of statistics)
-for 1-sample t, =N-1
-associated with s^2
one-sample t assumptions
-population bivariate normal
-subjects are independent
Correlation
-one sample
-x,y pairs
Null: rho=0
Correlation statistic
r
Correlation degrees of freedom
N-2
2 ind sample t-test
-2 independent samples
-Null: mu1=mu2
-Sigma^2 unknown
-Use when n1=n2>=15
2 ind sample t-test degrees of freedom
n1+n2-2
2 independent sample t-test assumptions
-Populations of observations are normal
-sigma1^2=sigma2^2
-observations are independent
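A sketch of the pooled two-sample t via SciPy; equal_var=True corresponds to the sigma1^2 = sigma2^2 assumption (toy groups, smaller than the n >= 15 guideline):

```python
import numpy as np
from scipy import stats

g1 = np.array([5.0, 7.0, 6.0, 9.0, 8.0, 7.0])  # hypothetical group 1
g2 = np.array([4.0, 6.0, 5.0, 5.0, 7.0, 6.0])  # hypothetical group 2

# equal_var=True pools the variances, matching sigma1^2 = sigma2^2
t, p = stats.ttest_ind(g1, g2, equal_var=True)
print(t, p, len(g1) + len(g2) - 2)             # df = n1 + n2 - 2
```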
AWS t'
-2 samples
-independent samples
-Null:mu1=mu2
-sigma^2 unknown
-use when n1=n2<15 or n1!=n2
AWS t' test statistic
(xbar1 - xbar2) /
sqroot((s1^2/n1) + (s2^2/n2))
AWS t' degrees of freedom
n1+n2-2
AWS t' assumptions
-populations of observations are normal
-observations are independent
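A sketch of the t' statistic from the card above, checked against SciPy's Welch option (toy data with unequal n's):

```python
import numpy as np
from scipy import stats

g1 = np.array([5.0, 7.0, 6.0, 9.0])                  # n1 != n2, so use t'
g2 = np.array([4.0, 6.0, 5.0, 5.0, 7.0, 6.0, 8.0])

# (xbar1 - xbar2) / sqroot(s1^2/n1 + s2^2/n2)
t_prime = (g1.mean() - g2.mean()) / np.sqrt(
    g1.var(ddof=1) / len(g1) + g2.var(ddof=1) / len(g2))

print(t_prime)
print(stats.ttest_ind(g1, g2, equal_var=False))      # same statistic (Welch)
```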
2 dependent sample t
-2 samples
-dependent samples
-x,x pairs
-sigma^2 unknown
-Null:Mu(sub d)=0
2 dependent sample t statistic
(dbar - 0) / sqroot(s(sub-d)^2/N)

N=Number of pairs
2 dependent sample t statistic degrees of freedom
N-1

N=#of pairs
2 dependent sample t Assumptions
-population of d's is normal
-d's are independent
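A sketch of the dependent-sample t on difference scores, checked against SciPy (made-up pre/post pairs):

```python
import numpy as np
from scipy import stats

pre  = np.array([10.0, 12.0, 9.0, 14.0, 11.0])   # hypothetical x,x pairs
post = np.array([12.0, 13.0, 11.0, 15.0, 11.0])

d = post - pre                                   # difference scores
N = len(d)                                       # N = number of pairs
t = (d.mean() - 0) / np.sqrt(d.var(ddof=1) / N)  # (dbar - 0) / sqroot(s_d^2/N)

print(t, N - 1)                                  # df = N - 1
print(stats.ttest_rel(post, pre))                # SciPy's paired t agrees
```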
3 Ways of acquiring x,x pairs for 2 dependent sample t-test
1) researcher-produced - researcher matches on extraneous variable
2) naturally occurring - come to researcher already paired
3) repeated measures - pre & post measurements
Robustness assumptions
1) normality - if not met, robust
2) sigma1^2 = sigma2^2 - if not met, robust if n1 = n2 >= 15
3) independence - if not met, NOT robust
Robustness Definitions
1)the quality of a test stat when assumption is not met
2)the sampling distribution is well-fit by the theoretical distribution
3)alpha(true)~=alpha(set)=0.05: .04<=alpha(true)<=.06
1-way ANOVA
-for comparing multiple samples
-J=#groups, n=#obs, N=nJ
-Null: mu1 = mu2 = ... = mu(sub-J)
-Alternative: any difference in mu(sub-j)s
Logic of the Anova
Part 1
1) We find 2 sample variances, one based on the x-bars and one based on observations within groups; these 2 sample variances should both estimate sigma^2 if the Null is true, but estimate different quantities if the Alternative is true
Logic of the Anova
Part 2
2) We form an F-stat by putting the variance based on x-bars in the numerator and the variance based on observations in the denominator.
If Null true, F ~ 1.
If Alternative true, F > 1.
ANOVA based on xbar
n*s(sub-xbar)^2
Null: estimates sigma^2
Alternative: est. sigma^2 + positive quantity
ANOVA based on observations
s(sub-pooled)^2=sum(s(sub-j)^2)/J
Null: est. sigma^2
Alternative: est. sigma^2
ANOVA F
F = n*s(sub-xbar)^2 / (sum(s(sub-j)^2)/J)
Null: expect F~1
Alternative: expect F>1
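A sketch of this F built from the two variance estimates above and checked against SciPy's one-way ANOVA (toy groups with equal n, where the two forms coincide):

```python
import numpy as np
from scipy import stats

groups = [np.array([3.0, 5.0, 4.0, 6.0]),   # J = 3 toy groups, n = 4 each
          np.array([7.0, 8.0, 6.0, 9.0]),
          np.array([4.0, 5.0, 5.0, 6.0])]
n = len(groups[0])

xbars = np.array([g.mean() for g in groups])
num = n * xbars.var(ddof=1)                     # n*s(sub-xbar)^2, based on x-bars
den = np.mean([g.var(ddof=1) for g in groups])  # sum(s(sub-j)^2)/J, based on observations

print(num / den)                # F: expect ~1 under Null, >1 under Alternative
print(stats.f_oneway(*groups))  # SciPy's F matches for equal n's
```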
s*^2 Sampling Distribution
u=((N-1)/N)*sigma^2
shape: positively skewed
r Sampling Distribution
u = rho if rho = 0
shape: symmetric, not normal
s^2 Sampling Distribution
u=sigma^2
shape: positively skewed
MCP
Multiple Comparison Procedure
Pairwise Comparisons
For J means, there are C = J(J-1)/2 pairwise comparisons
MCP hypotheses
Null: u(sub-j)=u(sub-j')
Alt: u(sub-j)!=u(sub-j')
Error Rates for MCP
α'=alpha for each comp
c = # pairwise comparisons
p() = p(at least one Type I error)
--goal to keep p() small
α'<=p()<=1-(1-α')^c<=c(α')
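A numeric sketch of these bounds (J and alpha' are made up):

```python
# Hypothetical: J = 5 means, alpha' = .05 per comparison
J, alpha_prime = 5, 0.05

c = J * (J - 1) // 2                  # pairwise comparisons: C = J(J-1)/2 = 10
p_upper = 1 - (1 - alpha_prime) ** c  # 1 - (1 - alpha')^c

# alpha' <= p() <= 1-(1-alpha')^c <= c*alpha'
print(alpha_prime, p_upper, c * alpha_prime)  # 0.05 <= p() <= ~0.40 <= 0.50
```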
Error Rate per comp. (gives good power)
-Set α'=.05
-p() can be large
Error rate family-wise
Controls p() at α=.05 for c Comparisons by making α' small
Tukey MCP situation/hypothesis
J>=2 ind samples
H0: u(sub-j)=u(sub-j')
All p.w. comparisons
equal n's >=15
Tukey AND Fisher-Hayter test stat
t = (xbar(sub-j) - xbar(sub-j')) /
sqrt((MeanSquareWithin*2)/n)
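A sketch of one pairwise comparison, with the q/sqrt(2) critical value taken from SciPy's studentized range distribution (requires SciPy >= 1.7; toy groups, smaller than the n >= 15 guideline):

```python
import numpy as np
from scipy import stats

groups = [np.array([3.0, 5.0, 4.0, 6.0]),   # J = 3 toy groups, equal n = 4
          np.array([7.0, 8.0, 6.0, 9.0]),
          np.array([4.0, 5.0, 5.0, 6.0])]
J, n = len(groups), len(groups[0])
df_w = J * (n - 1)                          # degrees of freedom within

msw = np.mean([g.var(ddof=1) for g in groups])  # Mean Square Within (equal n's)

# t = (xbar_j - xbar_j') / sqrt(MSW * 2 / n)
t_12 = (groups[0].mean() - groups[1].mean()) / np.sqrt(msw * 2 / n)

# Tukey critical value: q / sqrt(2) with J, df_w, alpha = .05
crit = stats.studentized_range.ppf(0.95, J, df_w) / np.sqrt(2)
print(t_12, crit)  # reject H0 for this pair if |t_12| > crit
```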
Fisher-Hayter MCP situation/hypothesis
Overall ANOVA F must be significant
J>=2 ind samples
H0: u(sub-j)=u(sub-j')
All p.w. comparisons
equal n's >=15
Tukey and Fisher-Hayter assumptions
1.Pops. of obs. are normal
2.sigma(sub-j)^2=sigma(sub-j')^2
3.obs. are ind.
Tukey MCP distribution
q/sqrt(2)
J, degrees of freedom within, and alpha=.05
Fisher-Hayter MCP distribution
q/sqrt(2)
J-1, degrees of freedom within, and alpha=.05
Tukey MCP robustness
very similar to 2 independent sample t
2-Way ANOVA Situation/hypothesis
2 Factors, J=# levels of A, K=# levels of B, n obs/cell, N=nJK
A)H0:u1=u2...=uJ
B)H0:u1=u2...=uK
AB)H0:No Interaction Effect
2-Way ANOVA test stat
FA=MSA/MSW=(SSA/dfA)/(SSW/dfW)
(FB and FAB, substitute B and AB for A, respectively)
dfA=J-1, dfW=JK(n-1)
dfB=K-1, dfAB=(J-1)(K-1)
2-Way ANOVA distribution
FA~F(sub-(J-1,JK(n-1)))
FB~F(sub-(K-1,JK(n-1)))
FAB~F(sub-((J-1)(K-1),JK(n-1)))
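A sketch of the degrees-of-freedom bookkeeping from the two cards above, for a made-up 3x2 design:

```python
# Hypothetical design: J = 3 levels of A, K = 2 levels of B, n = 5 obs/cell
J, K, n = 3, 2, 5

df_A  = J - 1              # 2
df_B  = K - 1              # 1
df_AB = (J - 1) * (K - 1)  # 2
df_W  = J * K * (n - 1)    # 24
N     = n * J * K          # 30 total observations
print(df_A, df_B, df_AB, df_W, N)
```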
2-Way ANOVA assumptions
1.Pops of obs are normal
2.equal population variances for each cell
3.obs. are independent
Factor
a variable that classifies/groups subj.
1-Way ANOVA situation/hypothesis
1-factor
J>=2 independent groups
H0: u1 = u2 = ... = uJ
H1: any difference in the u(sub-j)s
1-Way ANOVA test statistic
F=MSB/MSW=(SSB/dfB)/(SSW/dfw)
dfB = J-1, dfW = N-J = J(n-1)
1-Way ANOVA distribution
F ~ F(sub-(J-1, N-J))
1-Way ANOVA assumptions
1.pops of obs are normal
2.sigma1^2=sigma2^2...=sigmaJ^2
3.obs are independent
Levels
a value of a factor (a 1-Way ANOVA has one factor)