104 Cards in this Set
- Front
- Back
significance threshold p<0.05
|
less than a 5% probability of getting a result this extreme by random chance alone
|
|
can only generalise findings if sample is.....
|
..... representative
|
|
probability sampling: SSC
|
simple random = everyone has an equal chance of being selected
stratified = sample reflects population characteristics (e.g. 70% male and 30% female) but is otherwise random
cluster = select units within the population that represent the wider population |
|
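A minimal sketch of proportionate stratified sampling, using the 70% male / 30% female split from the card above. The population, the `stratified_sample` helper and the fixed seed are hypothetical and for illustration only.

```python
import random

def stratified_sample(population, key, n, seed=0):
    """Draw a proportionate stratified random sample of about n units."""
    rng = random.Random(seed)
    # Group the population into strata using the key function.
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for members in strata.values():
        # Each stratum contributes in proportion to its share of the population,
        # and units within a stratum are chosen at random.
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 700 males and 300 females.
population = [("male", i) for i in range(700)] + [("female", i) for i in range(300)]
sample = stratified_sample(population, key=lambda unit: unit[0], n=100)

n_male = sum(1 for sex, _ in sample if sex == "male")
n_female = len(sample) - n_male
print(n_male, n_female)
```

The sample keeps the 70/30 split of the population while the choice of individuals inside each stratum stays random.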
non-probability sampling: QSOP
|
quota = collect data until the numbers needed are reached
snowball = start with a group and then meet similar people (good for investigating illegal activity)
opportunity = whoever happens to be there at the time
purposive = a specific group, e.g. women aged between 30-35 |
|
normal distribution characteristics
|
1. the theory of statistics is based on the observation of many naturally occurring interval-level variables
2. the distribution is bell shaped
3. symmetrical around the mid-point, at which the mode, median and mean are all equal
4. tails are indefinite and never meet the horizon
5. areas under the curve between the mid-point and 1, 2, 3 standard deviations are all known |
|
standard deviation SD
|
measure of dispersion/spread of data from the mean in a sample
|
|
z scores
|
how many SDs a value is away from the mean.
z = 0 = the mean
z = +1 = 1 SD above the mean
z = -1 = 1 SD below the mean
* gives info on the score of an individual compared to the rest of the sample
* standardises scores so they can be compared across studies and samples |
|
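A short sketch of how a z score can be computed, z = (value - mean) / SD. The test scores below are made up for illustration.

```python
import statistics

scores = [52, 60, 61, 65, 70, 74, 80]  # hypothetical test scores
mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (n - 1 in the denominator)

def z_score(value):
    """Number of SDs that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

for value in scores:
    print(value, round(z_score(value), 2))
```

A score equal to the mean gets z = 0, and scores below the mean get negative z values.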
type I error
|
null hypothesis is true but we reject it (false positive)
|
|
type II error
|
null hypothesis is false but we fail to reject it (false negative)
|
|
p<0.05 5% chance....
|
.... 1 in 20 that the result is due to random chance and not a genuine difference
|
|
chi square tests...
|
.... 2 categorical variables
|
|
correlation tests...
|
.... 2 continuous variables
|
|
multiple regression tests...
|
... a continuous DV and multiple IVs
|
|
logistic regression tests...
|
.. dichotomous DV
|
|
T-Test tests....
|
.... 2 groups, continuous DV
|
|
ANOVA tests...
|
... 3 + groups continuous DV
|
|
pretests on data...
|
.... ensure assumptions are met
|
|
Pearson's chi-square...
|
.... association between categorical data
|
|
crosstabulations...
|
.... compare distributions of frequencies within particular categorical conditions
|
|
degrees of freedom
|
extent to which data are free to vary. as complexity increases, df increases
|
|
assumptions of chi-square data
|
data are categorical. expected frequency of at least 5 within each cell
|
|
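A minimal sketch of a Pearson chi-square calculation on a crosstabulation, including the expected-frequency check and the degrees of freedom. The 2x2 counts are invented for illustration.

```python
# Hypothetical 2x2 crosstab: rows = group, columns = outcome.
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency for each cell = row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-square = sum over cells of (observed - expected)^2 / expected.
chi_square = 0.0
for obs_row, exp_row in zip(observed, expected):
    for o, e in zip(obs_row, exp_row):
        chi_square += (o - e) ** 2 / e

# df = (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption check: every expected frequency is at least 5.
assumption_met = all(e >= 5 for row in expected for e in row)

print(round(chi_square, 3), df, assumption_met)
```

The statistic is then compared against the chi-square distribution with `df` degrees of freedom to get the p value.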
p<0.001
|
less than 1 in 1000 chance
|
|
areas under the normal curve: each half, then mean to 1 SD, 1 to 2 SD, 2 to 3 SD, beyond 3 SD
|
50%, 34.13%, 13.59%, 2.15%, 0.13%
|
|
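The areas on the card can be checked against the standard normal curve. This sketch uses Python's `statistics.NormalDist`; note that the computed area between 2 and 3 SD is about 2.14%, which rounded tables often give as 2.15% so that the half-curve sums to exactly 50%.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

half_curve = z.cdf(0)                      # area on one side of the mean
mean_to_1sd = z.cdf(1) - z.cdf(0)          # about 34.13%
one_to_2sd = z.cdf(2) - z.cdf(1)           # about 13.59%
two_to_3sd = z.cdf(3) - z.cdf(2)           # about 2.14%
beyond_3sd = 1 - z.cdf(3)                  # about 0.13%
within_2sd = z.cdf(2) - z.cdf(-2)          # both sides, about 95.45%

for name, area in [("half", half_curve), ("0-1", mean_to_1sd),
                   ("1-2", one_to_2sd), ("2-3", two_to_3sd),
                   (">3", beyond_3sd), ("+/-2", within_2sd)]:
    print(name, f"{area:.2%}")
```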
sampling distributions
|
approximately normally distributed (for reasonably large samples), because averaging within each sample pulls any extremes in towards the mean
|
|
standard deviation when individual data
|
standard error when in groups
|
|
standard error of the mean
|
standard deviation of sampling distribution of the mean
|
|
95.44% of sample means will have the population mean within..
|
.... 2 standard error of the means
therefore population mean likely to lie within 2 std errors of the sample mean, 95.44 times out of 100 |
|
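A small sketch of the standard error of the mean and the "within 2 standard errors" range described above. The sample values are hypothetical, and the SE is estimated from a single sample as SD / sqrt(n).

```python
import math
import statistics

sample = [52, 60, 61, 65, 70, 74, 80]  # hypothetical sample of scores
n = len(sample)
sample_mean = statistics.mean(sample)
sd = statistics.stdev(sample)          # spread of individual scores

# Standard error of the mean: estimated spread of sample means.
se = sd / math.sqrt(n)

# Range in which the population mean is likely to lie (roughly 95% of the time).
lower = sample_mean - 2 * se
upper = sample_mean + 2 * se

print(round(se, 2), round(lower, 2), round(upper, 2))
```

The standard error is always smaller than the SD for n > 1, which reflects that sample means vary less than individual scores.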
1 sample T test..
|
... is sample significantly diff from wider population
|
|
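A sketch of the one-sample t statistic, t = (sample mean - population mean) / standard error, with df = n - 1. The sample and the population mean of 60 are assumptions for illustration; the p value would be looked up from the t distribution, which the standard library does not provide.

```python
import math
import statistics

sample = [52, 60, 61, 65, 70, 74, 80]  # hypothetical sample scores
population_mean = 60                   # hypothetical wider-population mean

n = len(sample)
sample_mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)

# How many standard errors the sample mean is from the population mean.
t_stat = (sample_mean - population_mean) / se
df = n - 1

print(round(t_stat, 3), df)
```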
parametric tests
|
T-tests, ANOVA (homogeneity of variance assumptions apply with independent tests)
|
|
Levene's is a pre-test in t-test statistics
|
if significant, the assumption of homogeneity of variance is violated and equal variances are not assumed. has to be done before interpreting the t-test
|
|
ANOVA
|
analysis of variance: tests differences between groups on interval-level data (extension of the t-test)
|
|
ANOVA... example
|
DV = test scores; IV = university, with 3 groups
|
|
the higher the F statistic, the greater the variance between groups relative to within groups
|
higher F = more confident we are that the groups are different
|
|
f(2,21)
|
2= DF between 21= DF within
|
|
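A minimal sketch of a one-way ANOVA F statistic computed by hand, set up so that the degrees of freedom come out as F(2, 21): 3 groups of 8 scores. The university names and scores are invented for illustration.

```python
import statistics

# Hypothetical test scores for students at 3 universities (8 per group).
groups = {
    "uni_a": [60, 62, 65, 58, 61, 63, 64, 59],
    "uni_b": [70, 72, 68, 71, 69, 73, 74, 67],
    "uni_c": [55, 57, 54, 56, 58, 53, 52, 59],
}

all_scores = [score for group in groups.values() for score in group]
grand_mean = statistics.mean(all_scores)

# Between-groups sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2
                 for g in groups.values())
# Within-groups sum of squares: how far each score is from its own group mean.
ss_within = sum((score - statistics.mean(g)) ** 2
                for g in groups.values() for score in g)

df_between = len(groups) - 1               # number of groups - 1
df_within = len(all_scores) - len(groups)  # total N - number of groups

# F = between-groups variance / within-groups variance.
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"F({df_between},{df_within}) = {f_stat:.2f}")
```

A large F here reflects group means that differ far more than the scores vary inside each group.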
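Tukey HSD adjusts for the family-wise error rate
|
shows "honestly" exactly which groups are significantly different.
controls the type I error rate across all pairwise comparisons (dividing 0.05 by the 3 comparisons is the Bonferroni-style version of this adjustment) |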
|
Tukey post hoc test
|
after-the-event analysis
|
|
Levene's test, pre-test
|
if significant, interpret the Welch result instead
|
|
Pearson's correlation
|
tests the correlation between 2 variables measured at interval/ratio level
|
|
pearson correlation magnitude
|
0.1-0.3 = weak
0.3-0.59 = moderate
0.6-0.99 = strong
1 = perfect |
|
r square
|
proportion of variance in the DV accounted for by the IV, expressed as a percentage (r2 = 0.25 = 25%)
|
|
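A short sketch of Pearson's r and r2 computed from first principles. The data (hours studied and test score) and the `pearson_r` helper are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]        # hypothetical IV
score = [52, 55, 61, 64, 70]   # hypothetical DV

r = pearson_r(hours, score)
r_squared = r ** 2  # proportion of variance in the DV accounted for by the IV

print(round(r, 3), f"{r_squared:.1%}")
```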
r square is high 80-90%...
|
... good thing, as the line fits well: a larger value means little is left to explain about y beyond its relationship with x.
if closer to 0 then other variables are needed to explain y beyond the one already tried |
|
assumptions of regression
|
normality of variation around the line of regression
independence of variation around the line of regression |
|
reporting findings of multiple regression:
significance of model, variance accounted for, interpreting describing findings |
F statistic
r2
beta |
|
beta coefficient (Howell, 2002)
|
the unique contribution of each variable to the predictions of Y (DV)
|
|
N can't be less than 50
|
sample size can't be less than 50; the recommendation is 100
|
|
logistic regression
|
DV is categorical (dichotomous), e.g. targets met or not met
|
|
logistic regression assumptions
|
IVs are continuous or dichotomous
does NOT assume continuous IVs are normally distributed
assumes no multicollinearity
DV must be coded 0 or 1 (predicts likelihood of being a 1 rather than a 0) |
|
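A minimal sketch of how a logistic regression turns a predictor into the likelihood of the DV being 1 rather than 0, via the logistic function. The intercept and slope are made-up values, not fitted to any data, and `predicted_probability` is a hypothetical helper.

```python
import math

def predicted_probability(x, intercept, slope):
    """Probability that the DV is coded 1, given a single predictor x."""
    log_odds = intercept + slope * x
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical coefficients: each extra unit of x raises the log-odds by 0.8.
intercept, slope = -2.0, 0.8

for x in [0, 2.5, 5]:
    p = predicted_probability(x, intercept, slope)
    predicted_class = 1 if p >= 0.5 else 0  # e.g. target met (1) or not met (0)
    print(x, round(p, 3), predicted_class)
```

The output is always between 0 and 1, which is why the DV has to be coded 0 or 1.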
zero order correlation aka
|
pearson r coefficient
|
|
parametric assumptions
|
assumptions about the characteristics of the population
tests = t-tests and ANOVA
1. random sampling
2. no bias
3. normal distribution
4. minimum of interval-level data
5. homogeneity of variance of different samples (only with independent tests, not related tests) |
|
parametric assumptions important...
|
... tests are attempting to estimate unknown population parameters by using sample statistics. parameters are constrained by assumptions
|
|
parametric assumptions: the paradox of sampling....
|
.... a sample is misleading unless it is representative of the population, but how can you tell if it is representative unless you already know what you need to find out
|
|
insufficient questions
|
wording
clarity of terminology
scales of response, e.g. Likert
appropriateness of topics |
|
construct
|
concept trying to measure
|
|
indicator
|
what we use to tap into that construct
|
|
single indicators are poor measures
|
prone to measurement error, unreliable
|
|
composite measures..
|
... constructed from multiple indicators in the form of an average: the same question asked in different ways over again
|
|
face validity
|
looking like will measure construct
|
|
criterion validity
|
does it demonstrably measure the construct, judged against an external criterion
|
|
factor analysis
|
assesses criterion and discriminant validity (appropriate multi indicator measures and latent constructs)
|
|
factor loadings indicate
|
strength of associations between indicator and latent construct
|
|
latent constructs
|
themes underlying all questions
|
|
FOWLER surveys answers of interest...
|
... not intrinsically but because of their relationship to something they are supposed to measure
|
|
reliability...
|
.... making sure that diff in answers stems from diff among respondents not diff in stimuli exposed to
|
|
bad wording
|
age last birthday
|
|
defining breakfast
|
meal before 10am
|
|
not asking multiple questions
|
e.g. "do you want to be rich and famous?"
|
|
avoid why questions
|
?
|
|
validity subjective states
|
cannot be verified, opinions cant be verified
|
|
nominal
|
ppl sorted into unordered categories
|
|
ordinal
|
ppl ordered into category along single dimension, good, bad
|
|
interval
|
numbers attached, providing meaningful info about distance between stimuli
|
|
ratio
|
ratios between values are meaningful as well as the intervals between them, e.g. distance, weight
|
|
open q's good..
|
.... obtain answers unanticipated
|
|
closed q's good...
|
... more reliable when alternatives are given
with an answer scale, can compare across groups and time |
|
ask about events in last six months
|
even if you want to know about the last year, as longer recall periods increase reporting error
|
|
sensitive or embarrassing q's...
|
... emphasise no judgement
self-administered rather than face to face
confidentiality and anonymity |
|
reducing measurement error in question design
|
one of least costly ways of improving survey estimate
|
|
5 key characteristics of theoretically normal distribution
|
1. theory of statistics based upon observation of many naturally occurring interval-level variables
2. distribution is bell shaped
3. curve symmetrical around a single mid-point at which mode/median/mean all fall and are equal to each other
4. tails indefinite, never quite meet beyond the horizon
5. areas under the curve between the mid-point and 1, 2, 3 SDs are known |
|
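reading z scores: how many SDs any 1 score is away from the mean, and the area of the curve in each band
|
each half of the curve = 50%; mean to 1 SD = 34.13%; 1 to 2 SD = 13.59%; 2 to 3 SD = 2.15%; beyond 3 SD = 0.13%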
|
|
sampling distribution
|
collecting the mean a number of different times from a number of different samples
|
|
standard deviation=measure of dispersion/spread away from mean
|
standard error of the mean = SD of the sampling distribution of the means
difference: SD is the spread of individual data points; standard error of the mean is the spread of sample means |
|
approaches to surveying/collecting a sample:
probability: SSC (simple random, stratified, cluster) |
non-probability: QSOP
(quota, snowball, opportunity, purposive) |
|
simple random
|
everyone in pop equal chance of being asked
|
|
stratified random
|
reflecting characteristics randomly
|
|
cluster
|
population too large for simple random or stratified sampling, so units within the population represent the wider population
|
|
quota
|
collecting data until reach targets
|
|
snowball
|
start with a small group and find similar ppl
|
|
opportunity
|
happen to be there at time
|
|
purposive
|
women aged between 35-39
specific |
|
confidence intervals
|
where most samples lie in relation to the wider population. 68.26% of sample means have the population mean within 1 standard error
|
|
95.44% confidence within 2 standard errors of the mean
|
95.44 times out of 100 the population mean will lie between -2 and +2 standard errors of the sample mean
|
|
parametric assumptions apply in t-tests and ANOVAs: assumptions concerning the underlying populations where samples come from
|
1. random sampling
2. no bias
3. minimum of interval-level data (weight/test score)
4. homogeneity of variance of different sample means (only for independent tests, not related tests)
5. normal distribution |
|
if levenes IS SIG
|
NOT homogeneous variances, do NOT assume equal variance
|
|
need parametric tests as tests...
|
... attempt to estimate unknown population parameters by using sample statistics
|
|
paradox need to know about population
|
without being able to test whole population
|
|
pearsons correlation coefficient aka
|
zero order coefficient
|
|
report Pearson's as..
|
... r=0.97, p<0.001
|
|
r2 variance accounted for in other variables
|
variance in DV accounted for by IV
|
|
significance of model: F statistic
|
F(1, 7) = 122.930, p<0.001, where 1 = df regression and 7 = df residual (F as in ANOVA)
|
|
reporting a beta
|
beta=0.973, p<0.001
|
|
MINIMUM OF 50 CASES NEEDED
|
RECOMMEND AT LEAST 100. N can be NO less than 50
|
|
f parametric statistic as ANOVA
|
variance in DV accounted for by IV's in relation to unaccounted for (are there IV's missing from the model that could explain the DV better)
|
|
beta coefficient
|
indication of the importance of particular IVs in the prediction of a DV
indicates magnitude of importance and direction of relationship
TESTED FOR SIG using a T-TEST |
|
crosstabulations/contingency tables
|
compare distributions of frequencies within particular categorical conditions
display a relationship between 2 or more dichotomous variables |