60 Cards in this Set
- Front
- Back
ANOVA
|
Analysis of Variance
|
|
what is ANOVA used for
|
Used as a test of means for two or more populations
|
|
(T/F) The null hypothesis in an ANOVA is usually that all means are not equal
|
FALSE: all means are equal
|
|
ANOVA must have a ____ that is metric (measured using an interval or ratio scale)
|
Dependent Variable
|
|
ANOVA must contain one or more independent variables that are ____
|
Categorical (non-metric)
|
|
What are Categorical Independent Variables also called
|
Factors
|
|
Treatment
|
a particular combination of factor levels or categories
|
|
____ involves only one categorical variable, or a single factor
|
One-Way ANOVA
|
|
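As an illustration of the one-way case, the sketch below computes the F ratio (between-group mean square divided by within-group mean square) for one factor with three levels. The function name `one_way_anova_f` and the sample data are assumptions made for this example, not part of the card set.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F ratio for a one-way ANOVA over a list of samples (one per factor level)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of factor levels
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical metric dependent variable measured under three treatments
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio = one_way_anova_f(groups)
print(f_ratio)
```

A large F ratio relative to the F distribution with (k - 1, n - k) degrees of freedom is evidence against the null hypothesis that all means are equal.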
If two or more factors are involved in an ANOVA, the analysis is termed ____
|
N-Way ANOVA
|
|
If a set of independent variables consists of both categorical and metric variables, the technique is called ____
|
Analysis of Covariance (ANCOVA)
|
|
The metric independent variables are known as ____
|
Covariates
|
|
Assumptions of ANOVA
|
(1) The error term is normally distributed with a zero mean
(2) The error term has a constant variance
(3) The error is not related to any of the categories of X
(4) The error terms are uncorrelated; if the error terms are correlated, the F ratio can be distorted
|
|
Product Moment Correlation
|
(denoted as r) Summarizes the strength of association between two metric (interval or ratio scaled) variables
|
|
Product moment correlation is only valid when the relationship between the variables is ____
|
Linear
|
|
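A minimal sketch of why linearity matters for r: the helper `pearson_r` below (a name chosen for this example) computes the product moment correlation, first for a perfectly linear relationship and then for a perfectly curved one. The data are made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment correlation between two metric variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]
linear = [2 * v + 1 for v in x]      # exact straight-line relationship
curved = [(v - 3) ** 2 for v in x]   # exact U-shaped relationship

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)
print(r_linear, r_curved)
```

The curved variable is completely determined by x, yet r is zero, because r only summarizes the strength of a linear association.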
Regression analysis
|
Examines associative relationships between a metric dependent variable and one or more independent variables
|
|
Uses of Regression Analysis
|
(1) determine whether a relationship exists
(2) determine the strength of the relationship
(3) determine the structure/form of the relationship
(4) predict the values of the dependent variable
(5) control for other independent variables when evaluating the contributions of a specific variable
|
|
___ is the slope obtained by the regression of Y on X when the data are standardized (also termed the beta coefficient or beta weight)
|
Standardized Regression Coefficient
|
|
____ is obtained when the distances of all the points from the regression line are squared and added together
|
Sum of Squared Errors
|
|
____ is a plot of the values of two variables
|
Scattergram
|
|
Standardization
|
the process by which the raw data are transformed into new variables having a mean of 0 and a variance of 1
|
|
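The sketch below illustrates standardization and the standardized regression coefficient together. It assumes a bivariate regression of Y on X, uses the population standard deviation, and the helper names `standardize` and `slope` and the data are invented for this example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Transform raw data to have a mean of 0 and a variance of 1."""
    m = mean(values)
    s = pstdev(values)   # population standard deviation (an assumption of this sketch)
    return [(v - m) / s for v in values]

def slope(x, y):
    """Least-squares slope from the regression of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

zx, zy = standardize(x), standardize(y)
raw_slope = slope(x, y)   # slope in the original units
beta = slope(zx, zy)      # slope after standardizing both variables
print(raw_slope, beta)
```

In the bivariate case the standardized slope equals the product moment correlation r between X and Y, which is why beta weights can be compared across variables measured in different units.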
____ is the strength of association that is measured by R^2
|
Coefficient of Multiple Determination
|
|
Residual
|
the difference between the observed value of Y_i and the value predicted by the regression equation, Ŷ_i
|
|
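To connect residuals, the sum of squared errors, and R^2, here is a small worked sketch for a bivariate regression. The data and variable names are assumptions for illustration; with several predictors the same R^2 formula gives the coefficient of multiple determination.

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares estimates for the line Y = intercept + slope * X
slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
    (a - mean_x) ** 2 for a in x
)
intercept = mean_y - slope * mean_x

# Residual: observed value minus the value predicted by the regression equation
predicted = [intercept + slope * a for a in x]
residuals = [obs - pred for obs, pred in zip(y, predicted)]

# Sum of squared errors: squared distances from the regression line, added together
sse = sum(e ** 2 for e in residuals)

# R^2: share of the total variation in Y explained by the regression
sst = sum((b - mean_y) ** 2 for b in y)
r_squared = 1 - sse / sst
print(sse, r_squared)
```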
____ arises when intercorrelations among the predictors are very high
|
Multicollinearity
|
|
Multicollinearity can result in what problems
|
(1) The partial regression coefficients may not be estimated precisely (the standard errors are likely to be high)
(2) It becomes difficult to assess the relative importance of the independent variables in explaining the variation in the dependent variable
|
|
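One common way to quantify how much precision is lost is the variance inflation factor (VIF), which is not mentioned in the cards and is brought in here only as an illustration. For two predictors it reduces to 1 / (1 - r^2), where r is the correlation between them. The function names and data below are assumptions for this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Product moment correlation between two predictors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

x1 = [1, 2, 3, 4, 5]
x2_near_copy = [1.1, 1.9, 3.2, 3.9, 5.1]   # almost a duplicate of x1
x2_unrelated = [4, 1, 0, 1, 4]             # uncorrelated with x1

vif_high = vif_two_predictors(x1, x2_near_copy)
vif_none = vif_two_predictors(x1, x2_unrelated)
print(vif_high, vif_none)
```

A VIF of 1 means the predictor's coefficient variance is not inflated at all; very large values signal that the standard errors of the partial regression coefficients are likely to be high.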
____ is a class of procedures used for data reduction and summarization
|
Factor Analysis
|
|
Factor Analysis is a ____ technique: no distinction between dependent and independent variables
|
Interdependence
|
|
What is factor analysis used for
|
(1) To identify underlying dimensions that explain the correlations among a set of variables
(2) To identify a new, smaller, set of uncorrelated variables to replace the original set of correlated variables |
|
____ are underlying dimensions in factor analysis that explain the correlations among a set of variables
|
Factors
|
|
in the Factor Analysis Model, the first set of weights are chosen so the first factor explains what
|
the largest portion of the total variance
|
|
in the Factor Analysis Model, the second set of weights can be selected so the second factor explains most of what
|
the residual variance, subject to being uncorrelated with the first factor
|
|
Statistics associated with factor analysis
|
Bartlett's test of sphericity; Correlation matrix; Communality; Eigenvalue; Factor loadings; Factor matrix; Factor scores; KMO measure of sampling adequacy; Percentage of variance; Scree plot
|
|
____ is used to test the hypothesis that the variables are uncorrelated in the population
|
Bartlett's test of sphericity
|
|
____ is a lower triangle matrix showing the simple correlations between all possible pairs of variables included in the analysis
|
Correlation Matrix
|
|
____ is the amount of variance a variable shares with all the other variables
|
Communality
|
|
Eigenvalue represents what
|
the total variance explained by each factor
|
|
____ are correlations between the variables and the factors
|
Factor Loadings
|
|
____ contains the factor loadings of all the variables on the factors
|
Factor Matrix
|
|
____ are composite scores estimated for each respondent on the derived factors
|
Factor Scores
|
|
The KMO measure of sampling adequacy is used for what
|
to examine the appropriateness of factor analysis
|
|
____ is the percentage of the total variance attributed to each factor
|
Percentage of Variance
|
|
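The relationship between eigenvalues, percentage of variance, factor loadings, and communalities can be sketched for the simplest possible case: two standardized variables with correlation r, analyzed by principal components. For the correlation matrix [[1, r], [r, 1]] the eigenvalues are known in closed form as 1 + r and 1 - r. The value r = 0.6 is an assumption chosen for illustration.

```python
from math import sqrt

r = 0.6       # assumed correlation between the two variables (r >= 0)
n_vars = 2

# Eigenvalues of [[1, r], [r, 1]], largest first
eigenvalues = [1 + r, 1 - r]

# Percentage of variance: each eigenvalue divided by the number of variables
pct_variance = [100 * ev / n_vars for ev in eigenvalues]

# The first unit eigenvector is (1/sqrt(2), 1/sqrt(2));
# a loading is the eigenvector entry scaled by sqrt(eigenvalue)
loading = sqrt(eigenvalues[0]) / sqrt(2)
loadings_factor1 = [loading, loading]

# Communality of each variable when only the first factor is retained
communalities = [l ** 2 for l in loadings_factor1]

print(eigenvalues, pct_variance, loadings_factor1, communalities)
```

The eigenvalues sum to the number of variables, and the squared loadings on a factor sum to that factor's eigenvalue, which is why the eigenvalue can be read as the total variance explained by the factor.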
____ is the plot of the Eigenvalues against the number of factors in order of extraction
|
Scree Plot
|
|
Factor Analysis process
|
(1) Problem formulation
(2) Construction of the correlation matrix
(3) Method of factor analysis
(4) Determination of number of factors
(5) Rotation of factors
(6) Interpretation of factors
(7) Calculation of factor scores
(8) Determination of model fit
|
|
In ____, the total variance in the data is considered. This method of factor analysis is used to determine the minimum number of factors that will account for the maximum variance in the data
|
Principal Components Analysis
|
|
In ____, the factors are estimated based only on the common variance
-Communalities are inserted in the diagonal of the correlation matrix
-Used to identify the underlying dimensions and when the common variance is of interest
|
Common Factor Analysis
|
|
____ uses a plot of the eigenvalues against the number of factors in order of extraction; the point at which the scree begins denotes the true number of factors
|
Determination Based on Scree Plot
|
|
Describe the results of the principal component analysis
|
-the lower left triangle is the reproduced correlation matrix;
-the diagonal has the communalities;
-the upper right has the residuals between the observed correlations and the reproduced correlations
|
|
____ is used to classify objects into homogeneous groups called clusters
|
Cluster Analysis
|
|
(T/F) Both cluster analysis and discriminant analysis are concerned with classification
|
True
|
|
Does discriminant analysis require prior knowledge of group membership
|
Yes
|
|
Cluster Analysis Process
|
(1) formulate the problem
(2) select a distance measure
(3) select a clustering procedure
(4) decide on the number of clusters
(5) interpret and profile clusters
(6) assess the validity of clustering
|
|
what is the most commonly used measure of similarity in the cluster analysis process
|
Euclidean Distance
|
|
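For reference, Euclidean distance is the square root of the sum of squared differences across all variables. The function name `euclidean_distance` and the two example objects are assumptions for this sketch.

```python
from math import sqrt

def euclidean_distance(a, b):
    """Square root of the sum of squared differences between two objects."""
    return sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Two hypothetical respondents measured on two variables
respondent_a = (0, 0)
respondent_b = (3, 4)
d = euclidean_distance(respondent_a, respondent_b)
print(d)
```

Smaller distances mean more similar objects. Because the measure depends on the units of each variable, data are often standardized before clustering.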
Hierarchical Clustering Methods
|
Agglomerative Clustering and Divisive Clustering
|
|
____ is characterized by the development of a hierarchy or tree-like structure
|
Hierarchical Clustering
|
|
____ starts with each object in a separate cluster
|
Agglomerative Clustering
|
|
How are clusters formed in agglomerative clustering
|
by grouping objects into bigger and bigger clusters
|
|
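A minimal sketch of the agglomerative idea, assuming one-dimensional data and using the nearest pair of members to decide which clusters to merge: every object starts in its own cluster, and the two closest clusters are merged repeatedly until one cluster remains. The function name `agglomerate` and the data are invented for illustration.

```python
def agglomerate(points):
    """Merge 1-D points bottom-up and return the list of clusters after each step."""
    clusters = [[p] for p in points]              # each object starts in its own cluster
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest to each other
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        history.append([list(c) for c in clusters])
    return history

history = agglomerate([1, 2, 6, 7, 15])
for step in history:
    print(step)
```

Reading the history from first to last step traces the tree-like structure from the bottom up: five singletons, then bigger and bigger clusters, then a single cluster containing every object.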
____ starts with all the objects grouped in a single cluster
|
Divisive Clustering
|
|
In divisive clustering, clusters are ____ until each object is in a separate cluster
|
Divided or Split
|
|
Hierarchical Agglomerative Clustering: Linkage Methods
|
Single Linkage; Complete Linkage; Average Linkage
|
|
The _____ method is based on minimum distance or the nearest neighbor rule
|
Single Linkage
|
|
The ____ method is based on the maximum distance or the furthest neighbor approach
|
Complete Linkage
|
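The three linkage methods differ only in how the distance between two clusters is defined. The sketch below assumes one-dimensional data and absolute difference as the distance; the function names and the two clusters are made up for this example.

```python
def pairwise_distances(cluster_a, cluster_b):
    """All distances between a member of one cluster and a member of the other."""
    return [abs(p - q) for p in cluster_a for q in cluster_b]

def single_linkage(cluster_a, cluster_b):
    """Minimum distance: the nearest neighbor rule."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Maximum distance: the furthest neighbor approach."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Average of the distances between all pairs of objects."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

a = [1, 2]
b = [6, 9]
print(single_linkage(a, b), complete_linkage(a, b), average_linkage(a, b))
```

At each agglomerative step, the two clusters with the smallest value under the chosen rule are the ones merged, so the choice of linkage can change which clusters form.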