74 Cards in this Set

  • Front
  • Back
If two independent large samples are taken from two populations, the sampling distribution of the difference between the two sample means
can be approximated by a normal distribution
An estimate of the variance of a population based on the combination of two sample results is known as the
pooled variance estimate
The pooled variance is appropriate whenever the two populations
are normally distributed and have equal variances
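A minimal sketch of the pooled variance estimate, assuming the usual textbook formula s_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2). The function name and the sample data are made up for illustration.

```python
from statistics import variance  # sample variance (divides by n - 1)

def pooled_variance(sample1, sample2):
    """Combine two sample variances, weighting each by its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = variance(sample1)
    s2_sq = variance(sample2)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Hypothetical data: sample variances are 4.0 and 2.5
a = [4.0, 6.0, 8.0]
b = [1.0, 2.0, 3.0, 4.0, 5.0]
pooled = pooled_variance(a, b)
print(pooled)
```

The larger sample gets more weight, so the pooled value lands between the two sample variances.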
When each data value in one sample is matched with a corresponding data value in another sample, the samples are known as
matched samples
In an analysis of variance, one estimate of σ² is based upon the differences between the treatment means and the
overall sample mean
The F ratio in a completely randomized ANOVA is the ratio of
MSTR/MSE (mean square due to treatments divided by mean square due to error)
The variable of interest in an ANOVA procedure is called
a factor
In the ANOVA, treatment refers to
different levels of a factor
An experimental design where the experimental units are randomly assigned to the treatments is known as
completely randomized design
The EPA takes a random sample of 6 readings for each city and finds the mean and variance. The populations are assumed to be normal with unknown but equal variances. The EPA should use a
t test for difference in two means with independent data (pooled t test)
the mean square is the sum of squares divided by
its corresponding degrees of freedom
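The cards above on mean squares and the F ratio can be sketched as follows. This assumes a completely randomized (one-way) design with k treatments and n_T total observations, so the treatment mean square has k − 1 degrees of freedom and the error mean square has n_T − k. The function name and data are illustrative only.

```python
def one_way_anova_f(groups):
    """Return (MSTR, MSE, F) for a list of treatment samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-treatments sum of squares: treatment means vs. overall mean
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-treatments (error) sum of squares
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    mstr = sstr / (k - 1)        # sum of squares / its degrees of freedom
    mse = sse / (n_total - k)
    return mstr, mse, mstr / mse

# Hypothetical data for three treatments
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
mstr, mse, f_ratio = one_way_anova_f(groups)
print(mstr, mse, f_ratio)
```

A large F ratio means the treatment means differ by more than the within-treatment variation would explain.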
in factorial designs, the response produced when the treatments of one factor interact with the treatments of another in influencing the response variable is known as
interaction
the number of times each experimental condition is observed in a factorial design is known as
replication
assumptions of analysis of variance do not include
equality of means
when the variances of two independent samples are combined and s² is computed, s² is referred to as
the pooled estimator of σ²
the ANOVA procedure is a statistical approach for determining whether or not the means of
two or more populations are equal
in testing the difference between the means of two normally distributed populations using independent random samples, the pooled estimate of the variance is used if
population variances are assumed to be equal
Regression analysis is a statistical procedure for developing a mathematical equation that describes how
one dependent and one or more independent variables are related
A procedure used for finding the equation of a straight line which provides the best approximation for the relationship between the independent and dependent variables is the
least squares method
Application of the least squares method results in values of the y intercept and the slope which minimizes the sum of the squared deviations between the
observed values of the dependent variable and the estimated values of the dependent variable
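As a rough illustration of the least squares method for one independent variable, the sketch below uses the standard closed-form slope and intercept. The function name and data points are invented for the example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared deviations between observed and estimated y values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 7]
b0, b1 = least_squares_line(xs, ys)
print(b0, b1, sse(xs, ys, b0, b1))
```

Any other line through the same data gives a larger value of `sse` than the fitted one.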
Larger values of r² imply that the observations are more closely grouped about the
least squares line
In a regression model involving more than one independent variable, which of the following tests must be used in order to determine if the relationship between the dependent variable and the set of independent variables is significant?
F test
If the coefficient of determination is equal to 1, then the coefficient of correlation
can be -1 or +1
If the coefficient of determination is a positive value, then the regression equation
could have either a positive or a negative slope
If the coefficient of correlation is 0.8, the percentage of variation in the dependent variable explained by the variation in the independent variable is
64%
In regression analysis, if the dependent variable is measured in dollars, the independent variable
can be any units
If the coefficient of correlation is a positive value, then the slope of the regression line
must also be positive
If the coefficient of correlation is a negative value, then the coefficient of determination
must be positive
If all the points of a scatter diagram lie on the least squares regression line, then the coefficient of determination for these variables based on this data is
1
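The relationship between the coefficient of correlation r and the coefficient of determination r² in simple linear regression can be checked numerically. This is a small sketch with invented data; `correlation` is a hypothetical helper implementing the Pearson formula.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical points lying exactly on a downward-sloping line
xs = [1, 2, 3, 4]
ys = [8, 6, 4, 2]
r = correlation(xs, ys)
r_squared = r ** 2          # squaring removes the sign
print(r, r_squared)

# r = 0.8 corresponds to about 64% of variation explained
explained = 0.8 ** 2
print(explained)
```

Because r² is never negative, it shows the strength of the relationship but not its direction; the sign has to come from r or the slope.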
A multiple regression model has
more than one independent variable
Compared to the confidence interval estimate for a particular value of y (in a linear regression model), the interval estimate for an average value of y will be
narrower
A variable that cannot be measured in terms of how much or how many but instead is assigned values to represent categories is called
a qualitative variable
the error term is the difference between an individual value of the dependent variable and the corresponding mean value of the dependent variable (the mean of y at the given x)...t/f
true
the point estimate of the variance in a regression model is
MSE
what measures the strength of the linear relationship between the dependent and the independent variable
correlation coefficient
what is a violation of one of the major assumptions of the simple regression model?
as the value of x increases, the value of the error term also increases
the relationship between the dependent variable and the independent variable is stronger when the r^2 is __ and the s (standard error) is __
higher, lower
in a regression model, a value of the error term depends upon other values of the error term...t/f
false
when using a multiple regression model, we assume that error terms (residuals) are distributed according to
normal distribution
the multiple coefficient of determination is the __ divided by the total variation
explained variation
the residual is the difference between the observed value of the dependent variable and the predicted value of the dependent variable...t/f
true
in a simple linear regression model, the coefficient of determination not only indicates the strength of the relationship between independent and dependent variable, but also shows whether the relationship is positive or negative...t/f
false
when using simple linear regression, we would like to use confidence intervals for the __ and prediction intervals for the __ at a given value of x
mean y-value, individual y-value
in a simple linear regression analysis, the correlation coefficient and the slope always have the same sign...t/f
true
what are assumptions of the error terms in a simple linear regression model
errors are normally distributed, error terms have a mean of zero and have a constant variance
if all data points fell on a straight line, SSE would equal __ and r would equal __
0, -1 or 1
a t-test is used when testing the significance of an individual independent variable...t/f
true
if it is desired to include marital status in a multiple regression model by using the categories: single, married, separated, divorced, widowed, what will be the effect on the model?
four more independent variables will be included
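One way to see why five categories add four independent variables is to write out the dummy (indicator) coding. The sketch below assumes the first category, "single", is the baseline; that choice is arbitrary and the function name is hypothetical.

```python
CATEGORIES = ["single", "married", "separated", "divorced", "widowed"]

def dummy_code(value, categories=CATEGORIES):
    """Encode a category as k - 1 indicator variables (first category is baseline)."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    baseline, *others = categories
    return [1 if value == c else 0 for c in others]

for status in CATEGORIES:
    print(status, dummy_code(status))
```

The baseline category is represented by all zeros, so a fifth indicator would be redundant.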
A nonparametric method for determining the differences between two populations based on two matched samples where only preference data is required is the
sign test
Statistical methods that require assumptions about the population are known as
parametric
The Spearman rank-correlation coefficient is
a correlation measure based on rank-ordered data for two variables
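A short sketch of the Spearman rank-correlation coefficient, assuming no tied values so that the shortcut formula r_s = 1 − 6Σd² / (n(n² − 1)) applies. The helper names and data are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    d_sq = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_sq / (n * (n ** 2 - 1))

# Hypothetical paired data
xs = [10, 20, 30, 40, 50]
ys = [1, 3, 2, 5, 4]
r_s = spearman(xs, ys)
print(r_s)
```

Only the ordering of the values matters here, not their magnitudes.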
A parameter of the exponential smoothing model which provides the weight given to the most recent time series value in the calculation of the forecast value is known as the
smoothing constant
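A minimal sketch of exponential smoothing, assuming the common form F(t+1) = αY(t) + (1 − α)F(t) with the first forecast set equal to the first observation. That starting convention, the function name, and the data are assumptions for illustration.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts where forecasts[i] is the forecast for period i + 2.

    alpha is the smoothing constant: the weight on the most recent observation.
    """
    forecasts = [series[0]]  # assumed convention: forecast for period 2 is Y1
    for y in series[1:]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical time series
series = [10, 20, 30]
forecasts = exponential_smoothing(series, alpha=0.5)
print(forecasts)
```

Older observations still contribute through the previous forecast, but with weights that shrink geometrically, so the weights are not equal.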
A goodness of fit test is always conducted as an
upper tail test
A statistical test conducted to determine whether to reject or not reject a hypothesized probability distribution for a population is known as a
goodness of fit test
The time series component, which reflects a regular, multi-year pattern of being above and below the trend line is
cyclical
The time series component that reflects variability due to natural disasters is called
irregular
the smoothing constant is a number that determines how much weight is attached to each observation...t/f
true
the sampling distribution for a goodness of fit test is
the chi-square distribution
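The goodness of fit test statistic can be sketched as the sum of (observed − expected)² / expected over the categories. The counts and hypothesized probabilities below are made up, and the function name is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 100 observations, hypothesized probabilities 0.25, 0.5, 0.25
observed = [30, 50, 20]
probabilities = [0.25, 0.5, 0.25]
n = sum(observed)
expected = [p * n for p in probabilities]

stat = chi_square_statistic(observed, expected)
print(stat)

# When observed counts match expected counts exactly, the statistic is 0
print(chi_square_statistic(expected, expected))
```

The statistic is compared with the upper tail of a chi-square distribution, since only large values count as evidence against the hypothesized distribution.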
a restaurant has been experiencing higher sales during the weekends compared to the weekdays. Daily restaurant sales patterns for this restaurant over a week are an example of what component of a time series
seasonal
the time series component that reflects variability during a single year is called
seasonal
one use of the chi-square goodness of fit test is to determine if specified probabilities in the null hypothesis are correct...t/f
true
in a contingency table, when all the expected frequencies equal the observed frequencies the calculated chi-squared statistic equals 0...t/f
true
a group of observations measured at successive time intervals is known as
a time series
which nonparametric method requires that we carry out a paired difference experiment
Wilcoxon signed rank test
one measure of the accuracy of a forecasting model is
the mean square error
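As an illustration of mean square error as a forecast accuracy measure, the sketch below averages the squared differences between actual and forecast values. The numbers and the function name are invented.

```python
def mean_square_error(actual, forecast):
    """Average of squared forecast errors."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e ** 2 for e in errors) / len(errors)

# Hypothetical actual values and forecasts for the same periods
actual = [20, 30]
forecast = [10, 15]
mse = mean_square_error(actual, forecast)
print(mse)
```

A smaller value indicates forecasts that track the series more closely.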
when we carry out a chi-square test of independence, the expected frequencies are based on the null hypothesis...t/f
true
exponential smoothing is a forecasting method that applies equal weights to the time series observations...t/f
false
when deseasonalizing a time series observation, the actual time series observation is divided by its seasonal factor...t/f
true
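A small sketch of deseasonalizing, assuming a multiplicative model in which each observation is divided by the seasonal factor for its period. The quarterly sales figures and seasonal factors are hypothetical.

```python
def deseasonalize(observations, seasonal_factors):
    """Divide each observation by the seasonal factor for its period."""
    return [y / s for y, s in zip(observations, seasonal_factors)]

# Hypothetical quarterly sales and seasonal factors
sales = [120.0, 80.0, 100.0, 100.0]
factors = [1.2, 0.8, 1.0, 1.0]
adjusted = deseasonalize(sales, factors)
print(adjusted)
```

With the seasonal effect removed, the adjusted values here are all close to 100, which makes the underlying level easier to see.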
statistical methods that generally require very few, if any, assumptions about the population distribution are known as
nonparametric
the time series component that reflects gradual variability over a long time period is called
a trend
if data for a time series analysis is collected on an annual basis only, which component may be ignored
seasonal
when using the chi-square goodness of fit test, if the value of the chi-square statistic is large enough, we reject the null hypothesis...t/f
true
the level of measurement that is simply a label for the purpose of identifying an item is
nominal measurement
statistical methods that generally require the assumption that population distributions are normal are
parametric