
### 67 Cards in this Set

| Front | Back |
| --- | --- |
| Is a method of summarizing data. Frequency is how often something occurs. For a distribution, you define two or more equivalence classes and count the number of observations in each class. A table showing the equivalence classes and the frequency with which their score values occur is called a frequency distribution. | Frequency distribution |
| The frequency of an individual score value. | Ungrouped frequency distribution |
| Each class interval spans 2 or more score values. It has a nominal upper and lower limit. | Grouped frequency distribution |
| Extend 0.5 below the lower nominal limit and 0.5 above the upper nominal limit, and show that there are no gaps in a distribution. | Real limits |
| Real upper limit minus the real lower limit, for grouped and ungrouped frequency distributions. | Class interval size |
| Are distributions that show the proportion or percentage of the total number of scores. Proportionate frequency: Prop f = f/n. Percentage frequency: %f = (f/n) × 100. | Relative frequency distributions |
| Show the number, proportion or percentage of scores that occur below the upper limit of each class interval. | Cumulative frequency distributions |
| Is similar to a bar graph but is used for quantitative variables. It is constructed by erecting vertical bars over the real limits of each class interval, with the height of each bar corresponding to the number of scores in the interval. The bars of adjacent class intervals should touch to emphasize the continuous quantitative character of the class intervals. | Histogram |
| A way to show the information in a frequency table. It looks a little bit like a line graph. You plot a few points and then join the points with straight lines. To find the points, first find the midpoint of each class. The midpoint of a class is the point in the middle of the class. | Frequency polygon |
| A convenient way of graphically depicting groups of numerical data through their five-number summaries: the smallest observation (sample minimum), lower quartile (Q1), median (Q2), upper quartile (Q3), and largest observation (sample maximum). May also indicate which observations, if any, might be considered outliers. | Box-and-whisker plot |
| Another way to analyze the frequency distribution table. Unlike a frequency distribution, which tells you how many data points are within each class, a cumulative frequency tells you how many are less than or within each of the class limits. | Cumulative frequency distribution (ogive) |
| Resembles a histogram that has been turned on its side. It is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis to illustrate the major features of the distribution of the data in a convenient and easily drawn form. It is similar to a histogram but is usually a more informative display for relatively small data sets (fewer than 100 data points). It provides a table as well as a picture of the data, and from it we can readily write down the data in order of magnitude, which is useful for many statistical procedures. | Stem-and-leaf plot |
| Is the value of a variable below which a certain percent of observations fall. | Percentile |
| Part of statistics that organizes and summarizes data so that it can be more readily comprehended. | Descriptive statistics |
| Where the concept of statistical significance is based. You are trying to reach conclusions that extend beyond the immediate data alone. Try to infer from the sample data what the population might think. | Inferential statistics |
| Is some specific characteristic of a subject that can assume one or more different values. | Variable |
| A particular subject's relative standing on a quantitative variable, or a subject's classification within a classification variable. In some cases this is a score. Different states in which a variable occurs. | Value |
| The individual subjects/objects that serve as the source of the data. Usually this can be a person. | Observational unit |
| Is the process of assigning numbers or labels to characteristics of people, objects, or events according to a set of rules. | Measurement |
| Is a classification system that places people, objects or other entities into mutually exclusive categories. Not meaningfully ordered. | Nominal |
| Represents the rank order of the subjects with respect to the variable that is being assessed. The characteristic is used to order individuals. | Ordinal |
| Equal distances between scale values have equal quantitative meaning. This scale does NOT have a true zero point. The characteristic is used to order individuals, and the distances between numbers are equal. | Interval |
| Equal distances between scale values have equal quantitative meaning, and the scale has a true zero point. The characteristic is used to order individuals, the distances between numbers are equal, and the 0 is meaningful. | Ratio |
| Range consists of an uncountably infinite number of values. Characteristics on which individuals can theoretically take on any value between the lowest and highest points on a scale. | Continuous |
| Range consists of only a finite number of values or an infinite number of values that can be counted. Characteristics on which individuals can take on a limited number of values. | Discrete |
| Information based on characteristics of the entities studied. | Data |
| A characteristic which is the same for all members studied. | Constant |
| A characteristic which takes on a different value for different individuals studied. | Variable |
| The score value on which a distribution centers, often called the average. Mode, mean and median are all measures of central tendency. | Central tendency |
| The sum of the scores divided by the number of scores, commonly known as the average. | Mean (μ, x̄) |
| The middle score when scores have been arranged in order of size. If the number of scores is even, the median is the midway point between the two center scores. | Median (Md) |
| The score or qualitative category that occurs with the greatest frequency. | Mode (Mo) |
| The spread or scatter of scores around the central point, expressed in terms of distance along a distribution's X axis. | Variability/Dispersion |
| The distance between the largest and smallest number. | Range |
| One half of the distance between the first quartile point and the third quartile point. The semi-interquartile range is a measure of spread or dispersion. It is computed as one half the difference between the 75th percentile (often called Q3) and the 25th percentile (Q1). The formula is therefore (Q3 - Q1)/2. In a normal distribution, the interval of one Q on either side of the median contains half the scores. | Semi-interquartile range (Q) |
| Represents the sum of squared differences from the mean and is an extremely important term in statistics. | Sum of squares (SS) |
| Is used as a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). In particular, it is one of the moments of a distribution. In that context, it forms part of a systematic approach to distinguishing between probability distributions. | Variance (σ², s²) |
| Shows how much variation or "dispersion" there is from the average (mean, or expected value). | Standard Deviation (σ, s) |
| A symmetrical curve representing the normal distribution. In statistics, the theoretical curve that shows how often an experiment will produce a particular result. The curve is symmetrical and bell shaped, showing that trials will usually give a result near the average, but will occasionally deviate by large amounts. The width of the "bell" indicates how much confidence one can have in the result of an experiment: the narrower the bell, the higher the confidence. | Normal curve |
| Is a measure of the asymmetry of the probability distribution of a real-valued random variable. The value can be positive or negative, or even undefined. Qualitatively, a negative skew indicates that the tail on the left side of the probability density function is longer than the right side and the bulk of the values (possibly including the median) lie to the right of the mean. A positive skew indicates that the tail on the right side is longer than the left side and the bulk of the values lie to the left of the mean. A zero value indicates that the values are relatively evenly distributed on both sides of the mean, typically but not necessarily implying a symmetric distribution. | Skewness |
| Is a measure of the "peakedness" of the probability distribution of a real-valued random variable, although some sources insist that heavy tails, and not peakedness, are what is really being measured. A higher value means more of the variance is the result of infrequent extreme deviations, as opposed to frequent modestly sized deviations. A high-kurtosis distribution has a sharper peak and longer, fatter tails, while a low-kurtosis distribution has a more rounded peak and shorter, thinner tails. | Kurtosis |
| Distributions with zero excess kurtosis, like the normal distribution. | Mesokurtic, or mesokurtotic |
| A distribution with positive excess kurtosis. | Leptokurtic, or leptokurtotic. "Lepto-" means "slender". |
| A distribution with negative excess kurtosis: a lower, wider peak around the mean. | Platykurtic, or platykurtotic. "Platy-" means "broad". |
| Scores that have been standardized to have a mean of zero and an SD of 1. It indicates how many standard deviations a raw score is above or below the mean. | Z-score |
| An observation that is numerically distant from the rest of the data. | Outlier |
| Restricting (not using) outliers. | Restriction of range |
| Non-linear. | Curvilinearity |
| Is used to describe the degree of agreement between paired data that are in the form of ranks. | Spearman Rank Correlation ($r_s$) |
| The coefficient that measures the linear relationship between two variables, X and Y. Denoted by $r_{xy}$, or by ρ for the population. | Pearson Product-Moment Correlation |
| Is a measure of how much two variables change together. | Covariance ($\sigma_{xy}$, $s_{xy}$) |
| The product of the two deviations in calculating Pearson's r: (deviation for each X) × (deviation for each Y). | Cross-product |
| The representation of the joint frequency of two variables. | Bivariate plot |
| The number that summarizes the nature of the relationship between two variables. | Correlation Coefficient |
| The relationship of two variables. | Correlation |
| Smallest possible sum of squared residuals. | Least squares criterion |
| This line gives us the smallest sum of squared residuals. | Line of best fit |
| Indicates the change in Y for a one-unit change in X. | Slope |
| The difference between observed and predicted values. The error (e) or unexplainable part of Y. | Residual |
| Linear relationship between an independent variable and a dependent variable. | Regression equation |
| A point where the graph of a function (line) intersects with the y-axis. | Y-intercept |
| The value predicted based on the regression equation. | Y' |
| Is regression SS plus residual SS. | SS total |
| Sum of squares of the differences between the values of Y predicted by the regression equation and the actual values of Y. | Residual SS |
| Total SS - Residual SS. | Regression SS |
| The proportion of the variability in the outcomes that is accounted for by the predictor variable. | Coefficient of Determination ($R^2$) |
| The proportion of the variability in the outcomes that is not explained by the predictor variable. | Coefficient of Alienation ($1 - r^2$) |
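
Several of the cards above (sum of squares, variance, z-score, cross-product, slope, y-intercept, residual SS, regression SS, coefficient of determination, coefficient of alienation) describe quantities that build on one another. The following is a minimal Python sketch of how they might fit together, not part of the original card set. The function names and the small `hours`/`scores` data set are made up for illustration, and the sample (n - 1) form of the variance is an assumption, since the cards list both σ² and s².

```python
from math import sqrt


def mean(xs):
    """Sum of the scores divided by the number of scores."""
    return sum(xs) / len(xs)


def sum_of_squares(xs):
    """SS: sum of squared differences from the mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def sample_variance(xs):
    """s^2 = SS / (n - 1). Assumption: sample form rather than population form."""
    return sum_of_squares(xs) / (len(xs) - 1)


def z_scores(xs):
    """How many standard deviations each raw score lies above or below the mean."""
    m = mean(xs)
    sd = sqrt(sample_variance(xs))
    return [(x - m) / sd for x in xs]


def regression_summary(xs, ys):
    """Least-squares line of Y on X plus the SS decomposition from the cards."""
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products: (deviation of each X) * (deviation of each Y).
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sp / sum_of_squares(xs)      # change in Y per one-unit change in X
    intercept = my - slope * mx          # where the line crosses the y-axis
    predicted = [intercept + slope * x for x in xs]          # Y'
    ss_total = sum_of_squares(ys)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_regression = ss_total - ss_residual
    r = sp / sqrt(sum_of_squares(xs) * ss_total)             # Pearson r
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "ss_total": ss_total,
        "ss_residual": ss_residual,
        "ss_regression": ss_regression,
        "r_squared": ss_regression / ss_total,   # coefficient of determination
        "alienation": ss_residual / ss_total,    # coefficient of alienation, 1 - r^2
    }


# Hypothetical data: hours studied and quiz scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
summary = regression_summary(hours, scores)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a simple one-predictor regression like this, the coefficient of determination equals the square of Pearson's r, and the two proportions (explained and unexplained) add up to 1.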