36 Cards in this Set

  • Front
  • Back
Characteristics of Data
Center, Variation, Distribution, Outliers, Time

Center
a representative value that indicates where the middle of the data set is located (the average or mean or median)
Variation
a measure of the amount that the data values vary
distribution
the nature or shape of the spread of the data over the range of values (such as a bell shape)
outliers
sample values that lie very far away from the vast majority of the other samples
time
any change in the characteristics of the data over time
a frequency distribution/frequency table
shows how data are partitioned among several categories (or classes) by listing the categories along with the number (frequency) of data values in each of them
lower class limits
are the smallest numbers that can belong to the different classes (15,19,23,27,31)
upper class limits
are the largest numbers that can belong to the different classes (18,22,26,30,34)
class boundaries
are the numbers used to separate the classes, but without the gaps created by class limits (14.5,18.5,22.5,26.5,30.5,34.5)
class midpoints
are the values in the middle of the classes (16.5,20.5,24.5,28.5,32.5)
class width
is the difference between two consecutive lower class limits (or two consecutive lower class boundaries) (19-15) = 4 (23-19) = 4
NOTE: when building a table, estimate the class width by subtracting the minimum data value from the maximum data value, dividing by the number of classes, and rounding up. (Use at least 5 classes.)
CLASS WIDTH = (34 - 15)/5 = 19/5 = 3.8, which rounds up to 4
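The class numbers on the cards above can be reproduced with a few lines of Python. This is only a sketch: it assumes whole-number data (so each upper limit is one unit below the next lower limit), and the variable names are made up for illustration.

```python
import math

# Values taken from the cards: minimum 15, maximum 34, 5 classes.
data_min, data_max, num_classes = 15, 34, 5

# Estimated width is 3.8; rounding up gives the width of 4 used on the cards.
class_width = math.ceil((data_max - data_min) / num_classes)

lower_limits = [data_min + i * class_width for i in range(num_classes)]
# Assumption: integer data, so an upper limit is 1 below the next lower limit.
upper_limits = [low + class_width - 1 for low in lower_limits]
# Boundaries sit halfway across the gap between neighboring classes.
boundaries = [low - 0.5 for low in lower_limits] + [upper_limits[-1] + 0.5]
midpoints = [(low + high) / 2 for low, high in zip(lower_limits, upper_limits)]

print(class_width)   # 4
print(lower_limits)  # [15, 19, 23, 27, 31]
print(upper_limits)  # [18, 22, 26, 30, 34]
print(boundaries)    # [14.5, 18.5, 22.5, 26.5, 30.5, 34.5]
print(midpoints)     # [16.5, 20.5, 24.5, 28.5, 32.5]
```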
Relative frequency distribution
is a frequency distribution in which each class frequency is replaced by a relative frequency (a decimal proportion or a percent)
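As a rough illustration, each relative frequency is the class frequency divided by the total of all frequencies. The frequencies below are hypothetical (the cards give none); only the class labels follow the example limits.

```python
# Hypothetical class frequencies for the classes 15-18, 19-22, 23-26, 27-30, 31-34.
classes = ["15-18", "19-22", "23-26", "27-30", "31-34"]
frequencies = [2, 5, 8, 4, 1]

total = sum(frequencies)
# Relative frequency = class frequency / sum of all frequencies.
relative = [f / total for f in frequencies]

for label, rel in zip(classes, relative):
    # Show each value as a decimal and as a percent.
    print(f"{label}: {rel:.2f} ({rel:.0%})")
```

The relative frequencies always add up to 1 (or 100%), apart from small rounding differences.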
Normal Distribution
1) The data frequencies start low, then increase to one or two high frequencies, and then decrease to a low frequency
2) The distribution is approximately symmetric, with frequencies preceding the maximum being roughly a mirror image of those that follow the maximum
Histogram
is a graph consisting of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond to the frequency values
NOTE: the horizontal scale can be labeled with lower class limits, class boundaries, or class midpoints
Scatterplot
is a plot of paired (x,y) quantitative data with a horizontal x-axis and a vertical y-axis (used to determine if there is a correlation between the x and the y data values)
Time-series graph
is a graph of time-series data, which are quantitative data that have been collected at different points in time, such as monthly or yearly (use when measurements are taken over a period of time)
a dot plot
a dot plot consists of a graph in which each data value is plotted as a point (or dot) along a horizontal scale of values. Dots representing equal values are stacked
stemplot
Bar Graph
uses bars of equal width to show frequencies of categories of categorical data
Pareto chart
is a bar graph in which the bars are arranged in descending order of frequency (use when you want to draw attention to the most frequent categories)
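The ordering step behind a Pareto chart can be sketched as a sort by frequency. The category names and counts below are hypothetical, chosen only to show the idea.

```python
# Hypothetical category counts (not from the cards).
counts = {"late": 12, "damaged": 30, "wrong item": 7, "other": 3}

# Sort categories from the highest frequency to the lowest,
# which is the left-to-right bar order of a Pareto chart.
pareto_order = sorted(counts.items(), key=lambda item: item[1], reverse=True)

for category, frequency in pareto_order:
    print(category, frequency)
```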
pie chart
(TRY NOT TO USE PIE CHARTS BECAUSE THEY DO NOT SHOW DIFFERENCES BETWEEN FREQUENCIES ACCURATELY) is a graph that depicts categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category
Frequency polygon
uses line segments connected to points located directly above class midpoint values. A frequency polygon is very similar to a histogram
Ogive
depicts cumulative frequencies
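Cumulative frequencies are running totals of the class frequencies, so the last one equals the number of data values. A minimal sketch, using hypothetical frequencies:

```python
from itertools import accumulate

# Hypothetical class frequencies (not from the cards).
frequencies = [2, 5, 8, 4, 1]

# Each cumulative frequency is the sum of this class and all earlier classes.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [2, 7, 15, 19, 20]
```

An ogive plots these running totals, so its line never goes down from left to right.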
Graphs that deceive
Graphs with a nonzero axis, pictographs
CVDOT
Center (median, mode, midrange), Variation (how much the data spread), Distribution (normal or skewed: right-skewed or left-skewed), Outliers (values far away from the rest), Time
measure of center
a measure of center is a value at the center or middle of a data set.
Mean/arithmetic mean
of a data set is the measure of center found by adding the data values and dividing the total by the number of data values. Mean = (sum of all data values) / (number of data values)
the median
of a data set is the measure of center that is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude. ODD NUMBER OF VALUES = THE MIDDLE VALUE; EVEN NUMBER OF VALUES = ADD THE TWO MIDDLE VALUES AND DIVIDE BY 2
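The mean and median rules above can be written out directly. This is a small sketch with made-up data; the function names are illustrative, and Python's built-in statistics module offers equivalent mean and median functions.

```python
def mean(values):
    # Sum of all data values divided by the number of data values.
    return sum(values) / len(values)

def median(values):
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd count: the middle value
    # Even count: add the two middle values and divide by 2.
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_data = [2, 5, 9, 3, 6]    # hypothetical values
even_data = [2, 5, 9, 3]      # hypothetical values

print(mean(odd_data))     # 5.0
print(median(odd_data))   # 5
print(median(even_data))  # 4.0
```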
mode
the mode of a data set is the value that occurs with the greatest frequency (two modes = bimodal, more than two modes = multimodal, no value repeated = no mode)
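One possible way to find the mode, sketched with made-up data: count each value, then keep the values tied for the highest count. Returning an empty list for "no mode" is a choice made for this example, and the function name is illustrative.

```python
from collections import Counter

def modes(values):
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    # Keep every value tied for the greatest frequency.
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3]))     # [2]      one mode
print(modes([1, 1, 2, 2, 3]))  # [1, 2]   bimodal
print(modes([1, 2, 3]))        # []       no mode
```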
midrange
the midrange of a data set is found by adding the maximum data value to the minimum data value and then dividing the sum by 2 (NOTE: the range = max value - min value)
Range
the range of a set of data values is the difference between the maximum data value and the minimum data value
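Midrange and range both use only the maximum and minimum values, which a short sketch with made-up data makes easy to see:

```python
data = [2, 5, 9, 3, 6]  # hypothetical values

highest, lowest = max(data), min(data)

midrange = (highest + lowest) / 2  # halfway between the extremes
data_range = highest - lowest      # distance between the extremes

print(midrange)    # 5.5
print(data_range)  # 7
```

Because only the two extreme values are used, a single outlier changes both results.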
The standard deviation of a set of sample values
denoted by s, is a measure of how much data values deviate away from the mean.
variance
of a set of values is a measurement of variation and is equal to the square of the standard deviation
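The cards give no formula for s, so the sketch below assumes the usual sample formula, which divides the sum of squared deviations by n - 1. The data values are made up.

```python
import math

data = [2, 5, 9, 3, 6]  # hypothetical sample values

n = len(data)
mean = sum(data) / n

# Assumption: sample formula, dividing by n - 1 rather than n.
squared_deviations = [(x - mean) ** 2 for x in data]
variance = sum(squared_deviations) / (n - 1)

# The standard deviation is the square root of the variance,
# so the variance is the square of the standard deviation.
s = math.sqrt(variance)

print(variance)     # 7.5
print(round(s, 4))  # 2.7386
```

Python's statistics.variance and statistics.stdev use the same n - 1 formula and can be used to check the result.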
COEFFICIENT OF VARIATION
for a set of data is expressed as a percent, describes the standard deviation relative to the mean, and for a sample is given by CV = (s / mean) x 100%
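A short sketch of the coefficient of variation for a sample, assuming the usual form of standard deviation divided by mean, times 100. The data values are made up.

```python
import statistics

data = [2, 5, 9, 3, 6]  # hypothetical sample values

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (n - 1 formula)

# Standard deviation relative to the mean, expressed as a percent.
cv = s / mean * 100

print(f"{cv:.1f}%")  # 54.8%
```

Because the result is a percent with no units, it can be used to compare the variation of data sets measured on different scales.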