Statistics Vocabulary Ch. 1-3

94 Terms

Data

Collections of observations, such as measurements, genders, or survey responses

Statistics

The science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data

Population

the complete collection of all measurements or data that are being considered

Census

the collection of data from every member of the population

Sample

Subcollection of members selected from a population

Voluntary Response Sample

one in which the respondents themselves decide whether to be included

Parameter

a numerical measurement describing some characteristic of a population

Statistic

a numerical measurement describing some characteristic of a sample

Quantitative Data

Data consisting of numbers representing counts or measurements

Qualitative (or Categorical) Data

Data consisting of names or labels (not numbers that represent counts or measurements)

Discrete Data

result when the data values are quantitative and the number of values is finite or "countable"

Continuous Data

result from infinitely many possible quantitative values, where the collection of values is not countable

Nominal Level of Measurement

characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high)

Ordinal Level of Measurement

data that can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless

Interval Level of Measurement

Data that can be arranged in order, and differences between data values can be found and are meaningful. Data at the _____ Level does NOT have a natural zero starting point at which none of the quantity is present.

Ratio Level of Measurement

Data that can be arranged in order, differences can be found and are meaningful, and there IS a natural zero starting point

Big Data

Data sets that are so large and so complex that their analysis is beyond the capabilities of traditional software tools. Analysis of _____ may require software running simultaneously in parallel on many different computers

Data Science

Involves applications of statistics, computer science, and software engineering, along with some other relevant fields (such as sociology or finance).

Missing Completely at Random

A data value is missing completely at random if the likelihood of its being missing is independent of its value or any of the other values in the data set. That is, any data value is just as likely to be missing as any other data value.

Missing Not at Random

A data value is missing not at random if the missing value is related to the reason that it is missing.

Placebo

A harmless and ineffective pill, medicine, or procedure sometimes used for psychological benefit or sometimes used by researchers for comparison to other treatments

Experiment

in an experiment, we apply some treatment and then proceed to observe its effects on the individuals. (These individuals are referred to as experimental units, and they are often called subjects when they are people.)

Observational Study

we observe and measure specific characteristics, but we don't attempt to modify the individuals being studied

Replication

Repetition of an experiment on more than one individual

Blinding

Used when the subject doesn't know whether he or she is receiving a treatment or a placebo

Placebo Effect

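Occurs when an untreated subject reports an improvement in symptoms. (Not to be confused with randomization, which is used when individuals are assigned to different groups through a process of random selection.)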

Double Blinding

the act of blinding both the subjects of an experiment and the researchers who work with the subjects.

Confounding

occurs when we can see some effect, but we cannot identify the specific factor that caused it.

Simple Random Sample

A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected.

Random Sample

has a weaker requirement (as compared to a simple random sample) that all members of the population have the same chance of being selected

Systematic Sampling

we select some starting point and then select every kth (such as every 50th) element in the population

Convenience Sampling

we simply use data that are very easy to get

Stratified Sampling

we subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics (such as gender). Then we draw a sample from each subgroup

Cluster Sampling

we first divide the population area into sections (or clusters). Then we randomly select some of those clusters and choose all the members from those selected clusters.
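
The simple random, systematic, stratified, and cluster sampling cards can be compared side by side with a minimal sketch. It assumes a made-up population of 100 ID numbers; the function names, strata, and clusters are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, n, rng):
    # Every possible sample of size n is equally likely to be chosen.
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    # Pick a starting index, then take every k-th element.
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    # Draw a separate random sample from each subgroup (stratum).
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters, then keep every member of each one.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of ID numbers

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, k=10, start=4)
strata = {"low": population[:50], "high": population[50:]}
stratified = stratified_sample(strata, 5, rng)
clusters = {f"block{i}": population[i * 20:(i + 1) * 20] for i in range(5)}
clustered = cluster_sample(clusters, 2, rng)
```

Note the contrast between the last two functions: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from some subgroups.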

Cross-Sectional Study

data are observed, measured, and collected at one point in time

Retrospective Study

data are collected from a past time period by going back in time (through examination of records, interviews, and so on)

Prospective (or Longitudinal) Study

data are collected in the future from groups that share common factors

Sampling Error

occurs when the sample has been selected with a random method, but there is a discrepancy between a sample result and the true population result; such an error results from chance sample fluctuations

Non-Sampling Error

the result of human error, including such factors as wrong data entries, computing errors, questions with biased wording, false data provided by respondents, forming biased conclusions, or applying statistical methods that are not appropriate for the circumstances

Nonrandom Sampling Error

the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample

Statistically significant result

one that is very unlikely to occur by chance

Lower Class Limit

The smallest number that can belong to a class (the beginning value of the class).

Upper Class Limit

The largest number that can belong to a class (the ending value of the class).

Class Boundaries

the numbers used to separate the classes, but without the gaps created by class limits. (Example: for the consecutive classes 10-19 and 20-29, the boundary between them is 19.5; the class 10-19 has boundaries 9.5 and 19.5.)

Class Midpoint

the values in the middle of the classes. Midpoint = (Lower Class Limit + Upper Class Limit) / 2

Class Width

the difference between two consecutive lower class limits
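
The class limit, boundary, midpoint, and width cards can be checked with a short worked example. This is only a sketch: the classes 10-19, 20-29, 30-39, and 40-49 are assumed for illustration, extending the 10-19 example used on the class boundaries card.

```python
# Hypothetical classes: 10-19, 20-29, 30-39, 40-49
lower_limits = [10, 20, 30, 40]
upper_limits = [19, 29, 39, 49]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoint: (lower limit + upper limit) / 2 for each class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class boundaries: split the gap between one class's upper limit and the
# next class's lower limit, then extend the same half-gap at both ends.
gap = lower_limits[1] - upper_limits[0]
boundaries = [lower_limits[0] - gap / 2] + [hi + gap / 2 for hi in upper_limits]

print(class_width)  # 10
print(midpoints)    # [14.5, 24.5, 34.5, 44.5]
print(boundaries)   # [9.5, 19.5, 29.5, 39.5, 49.5]
```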

Frequency Table (Distribution)

shows how data are partitioned among several categories (or classes) by listing the categories along with the number (frequency) of data values in each of them.

Relative Frequency Distribution

A variation of the basic frequency distribution in which each class frequency is replaced by a relative frequency (a proportion) or a percentage. Relative frequency = class frequency / sum of all frequencies.

Cumulative Frequency Distribution

A variation of the basic frequency distribution, in which the frequency for each class is the sum of the frequencies for that class and all previous classes.
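
To make the frequency, relative frequency, and cumulative frequency distributions concrete, here is a small sketch. The ten data values and the four classes are invented for this example.

```python
from itertools import accumulate

data = [12, 15, 21, 24, 27, 33, 35, 36, 38, 44]      # hypothetical values
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]   # (lower, upper) limits

# Frequency: how many data values fall inside each class.
frequencies = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

# Relative frequency: class frequency divided by the sum of all frequencies.
total = sum(frequencies)
relative = [f / total for f in frequencies]

# Cumulative frequency: running total for the class and all previous classes.
cumulative = list(accumulate(frequencies))

for (lo, hi), f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{lo}-{hi}: frequency={f}, relative={r:.0%}, cumulative={c}")
```

The last cumulative frequency always equals the total number of data values, which is a quick check that no value was missed.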

Histogram

A graph that shows the frequency distribution of one quantitative variable using bars of equal width that touch each other. Each bar spans the boundaries of its class, and its height is the class frequency.

Relative Frequency Histogram

A histogram whose vertical scale shows relative frequencies (proportions or percentages) instead of actual frequencies (counts)

Normal Distribution

a distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean. (Bell Shaped)

Skewed Right Distribution

a distribution that is not symmetrical and extends to one side more than to the other. The tail is on the right side

Skewed Left Distribution

a distribution that is not symmetrical and extends to one side more than to the other. The tail is on the left side.

Uniform Distribution

a type of distribution in which all different possible values occur with approximately the same frequency, so the heights of the bars in the histogram are approximately uniform

Dotplot

a graph of quantitative data in which each data value is plotted as a point (or dot) above a horizontal scale of values. Dots representing equal values are stacked.

Stem-and-Leaf Plot

represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit, the tens) and the leaf (such as the rightmost digit, the ones). The original data values can be reconstructed from the plot.
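
A minimal sketch of a stem-and-leaf plot follows. It assumes non-negative two-digit whole numbers, with the tens digit as the stem and the ones digit as the leaf; the data values and the function name are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    # Split each value into a stem (tens digit) and a leaf (ones digit).
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

data = [12, 15, 21, 24, 27, 33, 35, 36, 38, 44]  # hypothetical values
plot = stem_and_leaf(data)

for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

# The original values can be rebuilt from the stems and leaves.
rebuilt = [stem * 10 + leaf for stem, leaves in plot.items() for leaf in leaves]
```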

Time-Series Graph

a graph of time-series data, which are quantitative data that have been collected at different points in time, such as monthly or yearly.

Bar Graph

uses bars of equal width to show frequencies of categories of categorical (or qualitative) data. Typically has spaces between bars

Pareto Chart

a bar graph for categorical data, with the added stipulation that the bars are arranged in descending order according to frequencies, so the bars decrease in height from left to right. (NO spaces between bars)

Pie Chart

a very common graph that depicts categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category.

Frequency Polygon

uses line segments connected to points located directly above class midpoint values. A frequency polygon is very similar to a histogram, but a frequency polygon uses line segments instead of bars.

Relative Frequency Polygon

uses line segments connected to points located directly above class midpoint values but uses relative frequencies (proportions or percentages) for the vertical scale instead.

Pictographs

Drawings of objects. Data that are one-dimensional in nature (such as budget amounts) are often depicted with two-dimensional objects (such as dollar bills) or three-dimensional objects (such as stacks of dollar bills). By using pictographs, artists can create false impressions that grossly distort differences, because of simple principles of basic geometry: doubling each side of a square multiplies its area by four, and doubling each side of a cube multiplies its volume by eight.

Correlation

a relationship that exists between two variables when the values of one variable are somehow associated with the values of the other variable.

Linear Correlation

exists between two variables when there is a correlation and the plotted points of paired data result in a pattern that can be approximated by a straight line.

Scatter Plot

is a plot of paired (x, y) quantitative data with a horizontal x-axis and a vertical y-axis. The horizontal axis is used for the first variable (x), and the vertical axis is used for the second variable (y).

Linear Correlation Coefficient

is denoted by r, and it measures the strength of the linear association between two variables.

P-Value

is the probability, assuming there really is no linear correlation between the two variables, of getting paired sample data with a linear correlation coefficient r that is at least as extreme as the one obtained from the paired sample data.

Regression Line

is the straight line that "best" fits the scatterplot of the data.
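
As a rough illustration of the linear correlation coefficient r and the regression line, the sketch below computes both from the usual least-squares formulas, where "best" fit means the smallest sum of squared vertical distances. The paired data and the function names are made up for this example.

```python
from math import sqrt

def _sums(xs, ys):
    # Shared building blocks: the means and the sums of squared deviations.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy

def linear_correlation(xs, ys):
    # r is always between -1 and 1; values near 0 suggest no linear pattern.
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)

def regression_line(xs, ys):
    # Least-squares slope and intercept of the line y = intercept + slope * x.
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]   # hypothetical paired (x, y) data
ys = [2, 4, 5, 4, 5]

r = linear_correlation(xs, ys)
slope, intercept = regression_line(xs, ys)
print(f"r = {r:.4f}, line: y = {intercept:.1f} + {slope:.1f}x")
```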

Descriptive Statistics

summarize or describe relevant characteristics of data

Inferential Statistics

used to make inferences or generalizations about a population

Measure of Center

a value at the center or middle of a data set. Common measures of center are the mean, median, mode, and midrange

Mean (or Arithmetic Mean)

of a set of data is the measure of center found by adding all of the data values and dividing the total by the number of data values. Also known as the average

Resistant

A statistic is resistant if the presence of extreme values (outliers) does not cause it to change very much

Median

of a data set is the measure of center that is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude.

Mode

of a data set is the value(s) that occur(s) with the greatest frequency.

Bimodal

When two data values occur with the same greatest frequency, each one is a mode

Multimodal

When more than two data values occur with the same greatest frequency, each is a mode

No mode

When no data value is repeated

Midrange

of a data set is the measure of center that is the value midway between the maximum and minimum values in the original data set. It is found by adding the maximum data value to the minimum data value and then dividing the sum by 2
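
The four measures of center, along with the idea of a resistant statistic, can be seen in a short sketch. The data values are invented, and the helper names `modes` and `midrange` are hypothetical; `modes` returns an empty list when no value repeats, to match the "No mode" card.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    # Return every value that occurs with the greatest frequency.
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no value is repeated, so there is no mode
    return [value for value, count in counts.items() if count == top]

def midrange(values):
    # Midway between the minimum and maximum values.
    return (min(values) + max(values)) / 2

data = [2, 3, 3, 5, 7, 10]                # hypothetical data set

mean_value = mean(data)                   # 30 / 6 = 5
median_value = median(data)               # middle two values are 3 and 5
mode_values = modes(data)                 # 3 occurs twice
midrange_value = midrange(data)           # (2 + 10) / 2

# Adding one outlier moves the mean a lot but the median only a little,
# which is why the median is resistant and the mean is not.
with_outlier = data + [100]
mean_with_outlier = mean(with_outlier)
median_with_outlier = median(with_outlier)

bimodal_example = modes([1, 1, 2, 2, 3])  # two modes
no_mode_example = modes([1, 2, 3])        # no mode
```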

Variation

Describes the spread of data by finding values of range, variance, and standard deviation

Range

of a set of data values is the difference between the maximum data value and the minimum data value.

Standard Deviation

a measure of how much data values deviate away from the mean. It is denoted by s for a sample and by σ for a population.

Biased Estimator

The sample standard deviation s is a biased estimator of the population standard deviation σ, which means that values of s do not tend to center around the value of σ.

Unbiased Estimator

The sample variance s^2 is an unbiased estimator of the population variance σ^2, which means that values of s^2 tend to center around the value of σ^2 instead of systematically tending to overestimate or underestimate σ^2
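
The difference between a biased and an unbiased estimator can be checked by brute force on a tiny population. The sketch below assumes a made-up population of three values and lists every possible sample of size 2 drawn with replacement; it is an illustration, not a proof.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

population = [4, 5, 9]                 # hypothetical tiny population

sigma = pstdev(population)             # population standard deviation
sigma_sq = pvariance(population)       # population variance

# All 9 equally likely samples of size 2, selected with replacement.
samples = list(product(population, repeat=2))

# Average of the sample variances and of the sample standard deviations.
mean_of_s_sq = mean([variance(sample) for sample in samples])
mean_of_s = mean([stdev(sample) for sample in samples])

print(f"sigma^2 = {sigma_sq:.4f}, mean of s^2 = {mean_of_s_sq:.4f}")
print(f"sigma   = {sigma:.4f}, mean of s   = {mean_of_s:.4f}")
```

In this example the sample variances average out to the population variance, while the sample standard deviations average out to less than σ.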

Range Rule of Thumb

Subtract the smallest value in a dataset from the largest and divide the result by four to estimate the standard deviation.

Variance

of a set of values is a measure of variation equal to the square of the standard deviation.
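
The measures of variation can be computed with Python's standard `statistics` module. This is a sketch on invented data, treated as a sample, so `stdev` and `variance` divide by n - 1.

```python
from statistics import stdev, variance

data = [2, 3, 3, 5, 7, 10]            # hypothetical sample

data_range = max(data) - min(data)    # maximum value minus minimum value
s = stdev(data)                       # sample standard deviation
s_squared = variance(data)            # sample variance

# Range rule of thumb: a crude estimate of the standard deviation.
rule_of_thumb_estimate = data_range / 4

print(f"range = {data_range}")
print(f"s = {s:.4f}, s^2 = {s_squared:.4f}")
print(f"range rule of thumb estimate of s = {rule_of_thumb_estimate}")
```

For this small sample the rule of thumb gives 2.0 while s is about 3.03, a reminder that the rule is only a rough estimate.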

Coefficient of Variation (or CV)

for a set of nonnegative sample or population data, expressed as a percent, describes the standard deviation relative to the mean

Z-Score (or standard score or standardized value)

is the number of standard deviations that a given value x is above or below the mean
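
Both the coefficient of variation and the z-score compare a standard deviation with a mean, so one short sketch can show them together. The data are invented and treated as a sample; the function names are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    # Standard deviation relative to the mean, expressed as a percent.
    return stdev(values) / mean(values) * 100

def z_score(x, values):
    # Number of standard deviations that x lies above (+) or below (-) the mean.
    return (x - mean(values)) / stdev(values)

data = [2, 3, 3, 5, 7, 10]        # hypothetical sample

cv = coefficient_of_variation(data)
z_of_max = z_score(10, data)      # the largest value lies above the mean
z_of_min = z_score(2, data)       # the smallest value lies below the mean

print(f"CV = {cv:.1f}%")
print(f"z for 10 = {z_of_max:.2f}, z for 2 = {z_of_min:.2f}")
```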

Percentiles

are measures of location, denoted P1, P2, ..., P99, which divide a set of data into 100 groups with about 1% of the values in each group

Quartiles

are measures of location, denoted Q1, Q2, and Q3, which divide a set of data into four groups with about 25% of the values in each group.

Boxplot (or box-and-whisker diagram)

is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile Q1, the median, and the third quartile Q3
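
The values that a boxplot draws (minimum, Q1, median, Q3, maximum) can be sketched with `statistics.quantiles`. Textbooks and software use slightly different rules for locating quartiles and percentiles, so the default method of that function is an assumption here and may not match a particular textbook's procedure on other data. The data values are invented.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]   # hypothetical sorted data

# n=4 returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = quantiles(data, n=4)

# The five values a boxplot is built from.
five_number_summary = (min(data), q1, q2, q3, max(data))

# n=100 returns the 99 percentile cut points P1 through P99.
percentiles = quantiles(data, n=100)
p50 = percentiles[49]                        # P50 is the median, same as Q2

print(five_number_summary)
```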

Skewed

A distribution of data is skewed if it is not symmetric and extends more to one side than to the other.
