GB Exam 2

Tags and Description

UW Madison Gen Bus 307 - Spring 2024: Exam 2

50 Terms

1

What is supervised learning

Groups together methods that attempt to learn about distributions where the variables can be split into two categories: X (explanatory variables) and Y (outcomes).

2

What are X variables

explanatory variables, predictors, regressors, independent

3

What are Y variables

outcomes, response variables, labels, dependent

4

what is a fitted regression equation

quantifies a linear relationship between two variables, y= intercept + slope * X

5

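Log likelihood: equation? When is it used? Higher or lower?

Higher is better. Used when Y is discrete. Equation: $\log P(Y_{\text{test}} \mid X_{\text{test}})$, the log of the probability the model assigns to the test outcomes given the test inputs.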
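
As a minimal sketch of the log likelihood for a binary Y, the snippet below sums log P(Y | X) over a test set. The function name, the outcomes, and the predicted probabilities are all hypothetical and only illustrate why higher is better.

```python
import math

def bernoulli_log_likelihood(y_true, p_pred):
    """Sum of log P(Y_i | X_i) for binary outcomes, given predicted P(Y_i = 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Probability the model assigned to the outcome that actually happened
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total

# Hypothetical test outcomes and two sets of predicted probabilities
y_test = [1, 0, 1, 1]
ll_confident = bernoulli_log_likelihood(y_test, [0.9, 0.1, 0.8, 0.7])
ll_vague = bernoulli_log_likelihood(y_test, [0.5, 0.5, 0.5, 0.5])

print(ll_confident, ll_vague)
```

The predictions that put more probability on what actually happened get the higher (less negative) log likelihood.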

6

Mean Absolute Error, Equation? Lower or Higher?

Lower is better. $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|$

7

Mean Absolute Percentage Error, Equation? Lower or Higher?

Lower is better. $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right|$

8

Root Mean Square Error, higher or lower, outliers, equation?

Lower is better. Outliers are heavily penalized because the errors are squared. $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2}$
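
The three error metrics on the preceding cards can be sketched in plain Python as follows. The function names and the numbers are hypothetical; the last observation is deliberately a large miss to show how RMSE reacts to an outlier.

```python
import math

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error (as a fraction; requires every y to be nonzero)."""
    return sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

# Hypothetical actual values and predictions; the third prediction misses by 100
y = [100, 200, 400]
y_hat = [110, 190, 300]

print(mae(y, y_hat), mape(y, y_hat), rmse(y, y_hat))
```

With these numbers the RMSE comes out larger than the MAE, because squaring gives the single large error more weight.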

9

what is R²

How accurately we can estimate the outcome variable with the explanatory variables. $R^2 = 1 - \frac{SSErr}{SSTot}$
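
A short sketch of the computation, assuming the standard definition R² = 1 - SSErr / SSTot. The function name and the grade data are hypothetical.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSErr / SSTot."""
    y_bar = sum(y) / len(y)
    ss_err = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((a - y_bar) ** 2 for a in y)             # total variation around the mean
    return 1 - ss_err / ss_tot

# Hypothetical grades and predictions from a fitted line
y = [62, 67, 71, 78, 82]
y_hat = [61.8, 66.9, 72.0, 77.1, 82.2]
r2 = r_squared(y, y_hat)

# Predicting the average for everyone makes SSErr equal SSTot, so R^2 is 0
baseline = r_squared(y, [72] * 5)

print(r2, baseline)
```

Minimizing SSErr is what pushes R² toward 1, since SSTot depends only on the data.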

10

what is SSErr

Sum of squared errors from the regression. It represents the total amount of variation in Y that we can’t explain with our regression. If the trend line were plotted at the average of Y, the SSE would equal SSTot (the total sum of squares).

11

how do you maximize R²

minimize the SSE loss

12

What is the range of R²

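  • Between 0 and 1 for an OLS regression with an intercept, measured on the data used to fit it

  • Closer to 1 = we can explain a lot of the variation in Y with our regression

  • Closer to 0 = our regression explains little of the variation in Y beyond what the average of Y already captures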

13

how do you interpret the slope

On average, an increase in study time by 1 hour is associated with an increase in grade by 5.2 points, everything else being equal.

14

how do you interpret the intercept

On average, when a student spent 0 hours studying and skipped 0 classes, we expect their grade to be 57 points, everything else being equal.

15

what is a p-value

The probability of seeing data at least as extreme as ours if there were truly no effect/relationship. A low p-value means more confidence that the effect is real.

16

What is OLS and what does it assume?

Ordinary least squares regression. It assumes the relationship between X and Y is linear. Estimates and predictions are denoted with a hat. Coefficients are obtained by minimizing the sum of squared residuals.
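
To make "minimizing the sum of squared residuals" concrete, here is a minimal sketch for a single X variable using the textbook closed-form solution. The function names and the study-hours data are hypothetical.

```python
def fit_ols_line(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_residuals(x, y, intercept, slope):
    """Sum of (actual - predicted)^2 for the line intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data: hours studied and grade
hours = [1, 2, 3, 4, 5]
grades = [62, 67, 71, 78, 82]

intercept_hat, slope_hat = fit_ols_line(hours, grades)
ssr_fitted = sum_squared_residuals(hours, grades, intercept_hat, slope_hat)

# Any other slope gives a larger sum of squared residuals
ssr_steeper = sum_squared_residuals(hours, grades, intercept_hat, slope_hat + 0.5)

print(f"grade_hat = {intercept_hat:.1f} + {slope_hat:.1f} * hours")
```

Moving the slope away from the fitted value in either direction increases the sum of squared residuals, which is what makes the fitted line the OLS solution.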

17

what do you do when x=0 doesn’t make sense

X = 0 could be outside the range of the data, unrealistic, or both. In that case the intercept is an extrapolation rather than a realistic prediction.

18

when are p-values significant?

Statistically significant at significance level alpha (confidence level 1 - alpha) if p-value < alpha

19

Generalized Linear Models

Extends the linear regression approach by allowing the distribution to be non-normal

20

for a change of units: when the variable is in log, the change becomes ____? and if the variable is standardized?

A change in a logged variable is read as a % change; a change in a standardized variable is read in standard deviations.
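
A small sketch of both unit changes, using only the Python standard library. The helper name and all numbers are hypothetical, and the sample standard deviation is an assumption since the card does not say which one is used.

```python
import math
import statistics

def standardize(values):
    """Convert values to z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Going from 100 to 110 is a 10% increase; on the log scale the change is close to 0.10
log_change = math.log(110) - math.log(100)

# After standardizing, a one-unit change means a change of one standard deviation
z_scores = standardize([2, 4, 6])

print(log_change, z_scores)
```

The log difference is only an approximation of the percentage change, and it is closest for small changes.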

21

how do you interpret R²

We can explain 24.5% of the variation in grades by looking at the variation in both the number of hours of study and the number of classes skipped.

22

what is LINE?

linearity, independence, normality (errors), equal variance

23

how does GLM extend linear regression?

It allows the distribution of Y to be non-normal and the mean of Y to be a function of a linear combination of the Xs.

24

what is the inverse of the mean function?

link function

25

what is the identity link used for

linear relationships

26

what is the log link used for

when the mean needs to be positive
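
The link and mean functions from the last few cards can be sketched as pairs of inverse functions. This is an illustration only: the dictionary, the function name, and the coefficients are hypothetical, and only the identity and log links are shown.

```python
import math

# Each entry maps a link name to (link function, mean function).
# The mean function is the inverse of the link function.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}

def predict_mean(link_name, intercept, slope, x):
    """Turn the linear combination of X into a predicted mean of Y."""
    eta = intercept + slope * x          # linear predictor
    _, mean_function = LINKS[link_name]
    return mean_function(eta)

# Hypothetical coefficients that give a negative linear predictor at x = 1
mean_identity = predict_mean("identity", -2.0, 0.5, 1.0)
mean_log = predict_mean("log", -2.0, 0.5, 1.0)

print(mean_identity, mean_log)
```

With the identity link the predicted mean can be negative, while the log link always returns a positive mean, which is why it suits outcomes that must be positive.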

27

what is the power link used for

curved relationships

28

choosing the right distribution for continuous Y what is the normal distribution

a lot of averages, bell shaped, can be negative

29

choosing the right distribution for continuous Y what is the gamma distribution

used for a lot of durations (times), potentially skewed, always positive

30

choosing the right distribution for discrete Y what is the Bernoulli distribution

probability of an event happening, binary, either 0 or 1

31

choosing the right distribution for discrete Y what is the Poisson distribution

used for a lot of counts, non-negative integers

32

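what is the Akaike information criterion

A measure for comparing models that rewards fit and penalizes the number of parameters. It is used when the models being compared have different numbers of variables. Lower is better.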
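
As a sketch, assuming the usual definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and ln(L) is the model's log likelihood. The function name and the two models' numbers are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical models: the larger one fits slightly better but uses 5 more parameters
aic_small = aic(log_likelihood=-120.0, n_params=3)
aic_large = aic(log_likelihood=-119.5, n_params=8)

print(aic_small, aic_large)
```

Here the smaller model has the lower AIC, because the small gain in fit does not make up for the extra parameters.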

33

what is overfitting?

the model is too flexible, great fit on training data, poor fit on new data

34

what is underfitting?

not flexible enough, poor fitting on training and new data

35

consequences of underfitting

bias, poor prediction performance, inability to capture the complexity of some patterns

36

what is regularization?

restricting the flexibility of a model

37

how do you use a dataset to regularize a model

Estimate the model on a training set, adjust the regularization on a validation set, and test prediction performance on a test set.
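
A minimal sketch of the three-way split, using only the standard library. The function name, the 60/20/20 proportions, and the seed are assumptions chosen for illustration.

```python
import random

def train_validation_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows, then cut them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded so the split is reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    validation = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, validation, test

# Hypothetical dataset of 10 row ids
train, validation, test = train_validation_test_split(range(10))

print(len(train), len(validation), len(test))
```

Keeping the test set untouched until the end is what makes its prediction performance an honest estimate for new data.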

38

what do you do with too many variables?

Use dimension reduction, which solves overfitting issues although interpretation is still difficult, or use variable selection to decide which of the extra variables to keep.

39

what is lasso?

Method where variable selection is performed through regularization. It shrinks the coefficients towards 0 and can set some of them exactly to 0, which removes those variables from the model.

40

what does 𝜆 control?

The strength of regularization. If 𝜆 is large, more coefficients are shrunk to exactly 0; if 𝜆 is close to 0, the coefficients stay different from 0.
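
The effect of 𝜆 can be sketched with the soft-thresholding rule. This is a simplified illustration, not a full lasso solver: it assumes orthonormal X variables and the objective ½‖y − Xβ‖² + 𝜆‖β‖₁, in which case each lasso coefficient is the OLS coefficient shrunk toward 0 by 𝜆. The function name and coefficient values are hypothetical.

```python
def soft_threshold(beta_ols, lam):
    """Shrink an OLS coefficient toward 0 by lam; set it to 0 if it is within lam of 0."""
    if beta_ols > lam:
        return beta_ols - lam
    if beta_ols < -lam:
        return beta_ols + lam
    return 0.0

# Hypothetical OLS coefficients for three variables
ols_coefs = [3.0, -1.2, 0.4]

for lam in (0.0, 0.5, 2.0):
    lasso_coefs = [soft_threshold(b, lam) for b in ols_coefs]
    n_zero = sum(1 for b in lasso_coefs if b == 0.0)
    print(f"lambda={lam}: {n_zero} coefficient(s) set to 0")
```

As 𝜆 grows, more coefficients hit exactly 0, which is how lasso performs variable selection.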

41

what are the drawbacks of lasso?

sensitive to x, issues with small datasets, scale sensitivity, loss of interpretability, bias

42

decision trees

create groups based on thresholds on X values
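
A tree with a single split shows the idea. The sketch below tries each midpoint between neighboring X values and keeps the threshold whose two groups have the smallest total squared error around their own averages. The function name and data are hypothetical, squared error is an assumed splitting criterion for a regression tree, and X needs at least two distinct values.

```python
def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the best single split on x."""
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    best = None
    xs = sorted(set(x))
    for low, high in zip(xs, xs[1:]):
        t = (low + high) / 2                      # candidate threshold
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, t, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

# Hypothetical data: hours studied and grade, with two visible groups
hours = [1, 2, 3, 6, 7, 8]
grades = [50, 52, 51, 80, 82, 81]

threshold, low_group_mean, high_group_mean = best_split(hours, grades)
print(threshold, low_group_mean, high_group_mean)
```

Each group's prediction is simply its average Y, and a full tree repeats the same search inside each group.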

43

what are the advantages of decision trees?

don’t need to specify the relation between x and y, works for regression and classification, very easy to explain, mirrors decision making, graphs

44

what are the disadvantages of decision trees?

don’t have the same prediction accuracy as other methods

