Add linear regression info.
parent ea00684636
commit f5c480ad8e
@@ -18,6 +18,93 @@ https://discuss.codecademy.com/t/what-does-it-mean-that-python-is-an-object-orie
SyntaxError example: `SyntaxError: EOL while scanning string literal`
# Linear Regression
One of the projects in my Python 3 course on Codecademy was to find the line of best fit (a linear regression) for a set of data points. This is my journey of doing just that, following their instructions. The goal is to calculate the bounciness of different balls with the least error possible.

Starting with `y = m*x + b`

We can determine the y of a point pretty easily if we have the slope of the line (m) and the intercept (b). Thus, we can write a basic function to calculate the y:

```
def get_y(m, b, x):
    # y = slope * x + intercept
    y = m*x + b
    return y

# both checks should print True
print(get_y(1, 0, 7) == 7)
print(get_y(5, 10, 3) == 25)
```
We can then use that to calculate the error of a line at a single point, that is, the vertical distance between the point's actual y and the y the line predicts for the same x:

```
def calculate_error(m, b, point):
    x_point = point[0]
    y_point = point[1]

    # y value the line predicts for this x
    y2 = get_y(m, b, x_point)

    # absolute difference, so the error is never negative
    y_diff = y_point - y2
    y_diff = abs(y_diff)
    return y_diff
```
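
As a quick check, here is a small sketch with a few hand-picked points (the points and variable names are my own, not taken from the course). It repeats `get_y` and `calculate_error` in a shorter form so it runs on its own:

```python
def get_y(m, b, x):
    return m*x + b

def calculate_error(m, b, point):
    x_point, y_point = point
    return abs(y_point - get_y(m, b, x_point))

# (3, 3) lies on y = x, so the error is 0
on_line = calculate_error(1, 0, (3, 3))
# (3, 4) sits 1 unit above y = x
above_line = calculate_error(1, 0, (3, 4))
# y = x - 1 predicts 2 at x = 3, so (3, 3) is 1 unit away
shifted_line = calculate_error(1, -1, (3, 3))

print(on_line, above_line, shifted_line)
```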
A single point doesn't tell us much about how well a line fits, so we need a function that adds up the error over several points:

```
def calculate_all_error(m, b, points):
    total_error = 0
    for point in points:
        total_error += calculate_error(m, b, point)

    # each individual error is already non-negative, so no abs() is needed here
    return total_error
```
We can test it using their examples:

```
# the same dataset is used for every check
datapoints = [(1, 1), (3, 3), (5, 5), (-1, -1)]

# every point in this dataset lies upon y = x, so the total error should be zero:
print(calculate_all_error(1, 0, datapoints))

# every point in this dataset is 1 unit away from y = x + 1, so the total error should be 4:
print(calculate_all_error(1, 1, datapoints))

# every point in this dataset is 1 unit away from y = x - 1, so the total error should be 4:
print(calculate_all_error(1, -1, datapoints))

# the points in this dataset are 1, 5, 9, and 3 units away from y = -x + 1, respectively,
# so the total error should be 1 + 5 + 9 + 3 = 18
print(calculate_all_error(-1, 1, datapoints))
```
We can build lists of candidate m values from -10 to 10 and candidate b values from -20 to 20, in steps of 0.1 (the upper ends, 10 and 20, are left out because `range` stops just before its end value):

```
# slopes from -10.0 up to about 9.9, in steps of 0.1
possible_ms = [mv * 0.1 for mv in range(-100, 100)]
# intercepts from -20.0 up to about 19.9, in steps of 0.1
possible_bs = [bv * 0.1 for bv in range(-200, 200)]
```
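
One thing worth knowing about these lists: multiplying by 0.1 can leave tiny floating-point rounding errors in the values, which is why a result may print as `0.30000000000000004` instead of `0.3`. Here is a minimal sketch of the effect, with dividing by 10 as one possible alternative (the name `possible_ms_div` is my own, not part of the course code):

```python
# multiplying by 0.1 can pick up rounding error
possible_ms = [mv * 0.1 for mv in range(-100, 100)]
# dividing by 10 gives the closest float to each decimal value
possible_ms_div = [mv / 10 for mv in range(-100, 100)]

# index 103 corresponds to mv = 3
print(possible_ms[103])      # 0.30000000000000004
print(possible_ms_div[103])  # 0.3
print(len(possible_ms), len(possible_ms_div))
```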
Trying every combination of m and b against the data and keeping the one with the smallest total error gives the values below (x is the ball size we want to predict for):

```
m = 0.3
b = 1.7
x = 6
```
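
The search itself isn't written out in these notes, so here is a minimal sketch of how it could look. The dataset is a made-up one whose points all lie on y = 2x + 1 (it is not the course's ball data), the helper name `find_best_line` is my own, and I divide by 10 instead of multiplying by 0.1 so the candidates match the decimal values they are meant to be:

```python
def get_y(m, b, x):
    return m*x + b

def calculate_error(m, b, point):
    x_point, y_point = point
    return abs(y_point - get_y(m, b, x_point))

def calculate_all_error(m, b, points):
    return sum(calculate_error(m, b, point) for point in points)

def find_best_line(points, possible_ms, possible_bs):
    # hypothetical helper: try every (m, b) pair and keep the one
    # with the smallest total error seen so far
    smallest_error = float("inf")
    best_m = None
    best_b = None
    for m in possible_ms:
        for b in possible_bs:
            error = calculate_all_error(m, b, points)
            if error < smallest_error:
                smallest_error = error
                best_m = m
                best_b = b
    return best_m, best_b, smallest_error

possible_ms = [mv / 10 for mv in range(-100, 100)]
possible_bs = [bv / 10 for bv in range(-200, 200)]

# made-up points that sit exactly on y = 2x + 1
datapoints = [(0, 1), (1, 3), (2, 5)]

best_m, best_b, smallest_error = find_best_line(datapoints, possible_ms, possible_bs)
print(best_m, best_b, smallest_error)
```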
The goal was to calculate the bounciness of different balls with the least error possible. With these values of m and b, we can predict how far a given ball would bounce. For example, a 6 cm ball would bounce 3.5 cm. We know this because we can plug in the numbers like this:

```
# 0.3 * 6 + 1.7, which comes to about 3.5
print(get_y(0.3, 1.7, 6))
```
# Math (ex6)
```