ng - Multivariate Regression

Vectorization in the Multivariate Case

In practice, predicting a value $y$ usually requires many input features $x$, so the formula from the previous section can be rewritten as $f_{\vec{w},b}(\vec{x}) = w_1x_1 + w_2x_2 + \cdots + w_nx_n + b$, that is, $f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b$. We can define the two vectors:

```python
import numpy as np

w = np.array([1, 2, 1, 2])
x = np.array([3, 4, 5, 6])
b = 2
f = np.dot(w, x) + b  # vector dot product
```
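As a quick sanity check (reusing the numbers above), the explicit loop and the dot product compute the same value:

```python
import numpy as np

w = np.array([1, 2, 1, 2])
x = np.array([3, 4, 5, 6])
b = 2

# explicit loop: accumulate w[j] * x[j] term by term
f_loop = b
for j in range(len(w)):
    f_loop += w[j] * x[j]

# vectorized: a single dot-product call
f_vec = np.dot(w, x) + b
```

Both evaluate $1\cdot3 + 2\cdot4 + 1\cdot5 + 2\cdot6 + 2 = 30$; the vectorized form delegates the loop to optimized compiled code.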
This runs much faster than a `for` loop. Naturally, we also need to consider how to perform the iterative update in the multivariate case. The one-dimensional update rules generalize to $$ w_{j} = w_{j} - \alpha \frac{\partial}{\partial w_{j}} J(\vec{w}, b) $$
$$ b = b - \alpha \frac{\partial}{\partial b} J(\vec{w}, b) $$ Substituting $J$ and taking the partial derivative: $$ J(\vec{w},b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)^{2} $$ $$ \frac{\partial}{\partial w_{j}} J(\vec{w},b) = \frac{1}{2m} \sum_{i=1}^{m} 2\left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) \frac{\partial}{\partial w_{j}} f_{\vec{w},b}(\vec{x}^{(i)}) $$ $$ = \frac{1}{2m} \sum_{i=1}^{m} 2\left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) \frac{\partial}{\partial w_{j}} \left( \vec{w} \cdot \vec{x}^{(i)} + b \right) $$ $$ = \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) x_{j}^{(i)} $$ which finally gives $$ w_{j} = w_{j} - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) x_{j}^{(i)} $$ and, by the same argument, $$ b = b - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) $$
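The update rules above can be sketched as vectorized NumPy code. This is a minimal illustration with a tiny made-up dataset, not the assignment's implementation:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1000):
    """Batch gradient descent for f(x) = X @ w + b (minimal sketch)."""
    m, n = X.shape
    w = np.zeros(n)
    b = 0.0
    for _ in range(iters):
        err = X @ w + b - y            # prediction error, shape (m,)
        w -= alpha * (X.T @ err) / m   # the w_j update rule, all j at once
        b -= alpha * err.sum() / m     # the b update rule
    return w, b

# hypothetical one-feature data generated from y = 2x + 1
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([3.0, 5.0, 7.0])
w, b = gradient_descent(X, y, alpha=0.1, iters=2000)
```

The recovered parameters should approach $w \approx 2$, $b \approx 1$.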

Feature Scaling

Feature scaling is a change of coordinates (normalization). In physics, dimensionless constants are often introduced to rescale quantities; feature scaling in machine learning applies the same idea. It usually makes gradient descent converge faster, and it avoids the situation where convergence requires many oscillations across an overly sharp valley of the cost surface.
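A minimal sketch of this idea, using hypothetical house-price features on very different scales: z-score normalization subtracts each feature's mean and divides by its standard deviation.

```python
import numpy as np

# hypothetical features: size in sqft and number of bedrooms
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [852.0,  2.0]])

mu = X.mean(axis=0)        # per-feature mean
sigma = X.std(axis=0)      # per-feature standard deviation
X_norm = (X - mu) / sigma  # each column now has mean 0 and std 1
```

After this transformation, both features contribute on comparable scales, so the cost-surface contours are closer to circular and gradient descent takes a more direct path.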

Matrix Representation of Multivariate Regression

For $m$ examples, each with $n$ feature dimensions, we can represent the dataset with the following matrix:

$$ \mathbf{X}=\left(\begin{array}{cccc} x_{0}^{(0)} & x_{1}^{(0)} & \cdots & x_{n-1}^{(0)} \\ x_{0}^{(1)} & x_{1}^{(1)} & \cdots & x_{n-1}^{(1)} \\ \vdots & & & \vdots \\ x_{0}^{(m-1)} & x_{1}^{(m-1)} & \cdots & x_{n-1}^{(m-1)} \end{array}\right) $$ Thus the formula $f_{\vec{w},b}(\vec{x}) = w_0x_0 + w_1x_1 + \cdots + w_{n-1}x_{n-1} + b$ can be rewritten as $f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b$.
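In code, this matrix form means all $m$ predictions come from a single matrix-vector product. The feature values and weights below are hypothetical:

```python
import numpy as np

# m = 3 examples, n = 4 features (hypothetical housing data)
X = np.array([[2104.0, 5.0, 1.0, 45.0],
              [1416.0, 3.0, 2.0, 40.0],
              [852.0,  2.0, 1.0, 35.0]])
w = np.array([0.39, 18.75, -53.36, -26.42])  # hypothetical weights
b = 785.18

f = X @ w + b  # one prediction per row, shape (m,)
```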

Assignment Implementation

Assignment 1

Description: You will use the motivating example of housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age) shown in the table below. Note that, unlike the earlier labs, size is in sqft rather than 1000 sqft. This causes an issue, which you will solve in the next lab!
GitHub link: https://github.com/kaieye/2022-Machine-Learning-Specialization/blob/main/Supervised%20Machine%20Learning%20Regression%20and%20Classification/week2/1.Multiple%20linear%20regression/C1_W2_Lab02_Multiple_Variable_Soln.ipynb
The basic approach is to read the data with pandas and then fit the multivariate regression with scikit-learn. Imports:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error, r2_score
import pandas as pd
import numpy as np
```
Usage: the train_test_split function splits a dataset into training and test sets. It is typically called like this:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```

X and y are the feature and label data, respectively.
The test_size parameter specifies the fraction of the data used as the test set (20% in this example).
The random_state parameter ensures the same split on every run, so experiments are reproducible.
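A quick check of the resulting shapes, on a small hypothetical array:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)  # 10 hypothetical examples, 2 features
y = np.arange(10)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# with test_size=0.2, 2 of the 10 examples go to the test set
```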
Now we read and split the data:
```python
# load the data
data_path = ''  # path to the data file (left blank here)
data = pd.read_csv(data_path)

# assumed column names (adjust to your actual data)
features = ['Number of Bedrooms', 'Number of Floors', 'Age of Home']
target = ['Price']

# separate features and target variable
x = data[features]
y = data[target].values.ravel()  # ravel flattens the DataFrame column to a 1-D array

# split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
```
Then train:
```python
# instantiate and fit the model
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3)
sgd_reg.fit(x_train, y_train)

# print coefficients and intercept
print("Coefficients:", sgd_reg.coef_)
print("Intercept:", sgd_reg.intercept_)
```
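The imports above also pull in mean_squared_error and r2_score, which are not used yet. A sketch of how they could evaluate the fitted model on the held-out test set, with synthetic data standing in for the housing file:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# synthetic linear data in place of the real dataset
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + 5.0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
reg = SGDRegressor(max_iter=1000, tol=1e-3, random_state=42)
reg.fit(X_train, y_train)

# evaluate on the test set: lower MSE and R^2 closer to 1 are better
y_pred = reg.predict(X_test)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R^2:", r2_score(y_test, y_pred))
```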

Assignment 2: Choosing Suitable Coordinates (Feature Scaling)

The key library:

```python
from sklearn.preprocessing import StandardScaler

# prepare the scaled features
scaler = StandardScaler()
x_scaler = scaler.fit_transform(x)
# calling fit on the training data makes StandardScaler compute and store each
# feature's mean and standard deviation, so the same statistics can later be
# applied to transform this dataset or another one (such as the test set)
```
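To make that comment concrete, a small sketch with hypothetical numbers: fit learns the statistics from the training data, and transform reuses them on new data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 100.0],
                    [2.0, 200.0],
                    [3.0, 300.0]])
X_test = np.array([[2.0, 150.0]])  # hypothetical unseen example

scaler = StandardScaler()
scaler.fit(X_train)                       # computes and stores per-feature mean/std
X_test_scaled = scaler.transform(X_test)  # reuses the stored training statistics
```

Scaling the test set with the training statistics, rather than refitting, keeps the two sets in the same coordinate system.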
Final code:
```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
import pandas as pd
from sklearn.preprocessing import StandardScaler

# load the data
data_path = 'data/week2ex2data.txt'
data = pd.read_csv(data_path)

features = ['space', 'number']
target = ['price']
# separate features and target variable
x = data[features]
y = data[target].values.ravel()  # ravel flattens the DataFrame column to a 1-D array
# prepare the scaled features
scaler = StandardScaler()
x_scaler = scaler.fit_transform(x)
# split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x_scaler, y, test_size=0.2, random_state=42)
# instantiate and fit the model
sgd_reg = SGDRegressor(max_iter=1000, tol=1e-3)
sgd_reg.fit(x_train, y_train)

# print coefficients and intercept
print("Coefficients:", sgd_reg.coef_)
print("Intercept:", sgd_reg.intercept_)
```


ng - Multivariate Regression
http://example.com/2024/04/08/机器学习/ng机器学习week2/
Author: bradin
Published: April 8, 2024
License