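I am trying to project 2D data points onto the top principal component of the data's covariance matrix, but my projected points do not lie on the expected vector. My code and output are below. I have looked at other questions asked here, but I still cannot tell what the problem might be. Can anyone tell me what I am doing wrong or spot the error?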
import numpy as np
import matplotlib.pyplot as plt

# generate random 2D data (2 neurons, 40 data points each)
t = 40
r1 = [i for i in range(40)]
r2 = [i+np.random.randint(-20,20) for i in range(40)]
data = np.array([r1,r2]) / 10
# SVD
data_cov = data @ data.T # covariance matrix (<x,x.T>)
u, s, v = np.linalg.svd(data_cov, full_matrices=True) # svd
dir_max_var = v[:,0] # direction of maximal variance
# project data
data_proj = np.ndarray(shape=(2,t))
v_norm = np.sqrt(sum(dir_max_var**2))
for i in range(t):
    data_proj[:,i] = ((data[:,i] @ dir_max_var)/v_norm**2)*dir_max_var
# # project data - vectorized (gives same output as the for loop version)
# data_T = data.T # (40 x 2) * (2 x 1) * (1 x 2) =
# data_proj = data_T @ dir_max_var.reshape(1,2).T @ dir_max_var.reshape(1,2)
# data_proj = data_proj.T
# plot
fig = plt.figure()
ax = fig.add_subplot()
ax.scatter(data[0,:],data[1,:], s=75, c='gray', edgecolors='w')
ax.scatter(data_proj[0,:],data_proj[1,:], s=100,alpha=0.2,c='red', edgecolors='w')
ax.set_xlabel('$r_1$')
ax.set_ylabel('$r_2$')
ax.quiver(np.mean(data[0,:]),np.mean(data[1,:]),dir_max_var[0],dir_max_var[1],
          color='green',scale=1, alpha = 0.6)
ax.quiver(np.mean(data[0,:]),np.mean(data[1,:]),-dir_max_var[0],-dir_max_var[1],
          color='green',scale=1, alpha = 0.6)
plt.show()
Output: (plot image not preserved)
Posted on 2020-04-15 15:13:41
It turns out the axes of the plot were scaled differently. Using plt.axis('equal') fixes the issue.
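For illustration, here is a minimal sketch of how the equal-axis fix might look in a self-contained script. The helper names `project_onto_top_pc` and `plot_projection` are hypothetical and not from the question. The sketch also makes two choices the question's code does not: it mean-centers the data before forming the covariance, and it takes the direction from the first column of `u`. It uses `ax.set_aspect('equal')`, which should be comparable in effect to `plt.axis('equal')`, and a fixed random seed so runs are repeatable.

```python
import numpy as np
import matplotlib.pyplot as plt


def project_onto_top_pc(data):
    """Project the columns of a (2, n) array onto the top principal direction.

    Assumption of this sketch: the data is mean-centered first, which the
    code in the question does not do.
    """
    mean = data.mean(axis=1, keepdims=True)
    centered = data - mean
    cov = centered @ centered.T / (data.shape[1] - 1)  # 2x2 sample covariance
    u, s, vh = np.linalg.svd(cov)
    direction = u[:, 0]  # unit vector; columns of u are the principal directions
    coords = direction @ centered  # scalar coordinate of each point along it
    projected = mean + np.outer(direction, coords)
    return direction, projected


def plot_projection(data, direction, projected):
    """Scatter the data and its projection with equally scaled axes."""
    fig, ax = plt.subplots()
    ax.scatter(data[0], data[1], s=75, c='gray', edgecolors='w')
    ax.scatter(projected[0], projected[1], s=100, alpha=0.2, c='red', edgecolors='w')
    center = data.mean(axis=1)
    # Draw the principal axis as a line through the data mean.
    span = np.abs(direction @ (data - center[:, None])).max()
    ends = center[:, None] + np.outer(direction, [-span, span])
    ax.plot(ends[0], ends[1], color='green', alpha=0.6)
    ax.set_xlabel('$r_1$')
    ax.set_ylabel('$r_2$')
    ax.set_aspect('equal')  # one data unit spans the same length on both axes
    return fig, ax


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
r1 = np.arange(40)
r2 = r1 + rng.integers(-20, 20, size=40)
data = np.array([r1, r2]) / 10

direction, projected = project_onto_top_pc(data)
fig, ax = plot_projection(data, direction, projected)
# plt.show()  # uncomment to display the figure interactively
```

With equal scaling, the red projected points should visibly fall on the green line. Without it, a correct projection can look skewed simply because one data unit covers a different length on each axis.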
https://stackoverflow.com/questions/61219444