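Before you read: I enjoy documenting what I learn, and I reread and revise each post to keep improving its quality. This article is set to followers-only because writing it took real effort. Let's improve together, thanks!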
Problem description
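function: np.logical_and()
bug: while computing the confusion-matrix counts TP, FP, FN and TN for a machine learning task, I wanted to compare each element of two arrays against a value and then AND the two results. No error is raised, but all four counts come out as 0.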
code:
import numpy as np

train_set_y = [0, 1, 0, 1, 1, 1, 0]      # ground-truth labels of the training set
train_predict_y = [0, 1, 0, 1, 1, 1, 1]  # predicted labels for the training set

TP = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 1))
FP = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 1))
FN = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 0))
TN = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 0))
print(f'TP = {TP}, FP = {FP}, TN = {TN}, FN = {FN}')
output:
TP = 0, FP = 0, TN = 0, FN = 0
This is clearly wrong. The expected result is:
TP = 4, FP = 1, TN = 2, FN = 0
Cause analysis
After adding print statements for debugging:
import numpy as np

train_set_y = [0, 1, 0, 1, 1, 1, 0]      # ground-truth labels of the training set
train_predict_y = [0, 1, 0, 1, 1, 1, 1]  # predicted labels for the training set

TP = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 1))
print(np.logical_and(train_set_y == 1, train_predict_y == 1))
FP = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 1))
print(np.logical_and(train_set_y == 0, train_predict_y == 1))
FN = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 0))
print(np.logical_and(train_set_y == 1, train_predict_y == 0))
TN = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 0))
print(np.logical_and(train_set_y == 0, train_predict_y == 0))
print(f'TP = {TP}, FP = {FP}, TN = {TN}, FN = {FN}')
output:
False
False
False
False
TP = 0, FP = 0, TN = 0, FN = 0
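Analysis:
Each np.logical_and() call returns just a single False, and that is where the problem lies.
train_set_y and train_predict_y are both Python lists, so train_set_y == 1 is not an elementwise comparison. It compares the whole list object with the integer 1, and because a list never equals an int, the result is one False.
np.logical_and(False, False) is therefore False, and np.sum() of that is 0.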
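The point about list versus int can be checked in isolation. The following is a minimal sketch, with illustrative names (`labels`, `list_cmp`, `combined`) that are not part of the original code: it shows that `==` between a list and an integer yields one plain bool, so np.logical_and() only ever sees two scalars.

```python
import numpy as np

labels = [0, 1, 0, 1, 1, 1, 0]  # illustrative list, same values as train_set_y

# list == int compares the whole list object with the integer,
# so a single bool comes back instead of a per-element mask
list_cmp = (labels == 1)
print(type(list_cmp), list_cmp)

# logical_and of two scalar False values is a single False, and its sum is 0
combined = np.logical_and(list_cmp, list_cmp)
print(combined, np.sum(combined))
```

No exception is raised at any step, which is why the bug passes silently.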
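Solution 1
According to the official documentation, the x1 and x2 arguments of np.logical_and() are array_like, so they accept an ndarray as well as a list.
If the difference between list and ndarray is unclear, see my earlier post: 《Numpy及list与array对比》
Fix:
Do the comparisons on ndarrays instead of lists.
The code becomes: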
import numpy as np

train_set_y = np.array([0, 1, 0, 1, 1, 1, 0])      # ground-truth labels of the training set
train_predict_y = np.array([0, 1, 0, 1, 1, 1, 1])  # predicted labels for the training set

TP = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 1))
print(np.logical_and(train_set_y == 1, train_predict_y == 1))
FP = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 1))
print(np.logical_and(train_set_y == 0, train_predict_y == 1))
FN = np.sum(np.logical_and(train_set_y == 1, train_predict_y == 0))
print(np.logical_and(train_set_y == 1, train_predict_y == 0))
TN = np.sum(np.logical_and(train_set_y == 0, train_predict_y == 0))
print(np.logical_and(train_set_y == 0, train_predict_y == 0))
print(f'TP = {TP}, FP = {FP}, TN = {TN}, FN = {FN}')
output:
[False  True False  True  True  True False]
[False False False False False False  True]
[False False False False False False False]
[ True False  True False False False False]
TP = 4, FP = 1, TN = 2, FN = 0
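The result is now correct. So why does it work with ndarrays?
Remember NumPy broadcasting? My guess is that the 1 in == 1 is broadcast to an ndarray of the same shape, so every element is checked against 1 and the comparison returns a boolean array.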
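The broadcasting guess is easy to verify. The sketch below uses illustrative names (`y_true`, `mask`, `y_pred_list`, `y_pred`, `tp`) and assumes the data may already exist as a list, in which case np.asarray() converts it before the comparison.

```python
import numpy as np

y_true = np.array([0, 1, 0, 1, 1, 1, 0])

# The scalar 1 is broadcast across the array: one bool per element
mask = (y_true == 1)
print(mask.shape, mask.dtype)

# An existing list can be converted first, then compared elementwise
y_pred_list = [0, 1, 0, 1, 1, 1, 1]
y_pred = np.asarray(y_pred_list)
tp = int(np.sum(np.logical_and(y_true == 1, y_pred == 1)))
print(tp)
```

The mask has the same shape as the input, which is what lets np.sum() count the matching positions.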
Solution 2
What if you insist on passing lists? Then skip the NumPy function and brute-force it with a for loop:
import numpy as np

train_set_y = [0, 1, 0, 1, 1, 1, 0]      # ground-truth labels of the training set
train_predict_y = [0, 1, 0, 1, 1, 1, 1]  # predicted labels for the training set

TP = 0
FP = 0
FN = 0
TN = 0

# Lists are passed here; ndarrays would work as well
for i, j in zip(train_set_y, train_predict_y):
    if i == 1 and j == 1:
        TP += 1
    elif i == 0 and j == 1:
        FP += 1
    elif i == 1 and j == 0:
        FN += 1
    elif i == 0 and j == 0:
        TN += 1
print(f'TP = {TP}, FP = {FP}, TN = {TN}, FN = {FN}')
output:
TP = 4, FP = 1, TN = 2, FN = 0
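If the data stays as plain lists, the same tally can also be written with collections.Counter from the standard library. This is a variation on the loop above rather than the approach used in this post, and the names `pair_counts`, `tp`, `fp`, `tn`, `fn` are illustrative.

```python
from collections import Counter

train_set_y = [0, 1, 0, 1, 1, 1, 0]      # ground-truth labels
train_predict_y = [0, 1, 0, 1, 1, 1, 1]  # predicted labels

# Count each (truth, prediction) pair in a single pass
pair_counts = Counter(zip(train_set_y, train_predict_y))

# A Counter returns 0 for pairs that never occur, so no key check is needed
tp = pair_counts[(1, 1)]
fp = pair_counts[(0, 1)]
tn = pair_counts[(0, 0)]
fn = pair_counts[(1, 0)]
print(f'TP = {tp}, FP = {fp}, TN = {tn}, FN = {fn}')
```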
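Summary:
1. To compare each element of two arrays against a value and then AND the results with np.logical_and(), do the comparisons on ndarrays. With lists, list == 1 collapses to a single False and every count ends up 0.
2. A for loop works with either a list or an ndarray, because both are iterable.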
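Both points can be combined in one small helper. The function below is a hypothetical sketch (the name `confusion_counts` and its return order are assumptions, not something from NumPy): it converts its inputs with np.asarray(), so callers can pass either lists or ndarrays without hitting the silent all-zero result.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary 0/1 labels given as lists or ndarrays."""
    # np.asarray leaves ndarrays as they are and converts lists,
    # so the == comparisons below are always elementwise
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    tp = int(np.sum(np.logical_and(y_true == 1, y_pred == 1)))
    fp = int(np.sum(np.logical_and(y_true == 0, y_pred == 1)))
    tn = int(np.sum(np.logical_and(y_true == 0, y_pred == 0)))
    fn = int(np.sum(np.logical_and(y_true == 1, y_pred == 0)))
    return tp, fp, tn, fn

# Plain lists work because of the conversion inside the helper
print(confusion_counts([0, 1, 0, 1, 1, 1, 0], [0, 1, 0, 1, 1, 1, 1]))
```

With the sample data from this post, the call prints (4, 1, 2, 0), matching the expected TP, FP, TN and FN.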
Writing takes effort, so thanks for the likes!