OvR

Building on binary logistic regression, we use the One-vs-Rest (OvR) method for multi-class classification: one binary classifier is trained per class, and the final prediction is the class whose classifier outputs the highest probability. In the earlier binary experiments, accuracy and coverage for versicolor were poor and the virginica classifier was also unsatisfactory; only setosa was classified reliably. After combining the three classifiers with the OvR decision rule, classification accuracy improves dramatically, reaching about 95%.
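Concretely, the OvR decision is just an argmax over the three sigmoid scores. Here is a minimal sketch of that rule; the weight values in `Theta` below are made up purely for illustration, and the full script that actually learns them follows:

```python
# Minimal sketch of the OvR decision rule. The Theta values are
# made-up placeholders; the full script below learns the real ones.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

Theta = np.array([[0.5, -1.2, 2.0, -0.8, 1.1],    # setosa-vs-rest weights (illustrative)
                  [-0.3, 0.7, -0.5, 1.4, -0.9],   # versicolor-vs-rest weights
                  [0.1, -0.4, 0.9, 2.2, 0.6]])    # virginica-vs-rest weights
xi = np.array([1.0, 0.2, -0.1, 0.3, 0.4])         # one sample: bias term 1.0 plus 4 normalized features

probs = expit(Theta @ xi)          # one sigmoid score per binary classifier
predicted = int(np.argmax(probs))  # final class = classifier with the highest score
print(probs, predicted)
```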

```python
# Author: frdsb
# In this code, let's change the method to Gradient descent. We start from single feature.
import csv
import numpy as np
import matplotlib.pyplot as pl
import random
import scipy.special

# Import Iris raw data
def ImportData(filePath):
    X = []  # 2D list [[1, x11, x12, ...], [], ...]: each item holds the bias term plus the feature columns of one row
    Y = []  # 1D list that stands for the class of each row
    with open(filePath) as f:
        r = csv.reader(f, delimiter=',')
        next(r)  # Skip header row
        for row in r:
            rowX = [1.0]
            for i in row[1:5]:
                rowX.append(float(i))
            X.append(rowX)
            Y.append(row[5])
    return (X, Y)

# Randomize the order of the raw data
def RandomShuffle(X, Y, flag):
    if flag:
        for i, j in zip(X, Y):  # temporarily attach each label to its row so they shuffle together
            i.append(j)
        random.shuffle(X)
        Y = []  # clear Y, then rebuild it from the shuffled rows
        for i in X:
            Y.append(i[-1])
            i.pop(-1)
    return (X, Y)

# Format to numpy
def FormatNumpy(X):
    Xformat = np.array(X[:])  # change to numpy format
    return Xformat

# Refill classification: change string labels to 1 or 0
def RefillClass(Y, char_positive):
    tempY = []
    for i, e in enumerate(Y):
        if Y[i] == char_positive:
            tempY.append(1.0)
        else:
            tempY.append(0.0)
    return tempY

# Normalization to keep all the features on the same scale
def Normalization(X):
    for i in range(5):
        if i != 0:  # skip column 0, the constant bias term
            X[:, i] = (X[:, i] - X[:, i].mean()) / (X[:, i].max() - X[:, i].min())
    return X

# Define function to calculate Htheta (the sigmoid hypothesis)
def Htheta(Theta, Xi):
    z = np.dot(Theta, Xi)
    result = scipy.special.expit(z)
    return result

# Define gradient descent function to calculate new Theta
def GradientDescent(Theta, X, Y, m, n, alpha):
    tempTheta = np.zeros(n)
    for i in range(n):
        gradient = 0.0
        for j in range(m):
            gradient += (Htheta(Theta, X[j]) - Y[j]) * X[j][i]
        gradient *= alpha
        tempTheta[i] = gradient
    # update all theta values in one step
    for i in range(n):
        Theta[i] -= tempTheta[i]
    return Theta

# Define function to calculate cost (cross-entropy loss)
def CostFunc(Theta, X, Y, m):
    error = 0
    for i in range(m):
        calResult = Htheta(Theta, X[i])
        error += Y[i] * np.log(calResult) + (1 - Y[i]) * np.log(1 - calResult)
    error = 0 - error / m
    return error

# Define a function to evaluate accuracy & coverage
def EvaluateFunc(Prediction, Sample):
    count_prediction1 = 0
    count_sample1 = 0
    error1 = 0
    error2 = 0
    for i, j in zip(Prediction, Sample):
        if i == 1:
            count_prediction1 += 1
            if j == 0:
                error1 += 1
        if j == 1:
            count_sample1 += 1
            if i == 0:
                error2 += 1
    accuracy = 1 - error1 / count_prediction1
    coverage = 1 - error2 / count_sample1
    return accuracy, coverage

# Step 1, initialization
## Step 1.1 Import raw data and randomly shuffle the order
Xvalue, Yvalue = ImportData(r"C:\Users\64134\PycharmProjects\pythonProject\LogisticRegression\Data\iris.csv")
Xvalue, Yvalue = RandomShuffle(Xvalue, Yvalue, True)
## Step 1.2 Format to Numpy
X = FormatNumpy(Xvalue)
Y = FormatNumpy(Yvalue)
## Step 1.3 Normalization to keep all features on the same scale
X = Normalization(X)
## Step 1.4 Initialize parameters
step = 10000  # Number of iteration steps
n = 5  # Number of features, including feature 0 (the bias term)
Theta = np.ones((3, n))  # One theta vector (numpy array) per binary classifier: Theta = [[Theta0], [Theta1], [Theta2]]
cost = np.ones(3)
alpha = 0.001
costRecord = np.zeros((3, step))  # Record cost at each step for monitoring
ThetaRecord = np.zeros((3, step, n))  # Record theta at each step for monitoring
length = len(Yvalue)
testLength = 20  # Hold out the last 20 rows for testing
m = length - testLength  # m is the number of samples for training
irisClasses = ['setosa', 'versicolor', 'virginica']

# Step 2 minimize cost: train one one-vs-rest classifier per class
for num in range(3):
    Yfill01 = RefillClass(Y, irisClasses[num])  # 1 for the current class, 0 for the rest
    for i in range(step):
        cost[num] = CostFunc(Theta[num], X[:length - testLength], Yfill01[:length - testLength], m)  # current cost
        costRecord[num][i] = cost[num]  # record current cost
        Theta[num] = GradientDescent(Theta[num], X[:length - testLength], Yfill01[:length - testLength], m, n, alpha)
        ThetaRecord[num][i] = Theta[num]
    print('num = ' + '%.2f' % num)
    print('Final Theta' + '%s' % num + ' = ')  # show final result of Theta
    print(ThetaRecord[num][i])
    print('Final cost' + '%s' % num + ' = ' + '%.2f' % cost[num])  # show final result of cost

# Step 3 test
Xtest = X[length - testLength:]
Yraw = Y[length - testLength:]
Yresult = np.zeros((3, testLength))
Ypredict = []
for num in range(3):
    Ycal = np.dot(Xtest, Theta[num])
    for i in range(testLength):
        Yresult[num][i] = scipy.special.expit(Ycal[i])
    print('Class' + '%s' % str(num) + '= ')
    print(Yresult[num])
# Final decision: pick the class whose one-vs-rest classifier gives the highest probability
for i in range(testLength):
    maxProb = 0
    num = 0
    for j in range(3):
        if Yresult[j][i] > maxProb:
            maxProb = Yresult[j][i]
            num = j
    Ypredict.append(num)
print(Yraw)
print(Ypredict)

# Step 4 evaluation
# define parameters to evaluate the algorithm
# accuracy - of all samples predicted to be 1, how many are actually 1
# coverage - whether all samples that are 1 were predicted correctly
#accuracy, coverage = EvaluateFunc(Ycal_01, Yraw)
#print('accuracy = ' + '%.2f%%' % (accuracy * 100))
#print('coverage = ' + '%.2f%%' % (coverage * 100))
error = 0
errorPosition = []
for i in range(testLength):
    if Ypredict[i] == 0:
        if Yraw[i] == 'setosa':
            pass
        else:
            error += 1
            errorPosition.append(i)
    elif Ypredict[i] == 1:
        if Yraw[i] == 'versicolor':
            pass
        else:
            error += 1
            errorPosition.append(i)
    elif Ypredict[i] == 2:
        if Yraw[i] == 'virginica':
            pass
        else:
            error += 1
            errorPosition.append(i)
accuracy = 1 - error / testLength
print('accuracy = ' + '%.2f%%' % (accuracy * 100))
print(errorPosition)

# Step 5 plot
#fig, ax1 = pl.subplots()  # Generate figure object and axes
#ax1.set(xlim=[1, step], ylim=[0, costRecord.max()], title='cost vs step')  # This sub-figure shows the convergence of cost
#ax1.plot(list(range(step)), costRecord)
#pl.show()
```
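As a sanity check on the ~95% figure, the same one-vs-rest scheme can be reproduced with scikit-learn's explicit OvR wrapper. This is not part of the original script and assumes scikit-learn is installed; it is only a sketch for comparison:

```python
# Cross-check with scikit-learn's one-vs-rest wrapper.
# Assumes scikit-learn is installed; not part of the original script.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
# Same split idea as above: shuffle, then hold out 20 samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, shuffle=True)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
print('accuracy = %.2f%%' % (clf.score(X_test, y_test) * 100))
```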