如何绘制Python中多项式逻辑回归的决策边界?
我查看了此网站上的示例: https://scipython.com/blog/plotting-the-decision-boundary-of-a-logistic-regression-model/
我了解他们如何绘制线性特征向量的决策边界。 我将如何绘制决策边界呢
from sklearn.preprocessing import PolynomialFeatures
...
poly = PolynomialFeatures(degree = 3, interaction_only=False, include_bias=False)
X_poly = poly.fit_transform(X)
# Fit the data to a logistic regression model.
clf = sklearn.linear_model.LogisticRegression()
clf.fit(X_poly, Y)
但是,如果我申请获得弯曲的决策边界, ? (我知道这对于网站上的示例没有多大意义,但谈论它可能更容易)。
我尝试通过叠加多项式图来绘制生成的多项式决策边界,但只得到如下奇怪的结果:
那么我怎样才能制作一个弯曲的决策边界图呢?
编辑后的代码:
from sklearn.preprocessing import PolynomialFeatures
import numpy as np
import matplotlib.pyplot as plt
import sklearn.linear_model
plt.rc('text', usetex=True)
plt.figure(dpi=1200)
pts = np.loadtxt(r'C:\Users\stefa\OneDrive\Desktop\linpts.txt')
X = pts[:,:2]
Y = pts[:,2].astype('int')
poly = PolynomialFeatures(degree = 2, interaction_only=False, include_bias=False)
X_poly = poly.fit_transform(X)
# Fit the data to a logistic regression model.
clf = sklearn.linear_model.LogisticRegression()
clf.fit(X_poly, Y)
# Retrieve the model parameters.
b = clf.intercept_[0]
w1, w2,w3,w4,w5 = clf.coef_.T
# In[]
def PolyCoefficients(x, coeffs):
""" Returns a polynomial for ``x`` values for the ``coeffs`` provided.
The coefficients must be in ascending order (``x**0`` to ``x**o``).
"""
o = len(coeffs)
print(f'# This is a polynomial of order {ord}.')
y = 0
for i in range(o):
y += coeffs[i]*x**i
return y
x = np.linspace(0, 9, 100)
coeffs = [b, w1, w2, w3, w4, w5]
plt.plot(x, PolyCoefficients(x, coeffs))
plt.show()
# In[]
# Calculate the intercept and gradient of the decision boundary.
c = -b/w2
m = -w1/w2
# Plot the data and the classification with the decision boundary.
xmin, xmax = -1, 2
ymin, ymax = -1, 2.5
xd = np.array([xmin, xmax])
yd = m*xd + c
#plt.plot(xd, yd, 'k', lw=1, ls='--')
plt.plot(x, PolyCoefficients(x, coeffs))
plt.fill_between(xd, yd, ymin, color='tab:blue', alpha=0.2)
plt.fill_between(xd, yd, ymax, color='tab:orange', alpha=0.2)
plt.scatter(*X[Y==0].T, s=8, alpha=0.5)
plt.scatter(*X[Y==1].T, s=8, alpha=0.5)
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.ylabel(r'$x_2$')
plt.xlabel(r'$x_1$')
plt.show()
I have looked into the example on this website: https://scipython.com/blog/plotting-the-decision-boundary-of-a-logistic-regression-model/
I understand how they plot the decision boundary for a linear feature vector. But how would I plot the decision boundary if I apply
from sklearn.preprocessing import PolynomialFeatures
...
poly = PolynomialFeatures(degree = 3, interaction_only=False, include_bias=False)
X_poly = poly.fit_transform(X)
# Fit the data to a logistic regression model.
clf = sklearn.linear_model.LogisticRegression()
clf.fit(X_poly, Y)
to get a curved decision boundary? (I know it doesnt make a lot of sense for the example on the webiste, but it may be easier to talk about it).
I have tried to plot the resulting polynomial decision boundary by overlaying the polynomial plot but only got weird results like this:
So how could I do a curved decision boundary plot?
the edited code:
from sklearn.preprocessing import PolynomialFeatures
import numpy as np
import matplotlib.pyplot as plt
import sklearn.linear_model
plt.rc('text', usetex=True)
plt.figure(dpi=1200)
pts = np.loadtxt(r'C:\Users\stefa\OneDrive\Desktop\linpts.txt')
X = pts[:,:2]
Y = pts[:,2].astype('int')
poly = PolynomialFeatures(degree = 2, interaction_only=False, include_bias=False)
X_poly = poly.fit_transform(X)
# Fit the data to a logistic regression model.
clf = sklearn.linear_model.LogisticRegression()
clf.fit(X_poly, Y)
# Retrieve the model parameters.
b = clf.intercept_[0]
w1, w2,w3,w4,w5 = clf.coef_.T
# In[]
def PolyCoefficients(x, coeffs):
""" Returns a polynomial for ``x`` values for the ``coeffs`` provided.
The coefficients must be in ascending order (``x**0`` to ``x**o``).
"""
o = len(coeffs)
print(f'# This is a polynomial of order {ord}.')
y = 0
for i in range(o):
y += coeffs[i]*x**i
return y
x = np.linspace(0, 9, 100)
coeffs = [b, w1, w2, w3, w4, w5]
plt.plot(x, PolyCoefficients(x, coeffs))
plt.show()
# In[]
# Calculate the intercept and gradient of the decision boundary.
c = -b/w2
m = -w1/w2
# Plot the data and the classification with the decision boundary.
xmin, xmax = -1, 2
ymin, ymax = -1, 2.5
xd = np.array([xmin, xmax])
yd = m*xd + c
#plt.plot(xd, yd, 'k', lw=1, ls='--')
plt.plot(x, PolyCoefficients(x, coeffs))
plt.fill_between(xd, yd, ymin, color='tab:blue', alpha=0.2)
plt.fill_between(xd, yd, ymax, color='tab:orange', alpha=0.2)
plt.scatter(*X[Y==0].T, s=8, alpha=0.5)
plt.scatter(*X[Y==1].T, s=8, alpha=0.5)
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.ylabel(r'$x_2
)
plt.xlabel(r'$x_1
)
plt.show()
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
让我生成一个演示数据。
除了随机性之外,数据看起来像这样。
像您一样训练模型。
这告诉我们特征的排序为:
x1, x2, x1^2, x1*x2, x2^2
。因此,收集系数和截距并给它们提供直观的名称。
根据定义,决策边界是一组 (x1, x2),使得两个类之间的概率均匀。从数学上讲,它们是以下问题的解:
如果我们固定
x1
,那么这是一个x2
的二次方程,我们可以解析求解。下面的函数可以完成这项工作。现在我们有了数据边界,我们可以轻松地将它们可视化。
请注意,由于模型是二次模型,因此可以显式计算边界。更通用、更复杂的分类器需要不同的方法。
一种更简单、普遍适用的方法是创建包含各种变量组合的虚拟数据,让分类器进行预测,并使用预测类别给出的颜色进行绘图。
Let me generate a demo data.
Apart from the randomness, the data looks something like this.
Train the model like you did.
This tells us that the features are ordered as:
x1, x2, x1^2, x1*x2, x2^2
.So collect the coefficients and the intercept and give them intuitive names.
By definition, the decision boundary is a set of (x1, x2) such that the probability is even between the two classes. Mathematically, they are the solutions to:
If we fix
x1
, then this is a quadratic equation ofx2
, which we can solve analytically. The following function does this job.Now we have the boundaries as data, we can visualize them easily.
Notice that the boundaries can be explicitly computed because the model is quadratic. Different approaches are needed for more general, complex classifiers.
An easier, generally applicable approach is to create dummy data containing various combination of variables and let the classifier predict, and plot with the color given by the predicted class.