训练和加载汽车模型之间的不同结果

发布于 2025-01-24 10:44:07 字数 1898 浏览 0 评论 0原文

我训练了带有汽车的回归模型，导致一个模型，其MAE为0.2，该代码为输入和输出数据：

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
search = StructuredDataRegressor(max_trials=1000, loss='mean_squared_error', max_model_size=100000000000, overwrite = True)
search.fit(x=X_train, y=y_train, verbose=2, validation_data=(X_test, y_test))
model = search.export_model()
model.summary()
model.save('model_best')

将我的数据添加到模型中，可提供约30个MAE，并提供了相当胡说的预测。我的测试输出值在3到10的范围内，预测的输出值在-10至5范围内。

model = load_model("model_best2", custom_objects=ak.CUSTOM_OBJECTS)
mae, _ = model.evaluate(x, y, verbose=2)
print('MAE: %.3f' % mae)

这些结果可与Autokeras的任何提供的模型一起重现。您是否有任何线索，为什么培训和评估结果完全不同？

我创建了一个最小的示例，该示例正在提供类似的不良结果，因此您可以自己尝试：

from numpy import asarray
from pandas import read_csv
from sklearn.model_selection import train_test_split
from autokeras import StructuredDataRegressor
import matplotlib.pyplot as plt
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
search.fit(x=X_train, y=y_train, verbose=2)

mae, _ = search.evaluate(X_test, y_test, verbose=2)
print('MAE: %.3f' % mae)

predictions = search.predict(X)

miny = float(y.min())
maxy = float(y.max())
minp = float(min(predictions))
maxp = float(max(predictions))
plt.figure(figsize=(15,15))
plt.scatter(y, predictions, c='crimson',s=5)
p1 = max(maxp, maxy)
p2 = min(minp, miny)
plt.plot([p1, 0], [p1, 0], 'b-')
plt.xlabel('True Values', fontsize=15)
plt.ylabel('Predictions', fontsize=15)
plt.axis('equal')
plt.show()

原文

I trained a regression-model with autokeras resulting in a model with a MAE of 0.2 with that code, where x and y were input and output-dataframes:

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
search = StructuredDataRegressor(max_trials=1000, loss='mean_squared_error', max_model_size=100000000000, overwrite = True)
search.fit(x=X_train, y=y_train, verbose=2, validation_data=(X_test, y_test))
model = search.export_model()
model.summary()
model.save('model_best')

Refeeding my data to the model delivers a MAE of about 30 with pretty nonsense predictions. My test-output values are in the range of 3 to 10, predicted output-values are in the range of -10 to 5.

model = load_model("model_best2", custom_objects=ak.CUSTOM_OBJECTS)
mae, _ = model.evaluate(x, y, verbose=2)
print('MAE: %.3f' % mae)

Those results are reproducible with any provided model from autokeras. Do you have any clue why training and evaluation results are totally different?

I created a minimal example which is delivering similar bad results so you can try on your own:

from numpy import asarray
from pandas import read_csv
from sklearn.model_selection import train_test_split
from autokeras import StructuredDataRegressor
import matplotlib.pyplot as plt
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
search.fit(x=X_train, y=y_train, verbose=2)

mae, _ = search.evaluate(X_test, y_test, verbose=2)
print('MAE: %.3f' % mae)

predictions = search.predict(X)

miny = float(y.min())
maxy = float(y.max())
minp = float(min(predictions))
maxp = float(max(predictions))
plt.figure(figsize=(15,15))
plt.scatter(y, predictions, c='crimson',s=5)
p1 = max(maxp, maxy)
p2 = min(minp, miny)
plt.plot([p1, 0], [p1, 0], 'b-')
plt.xlabel('True Values', fontsize=15)
plt.ylabel('Predictions', fontsize=15)
plt.axis('equal')
plt.show()

分享到QQ

分享到微博