Why does deploying a tidymodel with vetiver throw an error when there is a variable with role as ID?
I'm unable to deploy a tidymodel with vetiver and get a prediction when the model includes a variable with the role ID in the recipe. The API returns the following error:
{
"error": "500 - Internal server error",
"message": "Error: The following required columns are missing: 'Fake_ID'.\n"
}
The code for the dummy example is below.
Do I need to remove the ID-variable from both the model and recipe to make the Plumber API work?
# Load libraries
library(recipes)
library(parsnip)
library(workflows)
library(pins)
library(plumber)
library(stringi)
library(vetiver)  # needed for vetiver_pin_write() and vetiver_api() below

# Load data
data(Sacramento, package = "modeldata")

# Create fake IDs for testing
Sacramento$Fake_ID <- stri_rand_strings(nrow(Sacramento), 10)

# Train model
Sacramento_recipe <-
  recipe(formula = price ~ type + sqft + beds + baths + zip + Fake_ID,
         data = Sacramento) %>%
  update_role(Fake_ID, new_role = "ID") %>%
  step_zv(all_predictors())

rf_spec <- rand_forest(mode = "regression") %>% set_engine("ranger")

rf_fit <-
  workflow() %>%
  add_model(rf_spec) %>%
  add_recipe(Sacramento_recipe) %>%
  fit(Sacramento)

# Create vetiver object
v <- vetiver_model(rf_fit, "sacramento_rf")
v

# Allow for model versioning and sharing
model_board <- board_temp()
model_board %>% vetiver_pin_write(v)

# Deploy model
pr() %>%
  vetiver_api(v) %>%
  pr_run(port = 8088)
As of today, vetiver looks for the "mold" with workflows::extract_mold(rf_fit) and pulls out only the predictors to create the ptype. But when you predict from a workflow, it requires all the variables, including non-predictors. If you have trained a model with non-predictors, as of today you can make the API work by passing in a custom ptype.
Created on 2022-03-10 by the reprex package (v2.0.1)
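The answer's own reprex code is not preserved on this page, so what follows is only a minimal sketch of the custom ptype workaround it describes, not the answerer's exact code. It assumes a recent vetiver release in which the argument of `vetiver_model()` is named `save_prototype` (releases from around the time of the answer called it `save_ptype`). The names `input_cols` and `sac_ptype` are hypothetical, chosen for illustration.

```r
library(recipes)
library(parsnip)
library(workflows)
library(stringi)
library(vetiver)

# Same dummy data and model as in the question
data(Sacramento, package = "modeldata")
Sacramento$Fake_ID <- stri_rand_strings(nrow(Sacramento), 10)

Sacramento_recipe <-
  recipe(price ~ type + sqft + beds + baths + zip + Fake_ID,
         data = Sacramento) %>%
  update_role(Fake_ID, new_role = "ID") %>%
  step_zv(all_predictors())

rf_fit <-
  workflow() %>%
  add_model(rand_forest(mode = "regression") %>% set_engine("ranger")) %>%
  add_recipe(Sacramento_recipe) %>%
  fit(Sacramento)

# Hypothetical helper: the columns the deployed API should expect,
# i.e. the predictors plus the non-predictor ID column
input_cols <- c("type", "sqft", "beds", "baths", "zip", "Fake_ID")

# Zero-row prototype that keeps the column types (and factor levels)
sac_ptype <- vctrs::vec_ptype(tibble::as_tibble(Sacramento[input_cols]))

# Assumption: `save_prototype` is the argument name in the installed
# vetiver version; older releases used `save_ptype` instead
v <- vetiver_model(rf_fit, "sacramento_rf", save_prototype = sac_ptype)

# Deployment is unchanged from the question (blocks the R session):
# plumber::pr() %>% vetiver_api(v) %>% plumber::pr_run(port = 8088)
```

With a prototype like this, requests to the prediction endpoint are expected to include a `Fake_ID` field alongside the predictors, which matches what the workflow asks for at prediction time.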
Are you training models for production with non-predictor variables? Would you mind opening an issue on GitHub to explain your use case a little more?