Can I save an ALS model?


from pyspark.ml.recommendation import ALS, ALSModel
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.mllib.evaluation import RegressionMetrics, RankingMetrics
from pyspark.ml.evaluation import RegressionEvaluator

als = ALS(maxIter=15,
          regParam=0.08,
          userCol="ID User",
          itemCol="ID Film",
          ratingCol="Rating",
          rank=20,
          numItemBlocks=30,
          numUserBlocks=30,
          alpha=0.95,
          nonnegative=True,
          coldStartStrategy="drop",
          implicitPrefs=False)
model = als.fit(training_dataset)

model.save('model')

Every time I call the save method, the Jupyter notebook gives me an error like this:

An error occurred while calling o477.save.
: org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:106)

I'm aware of the previous SO questions and answers and have tried the following:

model.save('model')

model.write().save("saved_model")

als.write().save("saved_model")

als.save('model')

import pickle
s = pickle.dumps(als)

als_path = "from_C:Folder_to_my_project_root" + "/als"
als.save(als_path)

My question is: how do I save the ALS model so that I can load it without retraining every time I run the program?
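In other words, the round trip I'm after looks like this sketch (als and training_dataset are defined above; test_dataset and the "saved_model" path are placeholders):

from pyspark.ml.recommendation import ALSModel

# Train once, then persist the fitted model; overwrite() avoids
# "path already exists" failures on repeated runs.
model = als.fit(training_dataset)
model.write().overwrite().save("saved_model")

# On later runs, skip training and load the fitted model back.
model = ALSModel.load("saved_model")
predictions = model.transform(test_dataset)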



Comments (2)

瀟灑尐姊 2025-02-12 05:17:29

I used to run into this problem when generating recommendations for the Netflix Prize dataset, about 100 million records in total. What I did: try running on 50% of the data, then slowly increase the percentage and see where it breaks. In my case it slowly worked its way up to 100% of the data. Closing unnecessary Chrome tabs also helps.
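A rough sketch of that incremental approach (ratings_df stands for your full ratings DataFrame, and the fractions are just examples; als is the estimator from the question):

# Train on a growing sample to find the size at which the job breaks.
for fraction in (0.5, 0.75, 1.0):
    sample = ratings_df.sample(fraction=fraction, seed=42)
    print(f"Training on {fraction:.0%} of the data ({sample.count()} rows)")
    model = als.fit(sample)
    model.write().overwrite().save(f"als_model_{int(fraction * 100)}")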

萌酱 2025-02-12 05:17:29

Basically, an o477 error (and oXXX errors in general) means something went wrong while executing the job. Since it seems you're doing movie recommendation, I assume you're using the MovieLens or Netflix dataset.
It can mean one of these:

  1. The file is too big and can't be pickled
  2. The model is too complex and you run out of memory (see the configuration sketch below)
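If it's the memory case, one thing to try is giving Spark more room when the session is created. A rough sketch with illustrative sizes, not tuned values:

from pyspark.sql import SparkSession

# Set these before the session (and its JVM) is created; in an
# already-running notebook session spark.driver.memory is ignored.
spark = (SparkSession.builder
         .appName("als-recommender")
         .config("spark.driver.memory", "8g")
         .config("spark.executor.memory", "8g")
         .config("spark.driver.maxResultSize", "4g")
         .getOrCreate())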