Is there a way to include a custom regression metric in ModelQualityMonitor in AWS SageMaker?
I have successfully initialized a ModelQualityMonitor object.
Then I created a monitoring schedule using the CreateMonitoringSchedule API. In the background, SageMaker runs two processing jobs that merge the ground truth data with the collected endpoint data and then analyze the result to produce the predefined regression metrics:
https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html
Unfortunately, MAPE (Mean Absolute Percentage Error) is missing from these metrics, and I would like to add it in the future (in CloudWatch as well).
SageMaker provides the following functionality:
- Preprocessing and Postprocessing: In addition to using the built-in mechanisms, you can extend the code with preprocessing and postprocessing scripts.
- Bring Your Own Containers: Amazon SageMaker Model Monitor provides a prebuilt container with the ability to analyze the data captured from endpoints for tabular datasets. If you would like to bring your own container, Model Monitor provides extension points which you can leverage.
- CloudWatch Metrics for Bring Your Own Containers
Those points are documented on this site: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-custom-monitoring-schedules.html
How exactly can I use the options above to include MAPE?
Here is a code snippet of my current implementation:
from sagemaker.model_monitor.model_monitoring import ModelQualityMonitor
from sagemaker.model_monitor import EndpointInput
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Create the model quality monitoring object
MQM = ModelQualityMonitor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    volume_size_in_gb=20,
    max_runtime_in_seconds=1800,
    sagemaker_session=sagemaker_session,
)

# Suggest a baseline
job = MQM.suggest_baseline(
    job_name=baseline_job_name,
    baseline_dataset="./baseline.csv",
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    problem_type="Regression",
    inference_attribute="predicted_price",
    ground_truth_attribute="price",
)
job.wait(logs=False)
baseline_job = MQM.latest_baselining_job

# Create a monitoring schedule
endpointInput = EndpointInput(
    endpoint_name="dev-TestEndpoint",
    destination="/opt/ml/processing/input_data",
    inference_attribute="$.data.predicted_price",
)
MQM.create_monitoring_schedule(
    monitor_schedule_name="DS-Schedule",
    endpoint_input=endpointInput,
    output_s3_uri=baseline_results_uri,
    constraints=baseline_job.suggested_constraints(),
    problem_type="Regression",
    ground_truth_input=ground_truth_upload_path,
    schedule_cron_expression="cron(0 * ? * * *)",  # hourly
    enable_cloudwatch_metrics=True,
)
Comments (1)
Amazon SageMaker Model Monitor only supports the metrics that are defined here out of the box.
If you need to include another metric, such as MAPE (Mean Absolute Percentage Error) in your case, you will have to rely on the BYOC (Bring Your Own Container) approach. Note that with this approach you cannot "add" a metric to the available list; unfortunately, you will have to implement the entire suite of metrics yourself. I understand this is not ideal for customers, and I'd encourage you to reach out to your AWS account manager to create a request to add MAPE as a supported metric in the long run. I've made a note of it as well and will relay it back to the team.
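To give a rough idea of what "implementing the suite yourself" could look like inside a custom container, here is a minimal sketch in plain Python. It recomputes the usual regression metrics (MAE, MSE, RMSE, R2) and adds MAPE. The function name `regression_metrics` and the returned key names are hypothetical, and skipping rows whose actual value is 0 is just one possible convention, since MAPE is undefined there.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, R2 and MAPE for two equal-length sequences.

    Hypothetical helper for a custom monitoring container; not part of
    the SageMaker SDK.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    # R2 = 1 - SS_res / SS_tot; undefined when all actual values are equal.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")

    # MAPE is undefined where the actual value is 0, so those rows are
    # skipped here (an assumption; pick the convention that fits your data).
    pct = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")

    return {
        "mae": mae,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": r2,
        "mape": mape,
    }
```

In a real container, `actual` and `predicted` would come from the merged ground truth and captured endpoint data that the processing job receives as input.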
In the meantime, you can find examples on how to BYOC here.
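For the CloudWatch part, a custom container is expected to write its metrics to a local file that Model Monitor then publishes. The sketch below shows one way this might be done. The default directory `/opt/ml/output/metrics/cloudwatch`, the file name `cloudwatch_metrics.jsonl`, and the record fields (`MetricName`, `Timestamp`, `Dimensions`, `Value`) are assumptions based on the "CloudWatch Metrics for Bring Your Own Containers" documentation page, and `write_cloudwatch_metrics` is a hypothetical helper, so please verify the exact contract there before relying on it.

```python
import json
import os
from datetime import datetime, timezone


def write_cloudwatch_metrics(
    metrics,
    endpoint_name,
    schedule_name,
    output_dir="/opt/ml/output/metrics/cloudwatch",  # assumed location
):
    """Write one JSON record per metric, one record per line.

    The path, file name and field names are assumptions; check them
    against the BYOC CloudWatch documentation.
    """
    os.makedirs(output_dir, exist_ok=True)
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    path = os.path.join(output_dir, "cloudwatch_metrics.jsonl")  # assumed name

    with open(path, "w") as f:
        for name, value in metrics.items():
            record = {
                "MetricName": name,
                "Timestamp": timestamp,
                "Dimensions": [
                    {"Name": "Endpoint", "Value": endpoint_name},
                    {"Name": "MonitoringSchedule", "Value": schedule_name},
                ],
                "Value": float(value),
            }
            f.write(json.dumps(record) + "\n")
    return path
```

A container entry point could call it as `write_cloudwatch_metrics({"mape": 6.7}, "dev-TestEndpoint", "DS-Schedule")` after computing its metrics.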
I work for AWS but my opinions are my own.
Thanks,
Raghu