Missing -symbol.json error when trying to compile a SageMaker semantic segmentation model (built-in algorithm) with SageMaker Neo

Asked 2025-01-16 16:47:00


I have trained a SageMaker semantic segmentation model using the built-in SageMaker semantic segmentation algorithm. This deploys fine to a SageMaker endpoint and I can successfully run inference in the cloud from it.
I would like to use the model on an edge device (an AWS Panorama Appliance), which should just mean compiling the model with SageMaker Neo to the specifications of the target device.

However, regardless of what my target device is (the Neo settings), I can't seem to compile the model with Neo, as I get the following error:

ClientError: InputConfiguration: No valid Mxnet model file -symbol.json found

The model.tar.gz for semantic segmentation models contains hyperparams.json, model_algo-1, and model_best.params. According to the docs, model_algo-1 is the serialized MXNet model. Aren't Gluon models supported by Neo?

Incidentally, I encountered the exact same problem with another SageMaker built-in algorithm, k-Nearest Neighbours (k-NN). Its model artifact, too, seems to be produced without a -symbol.json.

Are there any scripts I can run to recreate the -symbol.json file or convert the trained SageMaker model?

After building my model with an Estimator, I then compile it with SageMaker Neo using this code:

optimized_ic = my_estimator.compile_model(
    target_instance_family="ml_c5",
    target_platform_os="LINUX",
    target_platform_arch="ARM64",
    input_shape={"data": [1, 3, 512, 512]},
    output_path=s3_optimized_output_location,
    framework="mxnet",
    framework_version="1.8",
)

I would expect this to compile ok, but that is where I get the error saying the model is missing the *-symbol.json file.
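
(For completeness, the archive contents can be confirmed without extracting anything; a minimal sketch, assuming the training job's model.tar.gz has been downloaded locally as ss_model.tar.gz:)

import tarfile

# list the archive contents; for this model it shows hyperparams.json,
# model_algo-1 and model_best.params, but no *-symbol.json
with tarfile.open('ss_model.tar.gz', 'r:gz') as t:
    print(t.getnames())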


1 Answer

落日海湾 2025-01-23 16:47:00


For some reason, AWS has decided not to make its built-in algorithms directly compatible with Neo... However, you can re-engineer the network parameters using the model.tar.gz output file and then compile.

Step 1: Extract model from tar file

import tarfile
#path to local tar file
model = 'ss_model.tar.gz'

#extract tar file 
t = tarfile.open(model, 'r:gz')
t.extractall()

This should output two files:
model_algo-1, model_best.params

Step 2: Load the weights into a network from the Gluon model zoo, for the architecture that you chose

In this case I used DeepLabv3 with resnet50

import gluoncv
import mxnet as mx
from gluoncv import model_zoo
from gluoncv.data.transforms.presets.segmentation import test_transform

model = model_zoo.DeepLabV3(nclass=2, backbone='resnet50', pretrained_base=False, height=800, width=1280, crop_size=240)
model.load_parameters("model_algo-1")

Step 3: Check that the parameters have loaded correctly by making a prediction with the new model

Use an image that was used for training.

# use cpu
ctx = mx.cpu(0)

# read a training image from disk (the path is only an example) and decode the bytes
with open('train_image.jpg', 'rb') as f:
    imbytes = f.read()
img = mx.image.imdecode(imbytes)

# transform image for the segmentation network
img = test_transform(img, ctx)
img = img.astype('float32')
print('transformed image shape: ', img.shape)

# get prediction
output = model.predict(img)
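
The raw prediction has shape (1, nclass, H, W). To eyeball whether the loaded weights behave sensibly, here is a minimal sketch (assuming output and mx from the block above) that collapses it to a per-pixel class mask:

import numpy as np

# collapse the class scores to a per-pixel label map and inspect it
mask = mx.nd.squeeze(mx.nd.argmax(output, axis=1)).asnumpy()
print('mask shape:', mask.shape)            # expect (H, W)
print('classes present:', np.unique(mask))  # labels should fall within range(nclass)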

Step 4: Hybridise the model into the output required by SageMaker Neo

Additional check for image shape compatibility

from gluoncv.utils import export_block

model.hybridize()
model(mx.nd.ones((1, 3, 800, 1280)))
# writes deeplabv3-res50-symbol.json and deeplabv3-res50-0000.params
export_block('deeplabv3-res50', model, data_shape=(3, 800, 1280), preprocess=None, layout='CHW')

Step 5: Repackage the model into tar.gz format

This contains the params and json file which Neo looks for.

tar = tarfile.open("comp_model.tar.gz", "w:gz")
for name in ["deeplabv3-res50-0000.params", "deeplabv3-res50-symbol.json"]:
    tar.add(name)
tar.close()

Step 6: Upload the tar.gz file to S3 and then compile using the Neo GUI
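
As an alternative to the console, the same upload-and-compile step can be scripted; a minimal sketch using boto3, where the bucket name, key prefix, and role ARN are placeholders and the input shape matches the export above:

import boto3

# upload the repackaged model (bucket and key are placeholders)
s3 = boto3.client('s3')
s3.upload_file('comp_model.tar.gz', 'my-bucket', 'neo/comp_model.tar.gz')

# start a Neo compilation job against the uploaded artifact
sm = boto3.client('sagemaker')
sm.create_compilation_job(
    CompilationJobName='deeplabv3-res50-neo',
    RoleArn='arn:aws:iam::111122223333:role/MySageMakerRole',  # placeholder role
    InputConfig={
        'S3Uri': 's3://my-bucket/neo/comp_model.tar.gz',
        'DataInputConfig': '{"data": [1, 3, 800, 1280]}',
        'Framework': 'MXNET',
    },
    OutputConfig={
        'S3OutputLocation': 's3://my-bucket/neo/output/',
        'TargetDevice': 'ml_c5',
    },
    StoppingCondition={'MaxRuntimeInSeconds': 900},
)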