Using string parameters with NVIDIA Triton

Posted on 2025-01-26 06:57:35

I'm trying to deploy a simple model on the Triton Inference Server. The model loads fine, but I'm having trouble formatting the input for an inference request.

My model has a config.pbtxt set up like this:

  max_batch_size: 1
  input: [
    {
      name: "examples"
      data_type: TYPE_STRING
      format: FORMAT_NONE
      dims: [ -1 ]
      is_shape_tensor: false
      allow_ragged_batch: false
      optional: false
    }
  ]

I've tried setting up the input data with some fairly straightforward Python code like this (the outputs are not shown, but they are set up correctly):

        bytes_data = [input_data.encode('utf-8')]
        bytes_data = np.array(bytes_data, dtype=np.object_)
        bytes_data = bytes_data.reshape([-1, 1])
        inputs = [
            httpclient.InferInput('examples', bytes_data.shape, "BYTES"),
        ]
        inputs[0].set_data_from_numpy(bytes_data)

But I keep getting the same error message:

tritonclient.utils.InferenceServerException: Could not parse example input, value: '[my text input here]'
         [[{{node ParseExample/ParseExampleV2}}]]

I've tried multiple ways of encoding the input, as raw bytes or even in the format TFX serving used to ask for, like this: { "instances": [{"b64": "CjEKLwoJdXR0ZXJhbmNlEiIKIAoecmVuZGV6LXZvdXMgYXZlYyB1biBjb25zZWlsbGVy"}]}

I'm not sure where the problem comes from. Does anyone know?


单身情人 2025-02-02 06:57:35

I modified the accepted example slightly. It's not necessary to create a tf.train.Example: you can simply encode your text as bytes and create a numpy array directly.

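import numpy as np
import tritonclient.http as tritonhttpclient  # assumed alias for the HTTP client module

# Encode the text as bytes and wrap it in a one-element numpy array
np_input_data = np.asarray([str.encode(input_data)])

inputs = [tritonhttpclient.InferInput('TEXT', [1], "BYTES")]
inputs[0].set_data_from_numpy(np_input_data.reshape([1]), binary_data=False)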

Edit: After studying the Triton code, specifically the implementations of http.InferInput and triton_python_backend_utils.py, I realised this can be simplified further by using dtype=object, i.e.

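import tritonclient.http

# With dtype=object the text is passed as a plain str, without manual encoding
np_input_data = np.asarray([input_data], dtype=object)

text = tritonclient.http.InferInput('text', [1], "BYTES")
text.set_data_from_numpy(np_input_data.reshape([1]))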
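
For anyone curious what happens to a BYTES input when it is sent as binary data (the default when binary_data is not set to False): as far as I understand Triton's binary tensor format, each element is written as a 4-byte little-endian length followed by the raw bytes. Below is a minimal standard-library sketch of that layout. The helper names serialize_bytes_tensor and deserialize_bytes_tensor are made up for illustration and are not part of tritonclient.

```python
import struct


def serialize_bytes_tensor(elements):
    """Concatenate elements, each prefixed with its length as a 4-byte little-endian integer."""
    return b"".join(struct.pack("<I", len(e)) + e for e in elements)


def deserialize_bytes_tensor(buf):
    """Inverse of serialize_bytes_tensor: split a buffer back into its elements."""
    elements = []
    offset = 0
    while offset < len(buf):
        # Read the 4-byte length prefix, then slice out that many bytes
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        elements.append(buf[offset:offset + length])
        offset += length
    return elements


wire = serialize_bytes_tensor(["my text input here".encode("utf-8")])
round_trip = deserialize_bytes_tensor(wire)
```

The point is only that the server receives the exact bytes of each element. Whether those bytes mean anything to the model is up to the model itself.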


只为一人 2025-02-02 06:57:35

If anyone runs into the same problem, this solved it: I had to create a tf.train.Example() and set the data correctly.

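import numpy as np
import tensorflow as tf
import tritonclient.http as httpclient

# Wrap the UTF-8 text in a tf.train.Example under the 'utterance' feature
example = tf.train.Example()
example_bytes = str.encode(input_data)
example.features.feature['utterance'].bytes_list.value.extend([example_bytes])
inputs = [
    httpclient.InferInput('examples', [1], "BYTES"),
]
# Send the serialized Example as the single element of the BYTES input
inputs[0].set_data_from_numpy(np.asarray(example.SerializeToString()).reshape([1]), binary_data=False)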
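
To see what example.SerializeToString() actually puts on the wire, here is a rough standard-library sketch that hand-encodes a tf.train.Example with a single bytes feature using the protobuf wire format. It is only an illustration: the helper names (varint, length_delimited, encode_example) are hypothetical, and in real code you would use tf.train.Example as above.

```python
import base64


def varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode a length-delimited field (wire type 2): tag, length, payload."""
    tag = (field_number << 3) | 2
    return varint(tag) + varint(len(payload)) + payload


def encode_example(feature_name: str, value: bytes) -> bytes:
    """Serialize an Example holding one bytes feature."""
    bytes_list = length_delimited(1, value)        # BytesList.value
    feature = length_delimited(1, bytes_list)      # Feature.bytes_list
    entry = (length_delimited(1, feature_name.encode("utf-8"))  # map key
             + length_delimited(2, feature))                    # map value
    features = length_delimited(1, entry)          # Features.feature
    return length_delimited(1, features)           # Example.features


payload = encode_example("utterance", "rendez-vous avec un conseiller".encode("utf-8"))
encoded = base64.b64encode(payload).decode("ascii")
```

Here encoded comes out identical to the b64 string quoted in the question, so that string was already a serialized Example with an 'utterance' feature. This suggests the ParseExample node in the model wants these serialized bytes as the element of the BYTES input, not the plain text.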
