Google Document AI -> ValueError: Protocol message Document has no "file" field

Posted 2025-01-18 05:00:36 · 506 characters · 6 views · 0 comments

In my script, I have the following:

response = requests.get(list_url[0], allow_redirects=True)
s = io.BytesIO()
s.write(response.content)
s.seek(0)
mimetype="application/octet-stream"
document = {'file': s.read(), 'mime': mimetype}
request = {"name": name, "document": document}

However, when I send a request to the server:

result = client.process_document(request=request)

I get ValueError: Protocol message Document has no "file" field.

Is this because Google Document AI doesn't accept application/octet-stream?

何处潇湘 2025-01-25 05:00:36

I checked the latest code of the Document AI Python client, DocumentProcessorServiceClient, and found that process_document takes a ProcessRequest object in its request argument. You can check the details of that function on the process_document GitHub code page.

A ProcessRequest accepts either an inline_document or a raw_document (the two are mutually exclusive). Based on your code, it looks like you mean to pass a raw_document, which only accepts the fields content and mime_type. Use those instead of file and mime.

If you check the sample for the Document AI Python client library, you will find these lines, which show how it should be implemented:

...
    document = {"content": image_content, "mime_type": "application/pdf"}

    # Configure the process request
    request = {"name": name, "raw_document": document}

    result = client.process_document(request=request)
...

For additional details, you can check the official GitHub project for Document AI and the official Google page for the Python client library.
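
To connect the sample back to the code in the question, here is a minimal sketch of how the request dict could be built with the field names the sample uses (raw_document, content, mime_type). The helper name build_process_request and the fallback to application/pdf are my own assumptions for illustration, not part of the client library. Guessing the MIME type from the URL's file extension is just one possible way to avoid sending application/octet-stream.

```python
import mimetypes
from urllib.parse import urlparse


def build_process_request(name, content, source_url, default_mime="application/pdf"):
    """Build a request dict shaped like the one in the sample above.

    Hypothetical helper: the function name and default_mime are assumptions.
    """
    # Guess the MIME type from the file extension in the URL path;
    # fall back to default_mime when the extension is missing or unknown.
    path = urlparse(source_url).path
    guessed, _encoding = mimetypes.guess_type(path)
    mime_type = guessed or default_mime

    # Field names follow the sample: content / mime_type inside raw_document.
    raw_document = {"content": content, "mime_type": mime_type}
    return {"name": name, "raw_document": raw_document}


# Usage with the variables from the question (needs network access and credentials):
#   response = requests.get(list_url[0], allow_redirects=True)
#   request = build_process_request(name, response.content, list_url[0])
#   result = client.process_document(request=request)
```

Note that response.content is already a bytes object, so the io.BytesIO round trip in the question is not needed.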