Passing a base64 .docx to docx.Document results in a BadZipFile exception
I'm writing an Azure function in Python 3.9 that needs to accept a base64 string created from a known .docx file, which will serve as a template. My code decodes the base64, passes the result to a BytesIO instance, and passes that to docx.Document(). However, I'm receiving the exception BadZipFile: File is not a zip file.
Below is a slimmed-down version of my code. It fails on document = Document(bytesIODoc). I'm beginning to think it's an encoding/decoding issue, but I don't know nearly enough about that to get to the solution.
from docx import Document
from io import BytesIO
import base64

var = {
    'template': 'Some_base64_from_docx_file',
    'data': {'some': 'data'}
}

run_stuff = ParseBody(body=var)
output = run_stuff.run()

class ParseBody():
    def __init__(self, body):
        self.template = str(body['template'])
        self.contents = body['data']

    def _decode_template(self):
        b64Doc = base64.b64decode(self.template)
        bytesIODoc = BytesIO(b64Doc)
        document = Document(bytesIODoc)

    def run(self):
        self.document = self._decode_template()
I've also tried the following change to _decode_template and am getting the same exception. This runs base64.decodebytes() on the b64Doc object and passes the result to BytesIO instead of passing b64Doc directly.
def _decode_template(self):
    b64Doc = base64.b64decode(self.template)
    bytesDoc = base64.decodebytes(b64Doc)
    bytesIODoc = BytesIO(bytesDoc)
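A side note on this variant: it decodes the data twice, and a second pass over bytes that are already decoded cannot return the original file. Below is a minimal sketch of that point, not a diagnosis of the actual failure. It assumes a tiny in-memory ZIP archive as a stand-in for the real .docx so that only the standard library is needed; make_stand_in_zip and decode_twice are hypothetical helper names.

```python
import base64
import binascii
import io
import zipfile


def make_stand_in_zip() -> bytes:
    """Build a tiny in-memory ZIP archive as a stand-in for a .docx file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


def decode_twice(b64_text: str):
    """Mimic the variant above: b64decode first, then decodebytes on the result.

    Returns the twice-decoded bytes, or None if the second pass raises.
    """
    once = base64.b64decode(b64_text)
    try:
        return base64.decodebytes(once)
    except binascii.Error:
        return None


original = make_stand_in_zip()
b64_text = base64.b64encode(original).decode("ascii")

decoded_once = base64.b64decode(b64_text)
decoded_twice = decode_twice(b64_text)

# A single decode restores the archive byte for byte.
print(decoded_once == original)
print(zipfile.is_zipfile(io.BytesIO(decoded_once)))
# The second pass either raises or yields shorter bytes, never the archive.
print(decoded_twice == original)
```

Base64 decoding always produces fewer bytes than it consumes, so whatever the second pass returns cannot equal the original archive.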
To be sure that this is possible, I have successfully tried the following on the exact same .docx file. I can open the document in Python, base64 encode it, decode it into bytes, pass that to a BytesIO instance, and pass that to docx.Document successfully.
file = r'WordTemplate.docx'
doc = open(file, 'rb').read()
b64Doc = base64.b64encode(doc)
bytesDoc = base64.decodebytes(b64Doc)
bytesIODoc = BytesIO(bytesDoc)
newDoc = Document(bytesIODoc)
I've tried countless other solutions to no avail; they have only led me further away from a resolution. This is the closest I've gotten. Any help is greatly appreciated!
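Since a .docx file is a ZIP container, one way to narrow down a BadZipFile error is to inspect the decoded bytes before handing them to Document. The sketch below is only a diagnostic aid under stated assumptions, not the fix for this question: decode_docx_payload is a hypothetical helper, and the data-URI prefix handling is a guess about what a client might send. It uses only the standard library.

```python
import base64
import binascii
import io
import zipfile


def decode_docx_payload(b64_text: str) -> io.BytesIO:
    """Decode base64 text and verify that the result looks like a ZIP archive.

    Hypothetical helper: the name and the checks are illustrative only.
    """
    text = b64_text.strip()
    # Assumption: some clients send a data URI; drop everything up to the comma.
    if text.startswith("data:") and "," in text:
        text = text.split(",", 1)[1]
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(text, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"template is not clean base64: {exc}") from exc
    stream = io.BytesIO(raw)
    if not zipfile.is_zipfile(stream):
        # ZIP data normally starts with b"PK", so show the leading bytes.
        raise ValueError(
            f"decoded bytes are not a ZIP archive; first bytes: {raw[:4]!r}"
        )
    stream.seek(0)  # rewind so the caller reads from the start
    return stream
```

If the helper returns, the stream could be passed to Document() as in the code above. If it raises, the message indicates whether the input was not clean base64 (for example, the text form of a bytes object such as "b'...'") or decoded to something other than a ZIP archive.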
Comments (1)
The answer to the question linked below actually helped me resolve my own issue. How to generate a DOCX in Python and save it in memory?
All I had to do was change
document = Document(bytesIODoc)
to the following: