Characters are corrupted when reading ANSI-encoded text files from Blob storage in a Logic App

Posted on 2025-01-09 03:59:00

I have an issue with text files that are saved to blob storage and then iterated through using a logic app. The files come from a very old (but unfortunately very important) system. Access to it is highly restricted, so I have zero control over how the files are created.

They are uploaded to our blob storage without any suffix and with ANSI encoding. If I view their file contents through the Azure portal, I can see that non-ASCII characters (in this case åäö) are corrupted:
[corrupted chars][1]

I assume this is because Azure assumes UTF-8 encoding? The problem occurs when I use a logic app to iterate through the blobs, get the file contents, and place them in a string variable. As far as I understand, Azure Logic Apps should automatically convert the encoding to UTF-8, but this doesn't seem to happen, since the characters åäö in the string variable are still garbled.
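To illustrate why the characters would end up garbled, here is a minimal sketch in Python. It assumes that "ANSI" here means Windows-1252 (`cp1252`), which the post itself does not confirm. Bytes written in that encoding are not valid UTF-8 sequences, so a reader that assumes UTF-8 can only substitute replacement characters.

```python
# Assumption: the legacy system writes Windows-1252 ("ANSI") bytes.
original = "åäö"
ansi_bytes = original.encode("cp1252")  # one byte per character

# A consumer that assumes UTF-8 cannot decode these bytes;
# errors="replace" substitutes U+FFFD for each invalid sequence.
garbled = ansi_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the file was written in recovers the text.
recovered = ansi_bytes.decode("cp1252")

print(repr(garbled))
print(recovered)
```

Once the bytes have been read as UTF-8 with replacement, the original characters are lost, so the decoding has to happen on the raw bytes.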

The data from the string variable is used as input data for an Azure function that needs to be able to see the åäö characters properly. Manually converting the files to UTF-8 before uploading them solves the issue, but this is not practical, since this data flow is supposed to be automated.
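The manual conversion could in principle be automated wherever the raw bytes are still available, for example inside the Azure function. The following is only a sketch: the helper name `transcode_to_utf8` is hypothetical, and the default source encoding `cp1252` is an assumption about what "ANSI" means for these files.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode legacy-encoded bytes and re-encode them as UTF-8.

    source_encoding is an assumption; adjust it to the code page
    the legacy system actually uses.
    """
    text = raw.decode(source_encoding)  # strict: raises on unmapped bytes
    return text.encode("utf-8")


# Usage: the three cp1252 bytes for "åäö" become six UTF-8 bytes.
utf8_bytes = transcode_to_utf8(b"\xe5\xe4\xf6")
print(utf8_bytes.decode("utf-8"))
```

Strict decoding is used deliberately: Python's `cp1252` codec raises `UnicodeDecodeError` for the few byte values that code page leaves undefined, which surfaces a wrong encoding guess instead of silently corrupting data.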

The file contents are extracted like so:
[file contents to variable][2]

The Infer content setting makes no difference, and neither does renaming the files with a proper .txt suffix.
[1]: https://i.sstatic.net/5B2ru.png
[2]: https://i.sstatic.net/zb6yj.png

Comments (1)

柠北森屋 2025-01-16 03:59:00

I had the same problem with my Windows-1252 encoded files. I added the parameter "Infer content type" to "Get blob content (V2)" and set it to No, which solved it for me. I got my æøå in the CSV file and not some horrible UTF-8 encoding.
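If the untouched bytes are then forwarded to the Azure function, the decoding can be done there. The sketch below assumes the Logic App passes the blob as its usual binary wrapper, a JSON object with a base64 string under `$content`; that payload shape, the helper name `decode_blob_payload`, and the `cp1252` default are assumptions, not something confirmed in this thread.

```python
import base64


def decode_blob_payload(payload: dict, encoding: str = "cp1252") -> str:
    """Turn a base64-wrapped blob payload into text.

    Assumes payload["$content"] holds the base64-encoded raw bytes
    and that the bytes are Windows-1252 encoded.
    """
    raw = base64.b64decode(payload["$content"])
    return raw.decode(encoding)


# Usage with a hand-built payload standing in for the Logic App output.
sample = {
    "$content-type": "application/octet-stream",
    "$content": base64.b64encode("æøå".encode("cp1252")).decode("ascii"),
}
print(decode_blob_payload(sample))
```

Keeping the content as base64 until it reaches code that knows the real encoding avoids any intermediate step guessing UTF-8.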
