Decoding XML with base64_decode works in PHPUnit but returns UTF-16-encoded data in the browser

Posted 2024-09-10 07:40:52

I'm having some strange issues decoding an XML snippet, stored in a cookie, with PHP's base64_decode function:

  1. In our PHPUnit tests, we can decode the XML and echo it out to the console and it prints XML as you would expect (all unit tests pass as well).
  2. As soon as we try running the same code in the browser, the decoded XML appears to contain loads of UTF-16 characters interspersed with fragments of the expected XML tags. For example:

    <CreateSession\u000f\u0013Y...

As you might then expect, we end up with an Exception: String could not be parsed as XML... error when passing this string to the SimpleXMLElement constructor.

Some further info:

  • The XML itself comes from an external login system and we don't have any control over its format; it doesn't come with any <?xml...?> declaration and the root node is this <CreateSession>...</CreateSession> tag.
  • I've checked the character encoding of the page being served and have verified that it is UTF-8.
  • The site being developed is using Drupal.
  • We tried passing the XML / UTF-16 string through Drupal's drupal_convert_to_utf8 function, but this just returns the Chinese (I think) symbols e.g. 敲

Has anyone come across anything like this before, or does anyone have any idea what might be causing it?

Comments (1)

白鸥掠海 2024-09-17 07:40:52

Aha! It turns out that, when run in the browser, the cookie values were automatically URL-decoded by PHP, meaning that every '+' in the base64-encoded text was being replaced by a space. Adding this line of code before calling base64_decode fixed things:

$tmp = str_replace(' ', '+', $value);
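
For anyone who wants to reproduce this outside the browser, here is a minimal sketch of the failure and the fix. It assumes the cookie held plain base64 text; the sample XML and the helper name decodeCookiePayload are made up for illustration, and urldecode() stands in for the decoding PHP applies when it populates $_COOKIE.

```php
<?php
// Hypothetical helper: restore the '+' characters that URL decoding
// turned into spaces, then decode. Returns the decoded string, or
// false if the input is still not valid base64.
function decodeCookiePayload($value)
{
    $restored = str_replace(' ', '+', $value);
    return base64_decode($restored, true);
}

// Made-up payload; its base64 form happens to contain a '+'.
$xml     = '<CreateSession><a>1</a></CreateSession>';
$encoded = base64_encode($xml);

// Simulate what arrives in the browser case: '+' becomes ' '.
$mangled = urldecode($encoded);

// Decoding the mangled value directly does not give back the XML,
// because the space is not a base64 character and the bits after it
// no longer line up.
$broken = base64_decode($mangled);

// With the spaces turned back into '+', the round trip succeeds.
$decoded = decodeCookiePayload($mangled);

var_dump($broken === $xml);
var_dump($decoded === $xml);
```

This also suggests why the PHPUnit run was fine: the tests presumably passed the base64 string in directly, so nothing URL-decoded it on the way. The replacement is safe here because a space is never part of the base64 alphabet.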