C#: Converting a byte[] to a UTF-8 encoded string

Posted on 2024-09-12 15:41:11

I am using a library called EXIFextractor to extract metadata from images. The library relies on System.Drawing.Imaging.PropertyItem to do all the hard work. According to the Microsoft documentation, some of the data in a PropertyItem, such as image details, is returned as an ASCII string stored in a byte[].

My problem is that international characters (å, ä, ö, and so on) are dropped and replaced by question marks. When I debug the code, it is apparent that the byte[] holds UTF-8 encoded text.

I'd like to parse the byte[] as a UTF-8 string. How can I do this without losing any information in the process?

Thanks in advance!

Update:

I have been asked to provide a snippet from my code.

The first snippet is from the class I use, EXIFextractor.cs, written by Asim Goheer:

foreach (System.Drawing.Imaging.PropertyItem p in parr)
{
    string v = "";

    // ...

    else if (p.Type == 0x2)
    {
        // string
        v = ascii.GetString(p.Value);
    }

And this is my code, where I try my best to handle the result of the above:

try {
    EXIFextractor exif = new EXIFextractor(ref bmp, "");
    object o;
    if ((o = exif["Image Description"]) != null)
        MediaFile.Description = Tools.UTF8Encode(o.ToString());

I have also tried a couple of other ways of recovering my precious å, ä, ö from the data, but nothing seems to do the trick. I am starting to think Hans Passant is right in the conclusions he draws in his answer below.
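
To illustrate why re-encoding the library's output cannot bring the characters back, here is a minimal sketch. It assumes the raw bytes really are UTF-8, as the debugger suggested; the class and method names are hypothetical and are not part of EXIFextractor.

```csharp
using System;
using System.Text;

public static class AsciiLossDemo
{
    // Decodes the bytes as ASCII, the way the library's type 0x2 branch does.
    public static string DecodeAsAscii(byte[] raw)
    {
        return Encoding.ASCII.GetString(raw);
    }

    public static void Main()
    {
        // UTF-8 bytes for "åäö": each character takes two bytes.
        byte[] raw = { 0xC3, 0xA5, 0xC3, 0xA4, 0xC3, 0xB6 };

        // Every byte above 0x7F is replaced with '?' during decoding.
        string lossy = DecodeAsAscii(raw);
        Console.WriteLine(lossy); // prints "??????"

        // Re-encoding the lossy string only yields the bytes of '?' (0x3F).
        byte[] reencoded = Encoding.UTF8.GetBytes(lossy);
        Console.WriteLine(BitConverter.ToString(reencoded)); // prints "3F-3F-3F-3F-3F-3F"
    }
}
```

Once the string has passed through the ASCII decoder, the original bytes are gone, so any fix has to be applied to the byte[] itself rather than to the resulting string.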


戏舞 2024-09-19 15:41:11

string yourText = System.Text.Encoding.UTF8.GetString(yourByteArray);
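
As a slightly fuller sketch of the one-liner above, the helper below decodes a value as UTF-8 and trims the trailing NUL that EXIF string values carry. The class name, method name, and sample bytes are assumptions made for illustration; whether a given image actually stores UTF-8 depends on the software that wrote it.

```csharp
using System;
using System.Text;

public static class ExifText
{
    // Hypothetical helper: decodes an EXIF string value as UTF-8
    // and drops any trailing NUL terminators.
    public static string DecodeUtf8(byte[] raw)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));
        return Encoding.UTF8.GetString(raw).TrimEnd('\0');
    }

    public static void Main()
    {
        // "Malmö" in UTF-8, followed by a NUL terminator.
        byte[] raw = { 0x4D, 0x61, 0x6C, 0x6D, 0xC3, 0xB6, 0x00 };
        Console.WriteLine(DecodeUtf8(raw));
    }
}
```

In the question's setup this would have to be applied to p.Value in place of ascii.GetString(p.Value), since the string returned by the library has already lost the non-ASCII bytes.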


陌上芳菲 2024-09-19 15:41:11

Use the GetString method of the Encoding.UTF8 object (http://msdn.microsoft.com/en-us/library/system.text.encoding.utf8.aspx).


流年已逝 2024-09-19 15:41:11

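Yes, this is a problem with the application or camera that produced the image. The EXIF standard has horrible support for text: it has to be encoded in ASCII. That only works out well when the photographer speaks English. No doubt the software that encoded the image ignored this requirement. The PropertyItem class does the same thing: it encodes a string to byte[] with Marshal.StringToHGlobalAnsi(), which uses the system's default code page.

There is no obvious fix for this. You will get mojibake whenever the photo was taken too far away from your machine, that is, on a system with a different default code page.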
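
Since the encoding of the bytes is not recorded in the image, any decoder can only guess. One possible heuristic, sketched below under stated assumptions, is to try a strict UTF-8 decode first and fall back to a legacy encoding when the bytes are not valid UTF-8. The class name, the choice of ISO-8859-1 as the fallback, and the sample bytes are all assumptions; the right fallback depends on the code page of the machine that wrote the file.

```csharp
using System;
using System.Text;

public static class ExifTextFallback
{
    // Strict decoder: throws on invalid UTF-8 instead of substituting U+FFFD.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Hypothetical helper: tries UTF-8 first, then the caller-supplied fallback encoding.
    public static string Decode(byte[] raw, Encoding fallback)
    {
        try
        {
            return StrictUtf8.GetString(raw).TrimEnd('\0');
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8, so assume the legacy code page instead.
            return fallback.GetString(raw).TrimEnd('\0');
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] utf8Bytes = { 0xC3, 0xA5 };  // "å" encoded as UTF-8
        byte[] latin1Bytes = { 0xE5 };      // "å" encoded as ISO-8859-1

        // Both inputs decode to the same one-character string.
        Console.WriteLine(Decode(utf8Bytes, latin1) == Decode(latin1Bytes, latin1)); // prints "True"
    }
}
```

This is only a heuristic: a legacy-encoded byte sequence that happens to be valid UTF-8 will be decoded as UTF-8, so it reduces the mojibake described above rather than eliminating it.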
