Why does the Outlook programming interface always report the wrong attachment size?
While using Outlook Interop in C#, I noticed something curious.
- First, I get the size of an attachment with the Attachment.Size property.
- Second, I save the attachment to a file using the Attachment.SaveAsFile method.
Comparing the actual size of the saved file with the size reported by Outlook, I notice that the saved file is always smaller than Attachment.Size suggests. The saved files appear to be valid and not truncated.
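The comparison described above can be sketched as a small helper. This is only an illustration in Python, not the C# Interop code from the question: `size_overhead`, `demo`, and the sample numbers are hypothetical, and `reported_size` stands in for whatever Attachment.Size returned for the attachment that was written with Attachment.SaveAsFile.

```python
import os
import tempfile


def size_overhead(reported_size: int, saved_path: str) -> int:
    """Return by how many bytes reported_size exceeds the saved file's size.

    reported_size is assumed to be the value read from Attachment.Size;
    saved_path is assumed to be the file written by Attachment.SaveAsFile.
    """
    return reported_size - os.path.getsize(saved_path)


def demo() -> int:
    """Run the comparison on a stand-in file (no Outlook involved)."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "attachment.bin")
        with open(path, "wb") as handle:
            handle.write(bytes(1000))  # stand-in for a saved 1000-byte attachment
        # 1181 is a made-up reported size, chosen only for illustration.
        return size_overhead(1181, path)


if __name__ == "__main__":
    print("overhead in bytes:", demo())
```

A positive result means the reported size is larger than the file on disk, which is the situation described here.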
Sample results http://www.freeimagehosting.net/uploads/224d342eba.png
So, what's wrong? Is there a bug in Attachment.Size? Or is it expected to return something other than the size of the attachment?
I first thought that Outlook converts CR to CRLF, even in binary files, which could explain the overhead. But some of the attached files are plain text that already uses CRLF, so this hypothesis is wrong.
First edit:
It is not Base64 encoding, because Base64 overhead would be:
- A 4/3 ratio. In my case, the ratio is close to 1.0.
- Proportional to the file size. That is not the case here: a 1.9 MB file has an overhead of 181 bytes, whereas a 27 KB file has an overhead of 3 KB.
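The two points above can be checked with a little arithmetic. The following is a minimal sketch, assuming unwrapped, padded Base64 and treating "1.9 MB", "27 KB", and "3 KB" as round decimal figures (1,900,000, 27,000, and 3,000 bytes); the exact sizes are not given in the question, and the function names are hypothetical.

```python
import base64


def base64_encoded_size(raw_size: int) -> int:
    """Length of the unwrapped, padded Base64 encoding of raw_size bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_size + 2) // 3)


def base64_overhead(raw_size: int) -> int:
    """Extra bytes that Base64 encoding would add to raw_size bytes."""
    return base64_encoded_size(raw_size) - raw_size


# Sanity check of the formula against the standard library encoder.
assert base64_encoded_size(1000) == len(base64.b64encode(bytes(1000)))

# (label, assumed raw size in bytes, approximate overhead quoted in the question)
SAMPLES = [
    ("1.9 MB file", 1_900_000, 181),
    ("27 KB file", 27_000, 3_000),
]

if __name__ == "__main__":
    for label, raw_size, observed in SAMPLES:
        expected = base64_overhead(raw_size)
        print(f"{label}: Base64 would add about {expected} bytes, "
              f"observed about {observed}")
```

Under these assumptions, Base64 alone would add roughly a third of the file size in both cases, which is far more than either observed overhead and, unlike the observed values, grows with the file size.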
Now, looking at the nearly random overhead, which ranges from 89 to 3658 bytes, I agree that it might be some strange headers.
Second edit:
I tested this on a larger set of files. What I notice is that the difference between the real file size and the size given by Outlook:
- Is always zero for .msg attachments. But .msg attachments are a very special case and behave very strangely.
- Is influenced by both the file extension and the length of the file name.
- For the same file extension, is usually, but not always, larger when the file name is longer.
Here is an example:
http://www.freeimagehosting.net/uploads/a767d3cacf.png
IMHO, Outlook does something with the name of the file, some sort of very strange encoding, maybe generating a unique identifier based on the file name. This means that:
- when the file is bigger, the unique identifier is bigger too.
- when collision happens, something happens to the unique identifier, making it much, much bigger: row 18 has the same file name as row 11, but the file is not the same; on the other hand, rows 12, 13 and 14 have the same file.
Comments (1)
I'm not sure, but I'd assume that it might be MIME headers and/or encoding overhead. For more information, look at this Wiki article about Base64 and search for the word overhead.
Edit: Sorry, I wasn't very clear. I meant the Base64 article only as an example that there might be encoding-related overhead, not that it was actually Base64, since, as others mentioned, Base64 overhead would probably be much larger than those differences.