How fast can CRCs be generated?
I need to generate etags for image files on the web. One of the possible solutions I thought of would be to calculate CRCs for the image files, and then use those as the etag.
This would require a CRC to be calculated every time someone requests an image from the server, so it's very important that it can be done fast.
So, how fast are algorithms to generate CRCs? Or is this a stupid idea?
Comments (4)
Instead, use a more robust hashing algorithm such as SHA-1.
Speed depends on the size of the image. Most of the time will be spent loading data from disk rather than on CPU processing. You can cache the generated hashes.
But I would also suggest basing the etag on the file's last update date, which is much quicker and does not require loading the whole file.
Remember, an etag only has to be unique for a particular resource, so it is fine if two different images have the same last update time.
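A minimal sketch of a timestamp-based etag, assuming the images live on a local filesystem. The function name `mtime_etag` and the decision to combine the modification time with the file size are illustrative choices, not part of the answer.

```python
import os


def mtime_etag(path):
    """Build a quoted ETag from file metadata only; the file is never read."""
    st = os.stat(path)
    # Nanosecond modification time plus size, both in hex.
    return '"%x-%x"' % (st.st_mtime_ns, st.st_size)
```

Because this only calls `os.stat`, its cost does not depend on how large the image is.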
Most implementations, including Microsoft's own, use the last-modified date or other file headers as the ETag, and I suggest you use that method.
Depends on the method used, and the length. Generally pretty fast, but why not cache them?
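To get a feel for "pretty fast" on your own hardware, a small timing sketch like the one below may help. It compares `zlib.crc32` and `hashlib.sha1` on an in-memory buffer; the function name `time_checksums`, the 8 MiB default size, and the zero-filled buffer standing in for image data are assumptions. It measures CPU cost only, not disk reads.

```python
import hashlib
import time
import zlib


def time_checksums(size_bytes=8 * 1024 * 1024, repeats=5):
    """Return the best-of-N time in seconds for CRC-32 and SHA-1 over a
    buffer of size_bytes."""
    data = bytes(size_bytes)  # zero-filled stand-in for image data
    candidates = (
        ("crc32", zlib.crc32),
        ("sha1", lambda buf: hashlib.sha1(buf).digest()),
    )
    results = {}
    for name, fn in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(data)
            best = min(best, time.perf_counter() - start)
        results[name] = best
    return results


if __name__ == "__main__":
    for name, seconds in time_checksums().items():
        print(f"{name}: {seconds * 1000:.2f} ms for 8 MiB")
```

The numbers will vary by machine and library build, so treat them as a local measurement rather than a general figure.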
If the files won't change more often than the resolution of the system used to store them (that is, the file modification time for a filesystem, or a SQL Server datetime if they are stored in a database), then why not just use the modification date at that resolution?
I know RFC 2616 advises against the use of timestamps, but this is only because HTTP timestamps have one-second resolution and changes can be more frequent than that. However:
With this approach you are guaranteed a unique e-tag (collisions are unlikely with a large CRC but certainly possible), which is what you want.
Of course, if you don't ever change the image at a given URI, it's even easier as you can just use a fixed string (I prefer string "immutable").
I would suggest calculating the hash once, when the image is added to the database, and then just returning it with a SELECT along with the image itself.
If you are using SQL Server and the images are not very large (max 8000 bytes, the HASHBYTES() input limit in SQL Server 2014 and earlier), you can use the HASHBYTES() function, which can generate SHA-1, MD5, ...
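The "hash once at insert time" idea could look roughly like this. The sketch uses Python's built-in `sqlite3` as a stand-in for SQL Server and computes SHA-1 in application code rather than with HASHBYTES(); the table name `images`, its columns, and the helper names are hypothetical.

```python
import hashlib
import sqlite3


def open_store():
    """Create an in-memory database with a hypothetical images table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        "name TEXT PRIMARY KEY, data BLOB NOT NULL, etag TEXT NOT NULL)"
    )
    return conn


def add_image(conn, name, data):
    """Hash the image once, at insert time, and store the hash with it."""
    etag = hashlib.sha1(data).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO images (name, data, etag) VALUES (?, ?, ?)",
        (name, data, etag),
    )
    conn.commit()
    return etag


def get_image(conn, name):
    """Return (data, etag) for an image, or None if it does not exist."""
    return conn.execute(
        "SELECT data, etag FROM images WHERE name = ?", (name,)
    ).fetchone()
```

Serving a request then needs only the SELECT; no hashing happens on the read path.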