My company has recently started having problems with the image handling for our websites.
We have several websites (adult entertainment) that display images such as DVD covers, snapshots and similar. We have about 100,000 movies, and for each movie we have an average of 30 snapshots plus covers. Almost every image has an additional version with blurring and an overlay for non-members, which results in about 50 images per movie, or a total of 5 million base images. Each image is available in several versions depending on where it's placed on the page (thumbnail, original, small preview, not-so-small preview, small image in the top list, etc.), which results in more images than I cared to count.
Now I had the idea to use a server to generate the images on the fly, since it has become quite clumsy to generate all the different images for all the different pages (different pages sometimes even need different image sizes for basically the same task).
Does anyone know of an image processing server that can scale down images on the fly, so that we only need to provide the original images and the web guys can just request whatever size they need?
Requirements:
- Very high performance (several thousand users per day)
- On-the-fly blurring and overlay creation
- On-the-fly resize (with and without keeping aspect ratio)
- Can handle millions of images
- Must be able to read JPG, GIF, PNG and BMP and convert between them
Security is not much of a concern, since the unblurred images can already be reached by URL manipulation anyway. More security would be nice, but it's not required, and frankly I stopped caring after failing to get it into my coworkers' heads why (for our small reseller page) it's a bad idea to use http://example.com/view_image.php?filename=/data/images/01020304.jpg to display the images.
We tried PHP scripts to do this but the performance was too slow for this many users.
Thanks in advance for any suggestions you have.
Comments (8)
I suggest you set up a dedicated web server to handle image resize and serve the final result. I have done something similar, although on a much smaller scale. It basically eliminates the process of checking for the cache.
It works like this:
http://imageserver/someimage.150x120.jpg
EDIT: I don't think PHP itself would slow the process much, as the PHP scripting in this case is reduced to a minimum: the image scaling is done by a built-in library written in C. Whatever you do, you'll have to use a library like this (GD, libmagick or similar), so that's unavoidable. With my system you at least skip the overhead of checking the cache entirely, which further reduces PHP interaction. You can implement this on your existing server, so I guess it's a solution well suited to your budget.
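To make the URL scheme above concrete, here is a minimal Python sketch of how a resize script might decode a name like someimage.150x120.jpg. It assumes the web server serves the file directly when it already exists and only calls the script for missing files. The names `ResizeRequest` and `parse_sized_name` are hypothetical, and the strict pattern is my own choice to keep path tricks out.

```python
import re
from typing import NamedTuple, Optional


class ResizeRequest(NamedTuple):
    source_name: str  # original file to read, e.g. "someimage.jpg"
    width: int
    height: int


# Accepts only "<name>.<width>x<height>.<ext>", e.g. "someimage.150x120.jpg".
# The name part allows no slashes or dots, so "../" style paths are rejected.
_SIZED_NAME = re.compile(
    r"(?P<stem>[A-Za-z0-9_-]+)\.(?P<w>\d{1,4})x(?P<h>\d{1,4})\.(?P<ext>jpg|gif|png|bmp)"
)


def parse_sized_name(filename: str) -> Optional[ResizeRequest]:
    """Return the original file name and target size, or None if invalid."""
    match = _SIZED_NAME.fullmatch(filename)
    if match is None:
        return None
    width, height = int(match["w"]), int(match["h"])
    if width == 0 or height == 0:
        return None
    return ResizeRequest(f"{match['stem']}.{match['ext']}", width, height)
```

The script would then scale the original to the parsed size and save the result under the requested name, so the next request for the same URL is served as a plain static file.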
Based on your remark that the PHP scripts were too slow, I'm going to assume you weren't caching the results. I'd recommend caching the resulting images for a day or two (i.e. have your script check whether the thumbnail has already been generated; if so, use it, and if not, generate it on the fly).
This would improve performance dramatically, as I'd imagine the main/start page gets a lot more hits than random video X, so when viewing the main page no images have to be created because they're already cached. When User Y views Movie X, they won't notice the delay as much, since only that one page has to be generated.
For the "on-the-fly resize" aspect: how important is bandwidth to you? I'd assume you're already pushing so much traffic with movies that a few extra KB of images per request wouldn't do too much harm. If that's the case, you could just use larger images, set the width and height, and let the browser do the scaling for you.
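The check-then-generate step could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function name `get_or_create`, the `generate` callback (which stands in for the real resize code) and the two-day default are all hypothetical.

```python
import time
from pathlib import Path
from typing import Callable

TWO_DAYS = 2 * 24 * 60 * 60  # assumed cache lifetime in seconds


def get_or_create(
    cache_dir: Path,
    cache_name: str,
    generate: Callable[[], bytes],
    max_age_seconds: float = TWO_DAYS,
) -> bytes:
    """Return the cached image if it is fresh, otherwise generate and store it."""
    cached = cache_dir / cache_name
    if cached.is_file() and time.time() - cached.stat().st_mtime < max_age_seconds:
        return cached.read_bytes()
    data = generate()  # the expensive resize/blur work happens only on a miss
    # Write under a temporary name, then rename, so a reader never sees a
    # half-written file. This sketch does not coordinate concurrent writers.
    tmp = cached.with_suffix(cached.suffix + ".tmp")
    tmp.write_bytes(data)
    tmp.replace(cached)
    return data
```

Popular images are generated once and then read from disk, which is why the start page stops costing any processing time after the first hit.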
The ImageCache and Image Exact Sizes solutions from the Drupal community might do this; like most OSS solutions, they use the ImageMagick libraries.
There are some AMIs for Amazon's EC2 service that do image scaling. They use Amazon S3 for image storage (originals and scaled versions) and can feed the images through to Amazon's CDN service (CloudFront). Check the EC2 site for what's available.
Another option is Google. Google Docs now supports all file types, so you can load the images into a Google Docs folder and share the folder for public access. The URLs are kind of long, e.g.
http://lh6.ggpht.com/VMLEHAa3kSHEoRr7AchhQ6HEzHVTn1b7Mf-whpxmPlpdrRfPW216UhYdQy3pzIe4f8Q7PKXN79AD4eRqu1obC7I
Add the =s parameter to scale the image, cool! E.g. for 200 pixels wide:
http://lh6.ggpht.com/VMLEHAa3kSHEoRr7AchhQ6HEzHVTn1b7Mf-whpxmPlpdrRfPW216UhYdQy3pzIe4f8Q7PKXN79AD4eRqu1obC7I=s200
Google only charges USD 5/year for 20 GB. There is a full API for uploading docs etc.
Other answers on SO
How best to resize images off-server
OK, the first problem is that resizing an image in any language takes a little processing time. So how do you support thousands of clients? Well, you cache it, so you only have to generate each image once. The next time someone asks for that image, check whether it has already been generated; if it has, just return that. If you have multiple app servers, you'll want to cache to a central file system to increase your cache-hit ratio and reduce the amount of space you need.
In order to cache properly, you need a predictable naming convention that takes into account all the different ways you want your image displayed, e.g. use something like myimage_blurred_320x200.jpg to save a JPEG that has been blurred and resized to 320 width and 200 height.
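One possible way to build such names is sketched below in Python. The function name `cache_name` and the fixed ordering of the variant tags are assumptions of mine, not part of the answer. What matters is that the same request always yields the same file name.

```python
def cache_name(
    stem: str,
    width: int,
    height: int,
    fmt: str = "jpg",
    blurred: bool = False,
    overlay: bool = False,
) -> str:
    """Build a deterministic cache file name for one image variant."""
    parts = [stem]
    # Tags are appended in a fixed order so that equal requests map to
    # exactly one cache entry.
    if blurred:
        parts.append("blurred")
    if overlay:
        parts.append("overlay")
    parts.append(f"{width}x{height}")
    return "_".join(parts) + "." + fmt
```

For example, `cache_name("myimage", 320, 200, blurred=True)` gives `myimage_blurred_320x200.jpg`, the name used above.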
Another approach is to put your image server behind a proxy server; that way, all the caching logic is done for you automatically and your images are served by a fast, native web server.
You're not going to be able to serve millions of resized images any other way. That's how Google and Bing Maps do it: they pre-generate all the images they need for the world at different preset extents, so they can return pre-generated static images and provide adequate performance.
If PHP is too slow, you should consider using the 2D graphics libraries from Java or .NET, as they are very rich and can support all your requirements. To get a flavour of the Graphics API, here is a method in .NET that will resize any image to the new width or height specified. If you omit the height or width, it will resize while maintaining the right aspect ratio. Note that the Image can be created from a JPG, GIF, PNG or BMP:
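The sizing rule described above (omit one dimension and it is derived from the aspect ratio) can be sketched in Python as follows. This is not the .NET method itself, only an illustration of the arithmetic, and the function name `target_size` is hypothetical.

```python
from typing import Optional, Tuple


def target_size(
    orig_w: int,
    orig_h: int,
    width: Optional[int] = None,
    height: Optional[int] = None,
) -> Tuple[int, int]:
    """Compute the output size for a resize request."""
    if width is None and height is None:
        return orig_w, orig_h  # nothing requested: keep the original size
    if width is not None and height is not None:
        return width, height  # both given: aspect ratio is NOT preserved
    if height is None:
        # Only width given: scale height by the same factor.
        return width, max(1, round(orig_h * width / orig_w))
    # Only height given: scale width by the same factor.
    return max(1, round(orig_w * height / orig_h)), height
```

For an 800x600 original, `target_size(800, 600, width=400)` returns `(400, 300)`, while passing both `width=100` and `height=100` returns `(100, 100)` and distorts the image.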
In the time since this question was asked, a few companies have sprung up to deal with this exact issue. It is not an issue that's isolated to you or your company. Many companies reach the point where they need to look for a more permanent solution for their image processing needs.
Services like imgix serve as a proxy and CDN for image operations like resizing and applying overlays. By manipulating the URL, you can apply different transformations to each image. imgix serves billions of requests per day.
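As a rough illustration of the URL-manipulation idea, the Python sketch below builds a transformation URL from keyword arguments. The host example.imgix.net and the helper name `imgix_url` are hypothetical. The parameter names `w`, `h`, `fit` and `blur` are taken from imgix's public URL API as I understand it, so check the current documentation before relying on them.

```python
from urllib.parse import urlencode


def imgix_url(host: str, path: str, **params: object) -> str:
    """Append transformation parameters to an image URL as a query string."""
    # Sorting the parameters keeps the URL stable for a given request,
    # which helps CDN caching.
    query = urlencode(sorted(params.items()))
    base = f"https://{host}/{path.lstrip('/')}"
    return f"{base}?{query}" if query else base
```

With this helper, `imgix_url("example.imgix.net", "/covers/01020304.jpg", w=200, h=150, fit="crop")` yields `https://example.imgix.net/covers/01020304.jpg?fit=crop&h=150&w=200`, and a blurred variant would only need an extra `blur` parameter.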
You can also stand up services on your own and put them behind a CDN. Open source projects like imageproxy are good for this. This puts the burden of maintenance on your operations team.
(Disclaimer: I work for imgix.)
What you are looking for is best matched by Thumbor (http://thumbor.readthedocs.org/en/latest/index.html), which is open source, backed by a huge company (meaning it will not disappear tomorrow), and ships with a lot of nice features, such as detecting what is important in an image when cropping.
For low cost plus a CDN, I'd suggest combining it with CloudFront and AWS storage, or a comparable solution with a free CDN like Cloudflare. These might not be the best-performing CDN providers, but they still perform better than a single server and offload your image server on the cheap. Plus, it will save you a TON of bandwidth cost.
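For a feel of how a Thumbor request looks, here is a small Python sketch that builds an unsigned URL. The server name thumbor.example.com and the helper `thumbor_unsafe_url` are hypothetical. The `/unsafe/<width>x<height>/smart/<image>` layout follows Thumbor's documented URL format as far as I know, and unsigned URLs are meant for testing only; a production setup would normally use signed URLs.

```python
def thumbor_unsafe_url(
    server: str, image: str, width: int, height: int, smart: bool = True
) -> str:
    """Build an unsigned Thumbor URL that resizes `image` to width x height."""
    parts = [server.rstrip("/"), "unsafe", f"{width}x{height}"]
    if smart:
        # "smart" asks Thumbor to pick the crop region by detecting
        # the important part of the image.
        parts.append("smart")
    parts.append(image.lstrip("/"))
    return "/".join(parts)
```

For example, `thumbor_unsafe_url("http://thumbor.example.com", "images.example.com/01020304.jpg", 300, 200)` gives `http://thumbor.example.com/unsafe/300x200/smart/images.example.com/01020304.jpg`.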
If each different image is uniquely identifiable by a single URL, then I'd simply use a CDN such as Akamai. Let your PHP script do the job and let Akamai handle the load.
Since this kind of business doesn't usually have budget problems, that'd be the only place I'd look at.
Edit: that works only if you do find a CDN that will serve this kind of content for you.
This exact problem is now being solved by image resize services dedicated to the task.
One such service is Gumlet. You can also try an open source alternative, such as an nginx plugin that can resize images on the fly.
(I work for Gumlet.)