How to store image histograms in a database and run searches against them

Posted on 2024-09-19 01:34:20

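I need to write a web application in which users can search for images by color. My question is how to store the color data. I think the best solution is to reduce the image's colors and build a histogram for each of the R, G, and B channels, but I don't know how to design the database. I'd like to use MySQL as the database management system. Can anyone point me in the right direction?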

Regards

吻安 2024-09-26 01:34:20

A couple of ideas come to mind for storing histogram data. The obvious choice is to have one table (or three for separate R/G/B channels) that represents the (normalized) histogram, with a column for each bin. If you're in 24 bit color (8 bits/channel), you could break each channel into 16 bins ([0-15], ..., [240-255]), and in each column store the percentage of pixels that fell into that bin.

Something like this:

id  imgID  R_0_15  ...  R_240_255  G_0_15  ...  G_240_255  B_0_15  ...  B_240_255
1   1234   0.1     ...  0.23       0.023   ...  0.234      0.11    ...  0.01

With this design, the entire (normalized) histogram for each image would be represented as a single row in the table.
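To make the binning concrete, here is a minimal Python sketch of how the normalized 16-bin histograms might be computed before they go into the database. It assumes the image has already been decoded into a list of 8-bit `(r, g, b)` tuples; the function name `normalized_histogram` is made up for illustration and does not come from any library.

```python
def normalized_histogram(pixels, bins_per_channel=16):
    """Return {'R': [...], 'G': [...], 'B': [...]} where each list holds the
    fraction of pixels that fell into each bin (fractions sum to 1 per channel).

    Assumption: `pixels` is an iterable of (r, g, b) tuples with 0-255 values.
    """
    if 256 % bins_per_channel != 0:
        raise ValueError("bins_per_channel must divide 256 evenly")
    bin_width = 256 // bins_per_channel  # 16 values per bin when using 16 bins
    counts = {c: [0] * bins_per_channel for c in "RGB"}
    total = 0
    for r, g, b in pixels:
        counts["R"][r // bin_width] += 1
        counts["G"][g // bin_width] += 1
        counts["B"][b // bin_width] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    # Dividing by the pixel count is the normalization step.
    return {c: [n / total for n in counts[c]] for c in "RGB"}


# Tiny hand-made "image" of four pixels.
sample = [(0, 0, 0), (15, 200, 255), (16, 16, 16), (255, 255, 255)]
hist = normalized_histogram(sample)
print(hist["R"][0], hist["R"][1], hist["R"][15])
```

Each of the three returned lists maps directly onto one group of columns (R_0_15 through R_240_255, and so on) in the single-row layout.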

Queries would be a bit challenging: you'd have to generate them dynamically to plug in the right column names for the value range of interest.
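As a rough illustration of that dynamic generation, the sketch below builds the column list for a value range and splices it into a query string. The table name `ImageHistogram` and both function names are hypothetical, and the `%s` placeholder assumes a MySQL driver that uses that parameter style.

```python
def bin_columns(channel, lo, hi, bin_width=16):
    """Column names such as 'R_0_15' for every bin overlapping [lo, hi]."""
    if channel not in ("R", "G", "B"):
        raise ValueError("channel must be 'R', 'G' or 'B'")
    if not 0 <= lo <= hi <= 255:
        raise ValueError("range must lie within 0..255")
    first, last = lo // bin_width, hi // bin_width
    return [
        f"{channel}_{i * bin_width}_{i * bin_width + bin_width - 1}"
        for i in range(first, last + 1)
    ]


def build_query(channel, lo, hi, table="ImageHistogram"):
    """SQL text selecting images whose chosen bins hold at least some fraction.

    Column names come only from the validated arguments above, never from raw
    user input. The threshold stays a placeholder so it can be bound normally.
    """
    total = " + ".join(bin_columns(channel, lo, hi))
    return f"SELECT imgID FROM {table} WHERE {total} >= %s"


print(build_query("R", 200, 255))
```

The query text changes with every range, which is the awkward part: it cannot be prepared once and reused.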

Perhaps a better way would be a HistogramBins table with a row entry for each image and each bin:

id  imgID  component  bin_min  bin_max  percentage
1   1234   R          0        15       0.1
....omitted rows...
1   1234   R          240      255      0.23
...etc...

With that storage format, queries could be prepared rather than dynamically computed. It's not clear to me whether the components should be broken out as I did or if you should store one row for "bin 1" of all three color components. I'd probably want to write some queries and see what felt best for your application.
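Here is a small sketch of the row-per-bin layout with a fixed, parameterized query. It uses Python's built-in `sqlite3` module purely so the example runs anywhere; that is a stand-in, not MySQL, and the column types would need adjusting. The rows for image 1234 come from the table above, image 5678 is made-up sample data, and `find_images` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a MySQL connection
conn.execute("""
    CREATE TABLE HistogramBins (
        id         INTEGER PRIMARY KEY,
        imgID      INTEGER NOT NULL,
        component  TEXT    NOT NULL,   -- 'R', 'G' or 'B'
        bin_min    INTEGER NOT NULL,
        bin_max    INTEGER NOT NULL,
        percentage REAL    NOT NULL    -- fraction of pixels, 0.0 to 1.0
    )
""")
rows = [
    (1234, "R", 0, 15, 0.10),
    (1234, "R", 240, 255, 0.23),
    (5678, "R", 0, 15, 0.70),    # invented second image
    (5678, "R", 240, 255, 0.02),
]
conn.executemany(
    "INSERT INTO HistogramBins (imgID, component, bin_min, bin_max, percentage) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# The SQL text never changes; only the bound parameters do.
FIND_IMAGES = """
    SELECT imgID
    FROM HistogramBins
    WHERE component = ? AND bin_min >= ? AND bin_max <= ?
    GROUP BY imgID
    HAVING SUM(percentage) >= ?
    ORDER BY SUM(percentage) DESC
"""


def find_images(component, lo, hi, min_fraction):
    """Images whose bins inside [lo, hi] add up to at least min_fraction."""
    cur = conn.execute(FIND_IMAGES, (component, lo, hi, min_fraction))
    return [row[0] for row in cur]


print(find_images("R", 240, 255, 0.2))
```

With real data, an index on (component, bin_min, bin_max) would probably be worth trying, though that is a guess to verify against actual query plans.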

Also, the reason I keep saying 'normalized' is that storing the percentage of pixels in each bin, rather than the raw pixel count, makes the histogram independent of image size.

Hope this helps get you started. Let us know what you end up with!


喜爱皱眉﹌ 2024-09-26 01:34:20

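RGB values don't mean much to human perception, but they are easily converted to hue, saturation, and brightness, which make far more sense to people. Saturation and brightness are fairly intuitive (richer versus paler, lighter versus darker). Unfortunately, there is no natural ordering of colors, so hue is expressed as an arbitrary angle around a circle. In practice it is quite hard to ask people to make fine hue discriminations, especially when they are searching for something they haven't seen yet. So you probably want to limit the categories to the hexagon vertices in diagram "a".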
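A minimal sketch of that idea, using Python's standard `colorsys` module: convert a pixel to HSV and snap its hue to the nearest of six evenly spaced categories. The six names assume the usual color-hexagon vertices (red, yellow, green, cyan, blue, magenta); `hue_category` is a made-up helper name.

```python
import colorsys

# Assumed hexagon vertices, 60 degrees apart starting at red (0 degrees).
HUE_NAMES = ["red", "yellow", "green", "cyan", "blue", "magenta"]


def hue_category(r, g, b):
    """Return (category name, saturation, value) for an 8-bit RGB color."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # h is in [0, 1); multiplying by 6 puts the vertices at 0, 1, ..., 5.
    index = round(h * 6) % 6
    return HUE_NAMES[index], s, v


print(hue_category(255, 0, 0))
print(hue_category(0, 0, 255))
```

Gray or nearly gray pixels have no meaningful hue (colorsys reports 0 for them), so the saturation and value are returned as well, letting the caller treat low-saturation colors separately.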

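Then you run into another problem: what is the representative color of a photo? Is an image that is half blue sky and half tan sand blue or tan? Do you pick the dominant hue? You might want to apply a heavy Gaussian blur and then average the resulting hues. You may need to refine your question and your goals further.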
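If you do try averaging hues, keep in mind that hue is an angle, so a plain arithmetic mean breaks at the wrap-around point. Below is a sketch of a circular mean. The blur step is left out, weighting each pixel by saturation times value is my own assumption, and `mean_hue_degrees` is a hypothetical name.

```python
import colorsys
import math


def mean_hue_degrees(pixels):
    """Circular mean of the hues of (r, g, b) pixels, in degrees [0, 360).

    Returns None when the hues cancel out and no direction dominates.
    """
    x = y = 0.0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        weight = s * v  # assumption: gray or dark pixels carry little hue
        angle = 2 * math.pi * h
        x += weight * math.cos(angle)
        y += weight * math.sin(angle)
    if math.hypot(x, y) < 1e-9:
        return None
    return math.degrees(math.atan2(y, x)) % 360


# Red (0 degrees) and magenta (300 degrees) average to about 330, not 150.
print(mean_hue_degrees([(255, 0, 0), (255, 0, 255)]))
# Opposite hues such as red and cyan cancel out.
print(mean_hue_degrees([(255, 0, 0), (0, 255, 255)]))
```

The second case shows the sky-and-sand problem in miniature: when an image is split between two very different hues, an average may not describe either of them.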