How do I input an image into a neural network?

Posted on 2024-08-18 22:47:49

I understand how neural networks work, but if I want to use them for image processing, such as actual character recognition, I can't figure out how to input the image data into the neural net.

I have a very large image of the letter A. Should I extract some information or measurements from the image and then use a vector of those values as the input to the neural net?

Has anyone already done something like this? Could you explain how to do it?

Comments (5)

瘫痪情歌 2024-08-25 22:47:49

The easiest solution would be to normalize all of your images, both for training and testing, to have the same resolution. Also the character in each image should be about the same size. It is also a good idea to use greyscale images, so each pixel would give you just one number. Then you could use each pixel value as one input to your network. For instance, if you have images of size 16x16 pixels, your network would have 16*16 = 256 input neurons. The first neuron would see the value of the pixel at (0,0), the second at (0,1), and so on. Basically you put the image values into one vector and feed this vector into the network. This should already work.
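As a minimal sketch of the flattening step described above, the following uses plain Python lists to stand in for a grayscale image. The function name `image_to_input_vector` and the scaling of pixel values to the range [0, 1] are assumptions added for illustration; they are not part of the answer.

```python
def image_to_input_vector(image, max_value=255):
    """Flatten a 2-D grayscale image (a list of rows) into a 1-D list.

    Scaling by max_value is an assumption made here so that every input
    lies in [0, 1]; the answer itself only asks for one number per pixel.
    """
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same length")
    # Row-major order: pixel (row, col) lands at index row * width + col,
    # so the first input is pixel (0,0), the second is (0,1), and so on.
    return [pixel / max_value for row in image for pixel in row]


# A synthetic 16x16 image with a bright diagonal, standing in for a character.
image = [[255 if r == c else 0 for c in range(16)] for r in range(16)]
vector = image_to_input_vector(image)
print(len(vector))  # 256 values, one per input neuron
```

Every image passed through this function must already have the same resolution, otherwise the vector length would not match the number of input neurons.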

By first extracting features (e.g., edges) from the image and then using the network on those features, you could perhaps increase the speed of learning and also make the detection more robust. What you do in that case is incorporate prior knowledge. For character recognition you know that certain features are relevant, so by extracting them in a preprocessing step you spare the network from having to learn them. However, if you provide the wrong (i.e., irrelevant) features, the network will not be able to learn the image-to-character mapping.
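One very simple way to get an edge feature is to take differences between neighbouring pixels. The sketch below is only an illustration under that assumption: the function name `edge_strength` and the choice of forward differences are hypothetical, and real systems often use other edge detectors.

```python
def edge_strength(image):
    """Return a map of |horizontal difference| + |vertical difference|.

    Uses forward differences, so the output is one row and one column
    smaller than the input. Large values mark places where brightness
    changes abruptly, i.e. likely edges.
    """
    height, width = len(image), len(image[0])
    edges = []
    for r in range(height - 1):
        row = []
        for c in range(width - 1):
            dx = abs(image[r][c + 1] - image[r][c])  # change to the right
            dy = abs(image[r + 1][c] - image[r][c])  # change downwards
            row.append(dx + dy)
        edges.append(row)
    return edges


# A 4x4 image whose left half is dark (0) and right half is bright (1).
image = [[0, 0, 1, 1] for _ in range(4)]
edges = edge_strength(image)
print(edges)  # only the column where dark meets bright is non-zero
```

The resulting edge map can be flattened into a vector and fed to the network in place of, or alongside, the raw pixel values.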

贩梦商人 2024-08-25 22:47:49

The name for the problem you're trying to solve is "feature extraction". It's decidedly non-trivial and a subject of active research.

The naive way to go about this is simply to map each pixel of the image to a corresponding input neuron. Obviously, this only works for images that are all the same size, and is generally of limited effectiveness.

Beyond this, there is a host of things you can do... Gabor filters, Haar-like features, PCA and ICA, sparse features, just to name a few popular examples. My advice would be to pick up a textbook on neural networks and pattern recognition or, specifically, optical character recognition.
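To give a feel for one of the techniques named above, here is a rough sketch of a single two-rectangle Haar-like feature computed from an integral image (summed-area table). All function names are hypothetical and chosen for illustration; a practical system would compute many such features at different positions and scales.

```python
def integral_image(image):
    """Summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels above and to the left of (r, c).
    """
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for r in range(height):
        row_sum = 0
        for c in range(width):
            row_sum += image[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Left-half sum minus right-half sum of a window (width must be even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# A 4x4 image whose left half is bright (1) and right half is dark (0).
image = [[1, 1, 0, 0] for _ in range(4)]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
print(feature)
```

A handful of such feature values, rather than every raw pixel, would then form the input vector for the network.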

失而复得 2024-08-25 22:47:49

All these considerations about applying NNs to images are covered in our 2002 review paper
(Feature based, pixel based, scale invariance, etc.)

Your biggest challenge is the so-called 'curse of dimensionality'.

I would compare the NN's performance with that of a support vector machine (choosing which kernel to use is tricky).

听风念你 2024-08-25 22:47:49

You can use the actual pixels as input. This is why it is sometimes preferable to use lower-resolution input images.
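A sketch of how a lower resolution cuts the number of inputs: the function below shrinks an image by averaging non-overlapping square blocks. The name `downsample` and the block-averaging method are assumptions made for illustration; image libraries offer more refined resampling.

```python
def downsample(image, factor):
    """Shrink a grayscale image by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    small = []
    for r in range(0, height, factor):
        out_row = []
        for c in range(0, width, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        small.append(out_row)
    return small


# A synthetic 64x64 image reduced to 16x16.
image = [[(r + c) % 256 for c in range(64)] for r in range(64)]
small = downsample(image, 4)
print(len(image) * len(image[0]), "->", len(small) * len(small[0]))  # 4096 -> 256
```

Going from 64x64 to 16x16 reduces the number of input neurons from 4096 to 256, and with it the number of weights in the first layer.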

The nice thing about ANNs is that they are capable of a kind of feature selection: they can ignore unimportant pixels by assigning near-zero weights to those input nodes.

北城孤痞 2024-08-25 22:47:49

Here are some steps:
Make sure your color or grayscale image is converted to a binary image. To do this, perform a thresholding operation. Follow that with some sort of feature extraction. For OCR/NN work this example might help, although it is in Ruby:
https://github.com/gbuesing/neural-net-ruby/blob/master/examples/mnist.rb
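The thresholding step mentioned above could look roughly like this in Python. The fixed threshold of 128 and the name `binarize` are assumptions for illustration; in practice the threshold is often chosen per image, for example with Otsu's method.

```python
def binarize(image, threshold=128):
    """Turn a grayscale image (values 0-255) into a binary image of 0s and 1s.

    Pixels at or above the threshold become 1, all others become 0.
    The default threshold is an arbitrary midpoint, not a recommendation.
    """
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


gray = [[0, 127],
        [128, 255]]
binary = binarize(gray)
print(binary)
```

Depending on whether the character is dark on a light background or the reverse, the comparison may need to be flipped so that character pixels end up as 1.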
