What algorithms are there for symbol-by-symbol handwriting recognition?

Posted on 2024-12-18 16:04:15

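I think there are algorithms that evaluate the difference between a drawn symbol and the expected one, or something along those lines. Any help would be appreciated :))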

黎歌 2024-12-25 16:04:16

Addendum

If you have not implemented machine learning algorithms before, you should really check out www.ml-class.org.

It's a free class taught by Andrew Ng, Director of the Stanford Machine Learning Centre. The course is taught entirely online and focuses specifically on implementing a wide range of machine learning algorithms. It does not go too far into the theoretical intricacies of the algorithms, but rather teaches you how to choose, implement and use them, and how to diagnose their performance. It is unique in that your implementation of the algorithms is checked automatically! It's great for getting started in machine learning, as you get instantaneous feedback.

The class also includes at least two exercises on recognising handwritten digits (Programming Exercise 3, with multinomial classification, and Programming Exercise 4, with feed-forward neural networks).

The class started a while ago, but it should still be possible to sign up. If not, a new run should start early next year. If you want to be able to check your implementations, you need to sign up for the "Advanced Track".

One way to implement handwriting recognition

The answer to this question depends on a number of factors, including what kind of resource constraints you have (e.g. an embedded platform) and whether you have a good library of correctly labelled symbols, i.e. different examples of a handwritten letter for which you know what letter they represent.

If you have a decent-sized library, implementing a quick and dirty standard machine learning algorithm is probably the way to go. You can use multinomial classifiers, neural networks or support vector machines.

I believe a support vector machine would be fastest to implement, as there are excellent libraries out there that handle the machine learning portion of the code for you, e.g. libSVM. If you are familiar with using machine learning algorithms, this should take you less than 30 minutes to implement.

The basic procedure you would probably want to implement is as follows:

Learning what symbols "look like"

1. Binarise the images in your library.
2. Unroll the images into vectors / 1-D arrays.
3. Pass the "vector representation" of the images in your library and their labels to libSVM to get it to learn how the pixel coverage relates to the represented symbol for the images in the library.
4. The algorithm gives you back a set of model parameters which describe the recognition algorithm that was learned.

You should repeat steps 1-4 for each character you want to recognise to get an appropriate set of model parameters.

Note: you only have to carry out steps 1-4 once for your library (but once for each symbol you want to recognise). You can do this on your developer machine and only include the parameters in the code you ship / distribute.
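
The training steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: instead of calling libSVM, it uses a small pure-Python stand-in (a linear SVM trained by stochastic sub-gradient descent on the hinge loss) so that the example is self-contained. The function names, the 3x3 toy images, the threshold of 128 and the hyperparameters are all made up for the example.

```python
import random

def binarise(image, threshold=128):
    """Step 1: turn a grey-scale image (rows of 0-255 values) into 0/1 pixels."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def unroll(image):
    """Step 2: flatten a 2-D image into a 1-D feature vector."""
    return [px for row in image for px in row]

def decision_value(model, vector):
    """Signed score: positive means 'this is the model's symbol'."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, vector)) + bias

def train_binary_svm(vectors, targets, epochs=200, rate=0.01, lam=0.01, seed=0):
    """Steps 3-4: linear SVM (hinge loss + L2 penalty), sub-gradient descent.

    targets are +1 ("is the symbol") or -1 ("is not"). Returns (weights, bias).
    This is a stand-in for a real SVM library, not libSVM itself.
    """
    rng = random.Random(seed)
    weights, bias = [0.0] * len(vectors[0]), 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = vectors[i], targets[i]
            margin = y * decision_value((weights, bias), x)
            weights = [w * (1 - rate * lam) for w in weights]  # L2 shrinkage
            if margin < 1:  # inside the margin: push towards the correct side
                weights = [w + rate * y * xi for w, xi in zip(weights, x)]
                bias += rate * y
    return weights, bias

def train_one_model_per_symbol(images, labels):
    """Repeat steps 1-4 for each symbol (one-vs-rest)."""
    vectors = [unroll(binarise(img)) for img in images]
    return {symbol: train_binary_svm(vectors, [1 if l == symbol else -1 for l in labels])
            for symbol in set(labels)}

# Tiny made-up library: 3x3 grey-scale images of a vertical and a horizontal bar.
VERTICAL = [[0, 255, 10], [0, 240, 0], [20, 250, 0]]
HORIZONTAL = [[0, 10, 0], [255, 230, 250], [0, 0, 30]]
models = train_one_model_per_symbol([VERTICAL, HORIZONTAL], ["|", "-"])
```

Only the `models` dictionary (one weight vector and bias per symbol) would need to ship with the application; the library and the training code can stay on the developer machine.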

If you want to recognise a symbol:

Each set of model parameters describes an algorithm which tests whether a character represents one specific character or not. You "recognise" a character by testing all the models with the current symbol and then selecting the model that best fits the symbol you are testing.

This testing is done by again passing the model parameters and the symbol to test, in unrolled form, to the SVM library, which will return the goodness-of-fit for the tested model.
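
The recognition step might look something like the sketch below. It assumes each model is a pair of linear weights and a bias whose signed score plays the role of the goodness-of-fit; a real SVM library would compute that score for you. The model parameters and the drawn symbol are hypothetical values for 3x3 images.

```python
def decision_value(model, vector):
    """Signed score of one model for an unrolled, binarised symbol."""
    weights, bias = model
    return sum(w * x for w, x in zip(weights, vector)) + bias

def recognise(models, vector):
    """Test the symbol against every model and return the best-fitting label."""
    scores = {symbol: decision_value(model, vector) for symbol, model in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical parameters, as if produced by an earlier training run.
models = {
    "|": ([0, 1, 0, -1, 0, -1, 0, 1, 0], 0.0),
    "-": ([0, -1, 0, 1, 0, 1, 0, -1, 0], 0.0),
}

# Unrolled, binarised drawing: a vertical bar with one stray pixel.
drawn = [0, 1, 0,
         0, 1, 0,
         0, 1, 1]

label, scores = recognise(models, drawn)
print(label, scores)
```

Returning the full score table alongside the label makes it easy to reject a drawing when no model fits well, for example when the best score is negative.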

明月夜 2024-12-25 16:04:15

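You can implement a simple neural network to recognize handwritten digits. The simplest type to implement is a feed-forward network trained via backpropagation (it can be trained stochastically or in batch mode). There are a few improvements that you can make to the backpropagation algorithm that will help your neural network learn faster (momentum, Silva and Almeida's algorithm, simulated annealing).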
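
As a rough illustration of the momentum idea mentioned above: each weight keeps a "velocity" that blends the previous update with the current gradient step. The sketch below applies it to a toy error function E(w) = w0^2 + w1^2 instead of a real network; the function name, learning rate and momentum coefficient are assumptions chosen for the example.

```python
def momentum_step(weights, gradients, velocity, rate=0.1, momentum=0.9):
    """Update weights in place: velocity = momentum * velocity - rate * gradient."""
    for i, g in enumerate(gradients):
        velocity[i] = momentum * velocity[i] - rate * g
        weights[i] += velocity[i]

# Toy problem: minimise E(w) = w0^2 + w1^2, whose gradient is 2 * w.
weights = [5.0, -3.0]
velocity = [0.0, 0.0]
for _ in range(200):
    gradients = [2 * w for w in weights]
    momentum_step(weights, gradients, velocity)
print(weights)  # both weights end up close to 0
```

In a network, `gradients` would be the backpropagated error derivatives for each weight; the update rule itself stays the same.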

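As far as looking at the difference between a real symbol and an expected image, one algorithm that I've seen used is the k-nearest-neighbor algorithm. Here is a paper that describes using the k-nearest-neighbor algorithm for character recognition (edit: I had the wrong link earlier. The link I've provided requires you to pay for the paper; I'm trying to find a free version of the paper).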
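
A minimal sketch of the k-nearest-neighbor idea, assuming binarised images unrolled into 0/1 vectors and a plain Hamming distance (number of differing pixels) as the "difference" between two symbols. The tiny 3x3 library and the choice of k=3 are made up for illustration; the papers on this topic use richer features and distance measures.

```python
from collections import Counter

def hamming(a, b):
    """Number of pixels in which two equally sized binary vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def knn_classify(library, query, k=3):
    """Label the query by majority vote among its k closest library symbols."""
    nearest = sorted(library, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up library of labelled, unrolled 3x3 symbols.
library = [
    ([0, 1, 0, 0, 1, 0, 0, 1, 0], "|"),
    ([0, 1, 0, 0, 1, 0, 0, 1, 1], "|"),
    ([1, 1, 0, 0, 1, 0, 0, 1, 0], "|"),
    ([0, 0, 0, 1, 1, 1, 0, 0, 0], "-"),
    ([0, 0, 0, 1, 1, 1, 0, 0, 1], "-"),
    ([1, 0, 0, 1, 1, 1, 0, 0, 0], "-"),
]

query = [0, 1, 0, 0, 1, 0, 1, 1, 0]  # a slightly sloppy vertical bar
print(knn_classify(library, query))
```

Unlike the trained classifiers, this approach needs the whole labelled library at recognition time, which matters on resource-constrained platforms.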

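If you were using a neural network to recognize your characters, the steps involved would be:

1. Design your neural network with an appropriate training algorithm. I suggest starting with the simplest (stochastic backpropagation) and then improving the algorithm as desired while you train your network.
2. Get a good sample of training data. For my neural network, which recognizes handwritten digits, I used the MNIST database.
3. Convert the training data into an input vector for your neural network. For the MNIST data, you will need to binarize the images. I used a threshold value of 128. I started with Otsu's method, but that didn't give me the results I wanted.
4. Create your network. Since the images from MNIST come as 28x28 arrays, the input vector to your neural network has 784 components plus 1 bias (so 785 inputs). I used one hidden layer with the number of nodes set as per the guidelines outlined here (along with a bias). Your output vector will have 10 components (one for each digit).
5. Randomly present training data (so randomly ordered digits, with a random input image for each digit) to your network and train it until it reaches the desired error level.
6. Run test data (MNIST comes with this as well) against your neural network to verify that it recognizes digits correctly.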
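
The network described in these steps could be sketched as below: 784 inputs plus a bias, one hidden layer, 10 outputs, trained with stochastic backpropagation. This is only a sketch under assumptions, not the author's implementation: the hidden layer size of 30, the sigmoid activation, the squared-error loss, the learning rate and the synthetic stand-in image (not real MNIST data) are all chosen for illustration.

```python
import math
import random

N_INPUT = 28 * 28   # 784 pixels per MNIST image
N_HIDDEN = 30       # assumption: hidden size picked only for illustration
N_OUTPUT = 10       # one output per digit

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """n_out rows of n_in + 1 small random weights (the last one is the bias weight)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_output(weights, inputs):
    """Sigmoid layer; a constant bias input of 1 is appended to the inputs."""
    extended = inputs + [1.0]
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]

def forward(net, pixels):
    hidden = layer_output(net["hidden"], pixels)
    return hidden, layer_output(net["output"], hidden)

def train_step(net, pixels, digit, rate=0.5):
    """One stochastic backpropagation update; returns the error before the update."""
    hidden, output = forward(net, pixels)
    target = [1.0 if k == digit else 0.0 for k in range(N_OUTPUT)]
    # Output deltas for squared error with sigmoid units.
    d_out = [(o - t) * o * (1 - o) for o, t in zip(output, target)]
    # Hidden deltas: output deltas propagated back through the output weights.
    d_hid = [h * (1 - h) * sum(d_out[k] * net["output"][k][j] for k in range(N_OUTPUT))
             for j, h in enumerate(hidden)]
    for k, row in enumerate(net["output"]):
        for j, x in enumerate(hidden + [1.0]):
            row[j] -= rate * d_out[k] * x
    for j, row in enumerate(net["hidden"]):
        for i, x in enumerate(pixels + [1.0]):
            row[i] -= rate * d_hid[j] * x
    return sum((o - t) ** 2 for o, t in zip(output, target)) / 2

rng = random.Random(0)
net = {"hidden": init_layer(N_INPUT, N_HIDDEN, rng),
       "output": init_layer(N_HIDDEN, N_OUTPUT, rng)}

# Synthetic binarised "image" standing in for one MNIST sample labelled 3.
image = [1 if 300 <= i < 400 else 0 for i in range(N_INPUT)]
errors = [train_step(net, image, digit=3) for _ in range(30)]
print(errors[0], errors[-1])
```

With real data you would loop over randomly ordered MNIST images instead of repeating one sample, and stop once the error on the training set falls below the level you want.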

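You can check out an example here (shameless plug) that tries to recognize handwritten digits. I trained the network using data from MNIST.

Expect to spend some time getting yourself up to speed on neural network concepts if you decide to go this route. It took me at least 3-4 days of reading and writing code before I actually understood the concept. A good resource is heatonresearch.com. I recommend starting by implementing neural networks that simulate the AND, OR, and XOR boolean operations (using a threshold activation function). This should give you an idea of the basic concepts. When it comes down to actually training your network, you can try to train a neural network that recognizes the XOR boolean operator; it's a good starting point for an introduction to learning algorithms.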
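
As a starting point for that exercise, here is a small sketch of AND, OR and XOR built from threshold units with hand-picked weights (nothing is learned here). The specific weights and biases are just one possible choice. XOR needs a hidden layer because a single threshold unit cannot separate its inputs.

```python
def threshold_unit(weights, bias, inputs):
    """Fire (1) if the weighted sum plus bias is positive, otherwise 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0

def AND(a, b):
    return threshold_unit([1, 1], -1.5, [a, b])

def OR(a, b):
    return threshold_unit([1, 1], -0.5, [a, b])

def XOR(a, b):
    # Hidden layer: one OR unit and one NAND unit; output unit: AND of the two.
    h_or = OR(a, b)
    h_nand = threshold_unit([-1, -1], 1.5, [a, b])
    return threshold_unit([1, 1], -1.5, [h_or, h_nand])

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), XOR(a, b))
```

Once this works, replacing the hand-picked XOR weights with weights learned by backpropagation (using a differentiable activation such as the sigmoid) is a natural next step.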

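When it comes to building the neural network, you can use existing frameworks like Encog, but I found it far more satisfying to build the network myself (you learn more that way, I think). If you want to look at some source, you can check out a project that I have on github (shameless plug) that has some basic classes in Java that help you build and train simple neural networks.

Good luck!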

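EDIT

I've found a few sources that use k-nearest-neighbors for digit and/or character recognition:

Bangla Basic Character Recognition Using Digital Curvelet Transform (The original image and its morphologically altered versions are used to train separate k-nearest-neighbor classifiers. The output values of these classifiers are fused using a simple majority voting scheme to arrive at a final decision.)
The Nearest Neighbors and Similarity Search Home Page
Fast and Accurate Handwritten Character Recognition using Approximate Nearest Neighbours Search on Large Databases
Nearest Neighbor Retrieval and Classification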

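For resources on neural networks, I found the following links to be useful:

CS-449: Neural Networks
Artificial Neural Networks: A neural network tutorial
An introduction to neural networks
Neural Networks with Java
Introduction to backpropagation neural networks
Momentum and Learning Rate Adaptation (this page goes over a few enhancements to the standard backpropagation algorithm that can result in faster learning)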
