Automatic font recognition using Python
As you may have heard, there is an online font recognition service called WhatTheFont.
I'm curious about the technology behind this tool. I think we can basically separate it into two parts:
1. Generate images from font files of various formats; see http://www.fileinfo.com/filetypes/font for a list of font file extensions.
2. Compare the submitted image with all of the generated images.
I would appreciate any advice or Python code for implementing the two steps above.
Comments (3)
As the OP states, there are two parts (and probably also a third part):
1. Use PIL to generate images from fonts.
2. Use an image analysis toolkit such as OpenCV (which has Python bindings) to compare different shapes. There are a variety of standard techniques for comparing objects to see whether they are similar. For example, scale-invariant moments work fairly well and are part of the OpenCV toolkit.
3. Most of the standard tools in #2 are designed to look for similar but not necessarily identical shapes. For font comparison this might not be what you want, since the differences between fonts can come down to very fine details. For fine-detail analysis, try comparing the x and y profiles of a perimeter path around each letter, appropriately normalized, of course. (This, or a more mathematically complicated variant of it, has been used with good success in font analysis.)
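To make steps 1 and 3 concrete, here is a minimal sketch, not a tested font matcher. It renders one character with Pillow (the maintained fork of PIL) and compares two glyph bitmaps by their normalized x and y profiles. Note that it uses simple projection profiles (ink per column and per row) as a stand-in for the perimeter-path profiles described above. The function names, the `font_path` argument, and the default sizes are all assumptions made for illustration.

```python
"""Sketch: render a glyph with Pillow, then compare normalized x/y profiles."""
import math


def render_glyph(font_path, char, size=64):
    """Render one character to a binary bitmap (a list of rows of 0/1).

    font_path is a hypothetical path to a TrueType/OpenType file.
    Pillow is imported here so the profile functions below work without it.
    """
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, size)
    canvas = Image.new("L", (size * 3, size * 3), 0)   # black background
    ImageDraw.Draw(canvas).text((size, size), char, fill=255, font=font)
    bbox = canvas.getbbox()                            # tight box around the ink
    if bbox is None:
        raise ValueError(f"{char!r} rendered no pixels")
    glyph = canvas.crop(bbox)
    width, height = glyph.size
    data = glyph.tobytes()                             # one byte per pixel in mode "L"
    return [[1 if data[y * width + x] > 127 else 0 for x in range(width)]
            for y in range(height)]


def resample(values, n):
    """Linearly resample a sequence to n points (n >= 2)."""
    if len(values) == 1:
        return [float(values[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def profiles(bitmap, n=32):
    """Return the x profile followed by the y profile, each scaled to sum to 1."""
    cols = [sum(col) for col in zip(*bitmap)]   # x profile: ink per column
    rows = [sum(row) for row in bitmap]         # y profile: ink per row
    result = []
    for prof in (cols, rows):
        prof = resample(prof, n)                # removes dependence on glyph size
        total = sum(prof) or 1.0
        result.extend(v / total for v in prof)
    return result


def profile_distance(bitmap_a, bitmap_b, n=32):
    """Euclidean distance between the normalized profiles of two bitmaps."""
    pa, pb = profiles(bitmap_a, n), profiles(bitmap_b, n)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

With a set of font files on disk, one would render the same character from each font and rank the fonts by `profile_distance` to the bitmap cropped from the submitted image. Resampling each profile to a fixed length is what makes the comparison independent of glyph size.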
I can't offer Python code, but here are two possible approaches.
1. "Eigen-characters." In face recognition, given a large training set of normalized facial images, you can use principal component analysis (PCA) to obtain a set of "eigenfaces": the directions along which the training faces, when projected onto that subspace, exhibit the greatest variance. The "coordinates" of an input test face with respect to the space of eigenfaces can be used as the feature vector for classification. The same thing can be done with textual characters, e.g., many versions of the character 'A'.
2. Dynamic Time Warping (DTW). This technique is sometimes used for handwritten character recognition. The idea is that the trajectory taken by the tip of a pencil (i.e., d/dx, d/dy) is similar for similar characters. DTW provides invariance to some of the variation across instances of a single person's writing. Similarly, the outline of a character can represent a trajectory. This trajectory then becomes the feature vector for each font set. I guess the DTW part is not as necessary for font recognition, because a machine creates the characters, not a human. But it may still be useful for resolving spatial ambiguities.
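The DTW idea can be sketched in a few lines of plain Python. This is an illustrative sketch, assuming each glyph outline is already available as an ordered list of (x, y) points (for example from a contour extractor such as OpenCV's `findContours`). The names `normalize` and `dtw_distance` are made up for this example.

```python
"""Sketch: dynamic time warping between two glyph outlines."""
import math


def normalize(path):
    """Center a path on its centroid and scale it to fit in [-1, 1]."""
    cx = sum(x for x, _ in path) / len(path)
    cy = sum(y for _, y in path) / len(path)
    scale = max(max(abs(x - cx), abs(y - cy)) for x, y in path) or 1.0
    return [((x - cx) / scale, (y - cy) / scale) for x, y in path]


def dtw_distance(path_a, path_b):
    """Classic O(len_a * len_b) DTW cost between two 2-D point sequences."""
    n, m = len(path_a), len(path_b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        ax, ay = path_a[i - 1]
        for j in range(1, m + 1):
            bx, by = path_b[j - 1]
            d = math.hypot(ax - bx, ay - by)
            # a point may match a new point, or be stretched over several points
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
# Same square, but sampled unevenly: DTW absorbs the repeated points.
square_uneven = [(0, 0), (0, 0), (1, 0), (1, 1), (1, 1), (0, 1)]
triangle = [(0, 0), (1, 0), (0.5, 3)]

print(dtw_distance(square, square_uneven))
print(dtw_distance(square, triangle))
```

Because DTW lets one point align with several, the two differently sampled squares cost nothing to align, while the triangle does not. Note that `normalize` handles only translation and scale. The starting point and direction of the outline still have to be consistent between the two paths.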
This question is a little old, so here is an updated answer.
You should take a look at the paper "DeepFont: Identify Your Font from An Image". Basically, it's a neural network trained on tons of images. It was presented commercially in this video.
Unfortunately, there is no code available. However, there is an independent implementation available here. You'll need to train it yourself, since weights are not provided, but the code is really easy to follow. Also keep in mind that this implementation covers only a few fonts.
There is also a link to the dataset and a repo to generate more data.
Hope it helps.