Implementing a perceptron classifier

Posted 2024-10-11 20:17:28, 230 words, 8 views, 0 comments

Hi, I'm pretty new to Python and to NLP. I need to implement a perceptron classifier. I searched through some websites but didn't find enough information. So far I have a number of documents which I have grouped by category (sports, entertainment, etc.). I also have a list of the most used words in these documents along with their frequencies. One website stated that I must have some sort of decision function accepting arguments x and w. x is apparently some sort of vector (I don't know what w is). But I don't know how to use the information I have to build the perceptron algorithm, or how to use it to classify my documents. Have you got any ideas? Thanks :)

Comments (5)

眼眸 2024-10-18 20:17:28

What a perceptron looks like

From the outside, a perceptron is a function that takes n arguments (i.e. an n-dimensional vector) and produces m outputs (i.e. an m-dimensional vector).

On the inside, a perceptron consists of layers of neurons, such that each neuron in a layer receives input from all neurons of the previous layer and uses that input to calculate a single output. The first layer consists of n neurons and receives the input. The last layer consists of m neurons and holds the output after the perceptron has finished processing the input.

How the output is calculated from the input

Each connection from a neuron i to a neuron j has a weight w(i,j) (I'll explain later where they come from). The total input of a neuron p of the second layer is the sum of the weighted output of the neurons from the first layer. So

total_input(p) = Σ(output(k) * w(k,p))

where k runs over all neurons of the first layer. The activation of a neuron is calculated from the total input of the neuron by applying an activation function. A commonly used activation function is the Fermi (logistic) function, so

activation(p) = 1/(1+exp(-total_input(p))).

The output of a neuron is calculated from the activation of the neuron by applying an output function. An often used output function is the identity f(x) = x (and indeed some authors see the output function as part of the activation function). I will just assume that

output(p) = activation(p)

Once the outputs of all neurons of the second layer have been calculated, use them to calculate the outputs of the third layer. Iterate until you reach the output layer.
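The layer-by-layer calculation described above could be sketched in Python roughly as follows. This is only an illustration: the function names (fermi, layer_output, forward), the nested-list weight layout and the example weights are all made up here, and there is no bias term because the description above does not mention one.

```python
import math


def fermi(x):
    """Fermi (logistic) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(prev_output, weights):
    """Compute the outputs of one layer from the previous layer's outputs.

    weights[k][p] plays the role of w(k, p): the weight of the connection
    from neuron k of the previous layer to neuron p of this layer.
    """
    n_neurons = len(weights[0])
    outputs = []
    for p in range(n_neurons):
        total_input = sum(
            prev_output[k] * weights[k][p] for k in range(len(prev_output))
        )
        # output(p) = activation(p), i.e. the identity output function
        outputs.append(fermi(total_input))
    return outputs


def forward(x, layers):
    """Propagate the input vector x through a list of weight matrices."""
    output = x
    for weights in layers:
        output = layer_output(output, weights)
    return output


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output (made-up weights).
example_layers = [
    [[0.5, -0.5], [0.25, 0.75]],  # first layer -> second layer
    [[1.0], [-1.0]],              # second layer -> output layer
]
result = forward([1.0, 0.0], example_layers)
print(result)
```

The result is a list with one value strictly between 0 and 1, because the output layer in this made-up network has a single neuron.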

Where the weights come from

At first the weights are chosen randomly. Then you select some examples for which you know the desired output. Feed each example to the perceptron and calculate the error, i.e. how far the actual output is from the desired output. Use that error to update the weights. One of the fastest algorithms for calculating the new weights is Resilient Propagation.
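To make the "calculate the error, then update the weights" loop concrete, here is a minimal sketch for a single threshold neuron. Note that it uses the classic perceptron learning rule rather than Resilient Propagation, simply because it fits in a few lines. The names predict and train, the bias term, the learning rate, the epoch count and the toy OR data are assumptions made for this example.

```python
import random


def predict(x, w, bias):
    """Threshold neuron: returns 1 if the weighted sum exceeds 0, else 0."""
    total = sum(xi * wi for xi, wi in zip(x, w)) + bias
    return 1 if total > 0 else 0


def train(examples, n_inputs, epochs=100, rate=0.1, seed=0):
    """Perceptron learning rule: nudge the weights by the error on each example."""
    rng = random.Random(seed)
    # Start from random weights, as described above.
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(x, w, bias)  # -1, 0 or +1
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
            bias += rate * error
    return w, bias


# Toy, linearly separable data (logical OR), purely for illustration.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, bias = train(examples, n_inputs=2)
print([predict(x, w, bias) for x, _ in examples])
```

This rule only converges when the classes are linearly separable; for multilayer networks a gradient-based method such as the one named above would be needed instead.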

How to construct a Perceptron

Some questions you need to address are

  1. What are the relevant characteristics of the documents and how can they be encoded into an n-dimensional vector?
  2. Which examples should be chosen to adjust the weights?
  3. How shall the output be interpreted to classify a document? Example: A single output that yields the most likely class versus a vector that assigns probabilities to each class.
  4. How many hidden layers are needed and how large should they be? I recommend starting with one hidden layer with n neurons.

The first and second points are critical to the quality of the classifier. The perceptron might classify the training examples correctly but fail on new documents. You will probably have to experiment. To determine the quality of the classifier, choose two sets of examples: one for training, one for validation. Unfortunately I cannot give you more detailed hints for answering these questions due to lack of practical experience.
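The training/validation idea could look something like this in Python. The helper names split_examples and accuracy, the 20% validation fraction and the fixed random seed are assumptions for the sketch; classify stands for whatever function your trained classifier ends up exposing.

```python
import random


def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle the labelled examples and split them into two disjoint sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    training = shuffled[n_validation:]
    validation = shuffled[:n_validation]
    return training, validation


def accuracy(classify, examples):
    """Fraction of (input, label) pairs that classify() gets right."""
    correct = sum(1 for x, label in examples if classify(x) == label)
    return correct / len(examples)


# Illustration with dummy data: ten items split into 8 training, 2 validation.
training, validation = split_examples(list(range(10)))
print(len(training), len(validation))
```

Train only on the training set and report accuracy on the validation set; a large gap between the two suggests the classifier has memorised the examples rather than learned something general.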

不念旧人 2024-10-18 20:17:28

I think that trying to solve an NLP problem with a Neural Network when you're not familiar with either might be a step too far. That you're doing it in a new language is the least of your worries.

I'll link you to the slides for the Neural Computation module taught at my university. You'll want the slides from session 1 and session 2 in week 2. Right at the bottom of the page is a link showing how to implement a neural network in C. With a few modifications you should be able to port it to Python. Note that it details how to implement a multilayer perceptron. You only need to implement a single-layer perceptron, so ignore anything that talks about hidden layers.

A quick explanation of x and w. Both x and w are vectors. x is the input vector. x contains normalised frequencies for each word you are concerned about. w contains weights for each word you are concerned with. The perceptron works by multiplying the input frequency for each word by its respective weight and summing them up. It passes the result to a function (typically a sigmoid function) that turns the result into a value between 0 and 1. 1 means the perceptron is positive that the inputs are an instance of the class it represents and 0 means it is sure that the inputs really aren't an example of its class.
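As a rough illustration of the x and w explanation above, here is a small Python sketch. The vocabulary, the weights and the helper names to_vector and decision are invented for the example; real weights would come from training, not from hand-picking.

```python
import math
from collections import Counter

# Hypothetical list of the words you are concerned about.
VOCABULARY = ["goal", "match", "film", "actor"]


def to_vector(text, vocabulary=VOCABULARY):
    """Build x: the normalised frequency of each vocabulary word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in vocabulary]


def decision(x, w):
    """Multiply each frequency by its weight, sum, and squash with a sigmoid."""
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w))
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# Made-up weights for a "sports" perceptron: sports words pull the score up.
sports_w = [4.0, 4.0, -4.0, -4.0]

x = to_vector("the match ended with a late goal")
score = decision(x, sports_w)
print(x, score)
```

A score above 0.5 leans towards "this is a sports document" and a score below 0.5 leans against it.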

With NLP you typically learn about the bag of words model first, before moving on to other, more complex, models. With a neural network, hopefully, it will learn its own model. The problem with this is that the neural network will not give you much of an understanding of NLP, other than that documents can be classified by the words they contain, and that usually the number and type of words in a document contain most of the information you need to classify it; context and grammar do not add much extra detail.

Anyway, I hope that gives a better place from which to start your project. If you're still stuck on a particular part then ask again and I'll do my best to help.

蛮可爱 2024-10-18 20:17:28

You should take a look at this survey paper on text classification by Fabrizio Sebastiani. It tells you all of the best ways to do text classification.

Now, I'm not going to make you read the whole thing, but there's one table near the end where he compares how lots of different people's techniques stack up on lots of different test corpora. Find it, pick the best one (the best perceptron one, if your assignment is specifically to learn how to do this with a perceptron), and read the paper he cites that describes that method in detail.

You now know how to construct a good topical text classifier.

Turning the algorithm that Oswald gave you (and that you posted in your other question) into code is a Small Matter of Programming (TM). And if you encounter unfamiliar terms like TF-IDF while you're working, ask your teacher to help you by explaining those terms.

孤单情人 2024-10-18 20:17:28

Multilayer perceptrons (a specific neural network architecture for general classification problems) are now available for Python from the GraphLab folks:

https://dato.com/products/create/docs/generated/graphlab.deeplearning.MultiLayerPerceptrons.html#graphlab.deeplearning.MultiLayerPerceptrons

绿萝 2024-10-18 20:17:28

I had a try at implementing something similar the other day. I wrote some code to distinguish English-looking text from non-English text. I hadn't done AI or statistics in many years, so it was a bit of a shotgun attempt.

My code is here (don't want to bloat the post): http://cnippit.com/content/perceptron-statistically-recognizing-english

Inputs:

  • I take a text file and split it up into tri-grams (e.g. "abcdef" => ["abc", "bcd", "cde", "def"])
  • I calculate the relative frequencies of each, and feed that as the inputs to the perceptron (so there are 26^3 inputs)
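The two input steps above might look like this in Python. This is not the linked code, just a sketch: the function names are invented, and dropping everything except the letters a-z before building tri-grams is an assumption made so that the 26^3 figure holds.

```python
from collections import Counter


def trigrams(text):
    """Split text into overlapping three-letter sequences."""
    # Assumption: keep only the letters a-z (lower-cased), drop everything else.
    letters = "".join(c for c in text.lower() if "a" <= c <= "z")
    return [letters[i:i + 3] for i in range(len(letters) - 2)]


def relative_frequencies(text):
    """Map each tri-gram to its share of all tri-grams in the text."""
    grams = trigrams(text)
    total = len(grams)
    if total == 0:
        return {}
    counts = Counter(grams)
    return {gram: n / total for gram, n in counts.items()}


print(trigrams("abcdef"))
print(relative_frequencies("abcabc"))
```

The dictionary only stores tri-grams that actually occur; any of the 26^3 possible inputs that is missing from it is simply 0.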

Despite me not really knowing what I was doing, it seems to work fairly well. The success depends quite heavily on the training data though. I was getting poor results until I trained it on more French/Spanish/German text etc.

It's a very small example though, with lots of "lucky guesses" at values (e.g. initial weights, bias, threshold, etc.).

Multiple classes:
If you have multiple classes you want to distinguish between (i.e. not as simple as "is A or NOT-A"), then one approach is to use a perceptron for each class, e.g. one for sport, one for news, etc.

Train the sport-perceptron on data grouped as either sport or NOT-sport. Similarly for news or NOT-news, etc.

When classifying new data, you pass your input to all perceptrons, and whichever one returns true (or "fires") tells you the class the data belongs to.
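A minimal sketch of the one-perceptron-per-class idea, assuming a simple threshold perceptron trained with the standard perceptron learning rule. The function names, the toy two-feature data and the tie-breaking choice are all assumptions: instead of checking which perceptron "fires", this version picks the class whose perceptron has the highest weighted sum, so that it still gives an answer when several perceptrons fire or none do.

```python
def weighted_sum(x, model):
    """Raw weighted sum of the inputs plus the bias."""
    w, bias = model
    return sum(xi * wi for xi, wi in zip(x, w)) + bias


def train_binary(examples, n_inputs, epochs=50, rate=0.1):
    """Train one threshold perceptron on (input, 0-or-1) examples."""
    w, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for x, target in examples:
            fired = 1 if weighted_sum(x, (w, bias)) > 0 else 0
            error = target - fired
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
            bias += rate * error
    return w, bias


def train_one_per_class(examples, classes, n_inputs):
    """Train a 'class vs. NOT-class' perceptron for every class."""
    models = {}
    for cls in classes:
        relabelled = [(x, 1 if label == cls else 0) for x, label in examples]
        models[cls] = train_binary(relabelled, n_inputs)
    return models


def classify(x, models):
    """Pass x to all perceptrons and return the class with the strongest response."""
    return max(models, key=lambda cls: weighted_sum(x, models[cls]))


# Made-up feature vectors: [share of sport words, share of news words].
examples = [
    ([1.0, 0.0], "sport"), ([0.9, 0.1], "sport"),
    ([0.0, 1.0], "news"), ([0.1, 0.9], "news"),
]
models = train_one_per_class(examples, ["sport", "news"], n_inputs=2)
print(classify([0.8, 0.2], models), classify([0.2, 0.8], models))
```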

I used this approach way back in university, where we used a set of perceptrons for recognizing handwritten characters. It's simple and worked pretty effectively (>98% accuracy if I recall correctly).
