实现感知器分类器

发布于 2024-10-11 20:17:28 字数 230 浏览 16 评论 0原文

大家好，我对 Python 和 NLP 还很陌生。我需要实现一个感知器分类器。我搜索了一些网站，但没有找到足够的信息。现在我有很多文件，我根据类别（体育、娱乐等）进行分组。我还列出了这些文档中最常用的单词及其出现频率。在一个特定的网站上，有人说我必须有某种接受参数 x 和 w 的决策函数。 x 显然是某种向量（我不知道 w 是什么）。但我不知道如何使用我所拥有的信息来构建感知器算法以及如何使用它来对我的文档进行分类。你有什么想法吗？谢谢：）

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

眼眸 2024-10-18 20:17:28

感知器的外观

从外部看，感知器是一个函数，它接受 n 个参数（即 n 维向量）并生成 m 个值输出（即 m 维向量）。

在内部，感知器由神经元层组成，层中的每个神经元接收来自前一层所有神经元的输入，并使用该输入来计算单个输出。第一层由 n 个神经元组成，它接收输入。最后一层由 m 个神经元组成，在感知器处理完输入后保存输出。

如何根据输入计算输出

从神经元 i 到神经元 j 的每个连接都有一个权重 w(i,j ) （稍后我会解释它们来自哪里）。第二层神经元p的总输入是第一层神经元加权输出的总和。因此

total_input(p) = Σ(output(k) * w(k,p))

，k 运行在第一层的所有神经元上。神经元的激活是通过应用激活函数根据神经元的总输入来计算的。经常使用的激活函数是费米函数，因此

activation(p) = 1/(1-exp(-total_input(p))).

神经元的输出是通过应用输出函数根据神经元的激活来计算的。经常使用的输出函数是恒等式 f(x) = x（事实上，一些作者将输出函数视为激活函数的一部分）。我只是假设

output(p) = activation(p)

当计算第二层所有神经元的输出时，使用该输出来计算第三层的输出。迭代直到到达输出层。

权重从何而来

首先，权重是随机选择的。然后您选择一些示例（从中您知道所需的输出）。将每个示例输入感知器并计算误差，即实际输出与期望输出的差距有多大。使用该错误来更新权重。计算新权重最快的算法之一是弹性传播。

如何构建感知器

您需要解决的一些问题是

文档的相关特征是什么以及如何将它们编码为 n 维向量？
应该选择哪些例子来调整权重？
如何解释输出以对文档进行分类？示例：产生最可能的类别的单个输出与为每个类别分配概率的向量。
需要多少个隐藏层以及它们应该有多大？我建议从一个具有 n 个神经元的隐藏层开始。

第一点和第二点对于分类器的质量非常关键。感知器可能会正确地对示例进行分类，但对新文档进行分类时会失败。您可能需要进行实验。为了确定分类器的质量，选择两组示例；一种用于训练，一种用于验证。不幸的是，由于缺乏实践经验，我无法给你更详细的提示来回答这些问题。

How a perceptron looks like

From the outside, a perceptron is a function that takes n arguments (i.e an n-dimensional vector) and produces m outputs (i.e. an m-dimensional vector).

On the inside, a perceptron consists of layers of neurons, such that each neuron in a layer receives input from all neurons of the previous layer and uses that input to calculate a single output. The first layer consists of n neurons and it receives the input. The last layer consist of m neurons and holds the output after the perceptron has finished processing the input.

How the output is calculated from the input

Each connection from a neuron i to a neuron j has a weight w(i,j) (I'll explain later where they come from). The total input of a neuron p of the second layer is the sum of the weighted output of the neurons from the first layer. So

total_input(p) = Σ(output(k) * w(k,p))

where k runs over all neurons of the first layer. The activation of a neuron is calculated from the total input of the neuron by applying an activation function. An often used activation function is the Fermi function, so

activation(p) = 1/(1-exp(-total_input(p))).

The output of a neuron is calculated from the activation of the neuron by applying an output function. An often used output function is the identity f(x) = x (and indeed some authors see the output function as part of the activation function). I will just assume that

output(p) = activation(p)

When the output off all neurons of the second layer is calculated, use that output to calculate the output of the third layer. Iterate until you reach the output layer.

Where the weights come from

At first the weights are chosen randomly. Then you select some examples (from which you know the desired output). Feed each example to the perceptron and calculate the error, i.e. how far off from the desired output is the actual output. Use that error to update the weights. One of the fastest algorithms for calculating the new weights is Resilient Propagation.

How to construct a Perceptron

Some questions you need to address are

What are the relevant characteristics of the documents and how can they be encoded into an n-dimansional vector?
Which examples should be chosen to adjust the weights?
How shall the output be interpreted to classify a document? Example: A single output that yields the most likely class versus a vector that assigns probabilities to each class.
How many hidden layers are needed and how large should they be? I recommend starting with one hidden layer with n neurons.

The first and second points are very critical to the quality of the classifier. The perceptron might classify the examples correctly but fail on new documents. You will probably have to experiment. To determine the quality of the classifier, choose two sets of examples; one for training, one for validation. Unfortunately I cannot give you more detailed hints to answering these questions due to lack of practical experience.

回复收藏 0 原文

不念旧人 2024-10-18 20:17:28

我认为，当您对神经网络都不熟悉时，试图用神经网络解决 NLP 问题可能有点太过分了。您最不担心的是用一种新语言来做这件事。

我会将您链接到我教授的神经计算模块幻灯片在我的大学。您将需要第 2 周第 1 节和第 2 节的幻灯片。页面底部是如何用 C 语言实现神经网络的链接。经过一些修改应该能够将其移植到 python。您应该注意，它详细介绍了如何实现多层感知器。您只需要实现一个单层感知器，因此忽略任何涉及隐藏层的内容。

x 和 w 的快速解释。 x 和 w 都是向量。 x 是输入向量。 x 包含您关注的每个单词的标准化频率。 w 包含您关注的每个单词的权重。感知器的工作原理是将每个单词的输入频率与其各自的权重相乘，然后将它们相加。它将结果传递给一个函数（通常是 sigmoid 函数），该函数将结果转换为 0 到 1 之间的值。1 表示感知器肯定输入是它所代表的类的实例，0 表示确定输入确实不是同类的一个例子。

通过 NLP，您通常首先了解词袋模型，然后再继续学习其他更复杂的模型。希望通过神经网络，它能够学习自己的模型。这样做的问题是，神经网络不会让你对 NLP 有太多的理解，除了可以根据文档中包含的单词对文档进行分类，而且通常文档中单词的数量和类型包含了你所需要的大部分信息。需要对文档进行分类——上下文和语法不会添加太多额外的细节。

不管怎样，我希望这能为你的项目提供一个更好的起点。如果您仍然卡在某个特定部分，请再次询问，我会尽力提供帮助。