Understanding multilayer perceptron networks


I'm trying to understand how to train a multilayer perceptron; however, I'm having some trouble figuring out how to determine a suitable network architecture--i.e., the number of nodes/neurons in each layer of the network.

For a specific task, I have four input sources that can each take one of three states. I guess that would mean four input neurons firing either 0, 1, or 2, but from what I've been told, input should be kept binary?

Furthermore, I'm having some issues choosing the number of neurons in the hidden layer. Any comments would be great.

Thanks.

2 Answers

烟火散人牵绊 2024-09-06 08:10:22

Determining an acceptable network structure for a multi-layer perceptron is actually straightforward.

  1. Input Layer: How many features/dimensions are in your data--i.e., how many columns are in each data row? Add one to this (for the bias node), and that is the number of nodes in your first layer (the input layer).

  2. Output Layer: Is your MLP running in 'classification' mode or 'regression' mode ('regression' used here in the machine-learning rather than the statistical sense)--i.e., does the MLP return a class label or a predicted value? If the latter, then your output layer has a single node. If the former, then your output layer has the same number of nodes as there are class labels. For instance, if the result you want is to label each instance as either "fraud" or "not fraud", that's two class labels, and therefore two nodes in your output layer.

  3. Hidden Layer(s): In between these two (input and output) are, obviously, the hidden layers. Always start with a single hidden layer. So how many nodes? Here's a rule of thumb: set the (initial) size of the hidden layer to a number of nodes just slightly greater than the number of nodes in the input layer. Compared with having fewer nodes than the input layer, this excess capacity will help your numerical optimization routine (e.g., gradient descent) converge.

In sum, begin with three layers for your network architecture; the sizes of the first (input) and last (output) are fixed by your data and by your model design, respectively. A hidden layer just slightly larger than the input layer is nearly always a good design to begin with.

So in your case, a suitable network structure to begin with would be:

input layer: 5 nodes --> hidden layer: 7 nodes --> output layer: 3 nodes
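
To make the sizing concrete, here is a minimal sketch (my illustration, not part of the answer) of a single forward pass through that 5 --> 7 --> 3 network in NumPy; the sigmoid activations, the random weight initialization, and the explicit always-on bias input are all assumptions of the sketch:

    import numpy as np

    # Assumed illustration of the 5 -> 7 -> 3 architecture suggested above.
    # The 5th input is the always-on bias unit from step 1.
    rng = np.random.default_rng(0)

    n_in, n_hidden, n_out = 5, 7, 3                     # input (incl. bias), hidden, output
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))   # input -> hidden weights
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))  # hidden -> output weights

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x):
        # x: length-5 vector whose last element is the bias input (always 1)
        h = sigmoid(x @ W1)     # hidden-layer activations (7 values)
        return sigmoid(h @ W2)  # output-layer activations, one per class (3 values)

    x = np.array([2.0, 0.0, 1.0, 1.0, 1.0])  # the four raw inputs plus the bias term
    print(forward(x))                        # three outputs, one per class label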

唔猫 2024-09-06 08:10:22

I disagree with doug's answer above on a few points.

You have 4 discrete (3-way categorical) inputs. You should (unless you have a strong reason not to) represent them as 12 binary inputs, using a 1-of-3 encoding for each of your four conceptual inputs.
So if your input is [2,0,1,1], then your network should be given:
0 0 1 1 0 0 0 1 0 0 1 0
If your network implementation requires a manual bias, then you should add another always-on bit for the bias, but most sensible neural-net implementations don't require that.
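
As a small illustration (mine, not part of the answer), the 1-of-3 encoding above takes only a few lines of Python; the function name is hypothetical:

    # 1-of-3 (one-hot) encoding: each 3-state input becomes three binary inputs.
    def one_of_three(inputs):
        encoded = []
        for value in inputs:  # each value is 0, 1, or 2
            bits = [0, 0, 0]
            bits[value] = 1   # switch on the bit for this state
            encoded.extend(bits)
        return encoded

    print(one_of_three([2, 0, 1, 1]))
    # -> [0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0], matching the row above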

Try a few different numbers of hidden units. You don't need to restrict yourself to a hidden-layer size smaller than the input-layer size, but if you make it larger you should be careful to regularize your weights, perhaps with L2 or L1 weight decay, and maybe even also use early stopping during training (stop training when the error on a held-out validation set stops improving).
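
As one way to act on that advice (an assumed setup, since the answer names no library), scikit-learn's MLPClassifier exposes L2 weight decay through its alpha parameter and built-in early stopping against a held-out validation fraction; the toy data here is random and purely illustrative:

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 12))  # toy rows of 12 binary inputs, as encoded above
    y = rng.integers(0, 3, size=200)        # toy 3-class labels

    clf = MLPClassifier(
        hidden_layer_sizes=(20,),  # larger than the 12 inputs is fine...
        alpha=1e-3,                # ...given L2 weight decay (alpha)
        early_stopping=True,       # hold out part of the training data and
        validation_fraction=0.1,   # stop when its score stops improving
        random_state=0,
    )
    clf.fit(X, y)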
