Empty elements in JAX neural network parameters

Posted 2025-01-21 23:59:35


I am working on the implementation of a very small neural network. My network is as follows:

import jax
from jax.example_libraries import stax  # jax.experimental.stax in older JAX releases
from jax.example_libraries.stax import Dense, Relu, LogSoftmax

init_random_params, predict = stax.serial(
    Dense(1024), Relu,
    Dense(1024), Relu,
    Dense(10), LogSoftmax)

I initialized the neural network as follows:

rng = jax.random.PRNGKey(0)
_, init_params = init_random_params(rng, (-1, 28 * 28))
params = init_params

When I print the parameters of the network, I get a tuple of size 6. params[0], params[2], and params[4] are non-empty and consist of the weights and biases of the corresponding layers. However, the remaining elements, params[1], params[3], and params[5], are simply empty.

I would like to understand the reason behind this, because it makes it difficult for me to implement some training algorithms I am interested in.
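To see the structure concretely, you can walk the parameter tuple layer by layer. A minimal sketch, assuming a recent JAX where stax lives under jax.example_libraries:

```python
import jax
from jax.example_libraries import stax
from jax.example_libraries.stax import Dense, Relu, LogSoftmax

init_random_params, predict = stax.serial(
    Dense(1024), Relu,
    Dense(1024), Relu,
    Dense(10), LogSoftmax)

rng = jax.random.PRNGKey(0)
_, params = init_random_params(rng, (-1, 28 * 28))

# serial() returns one params entry per layer, in order:
# Dense, Relu, Dense, Relu, Dense, LogSoftmax
for i, layer_params in enumerate(params):
    # Dense layers hold a (weights, bias) pair; Relu/LogSoftmax hold ()
    print(i, [p.shape for p in layer_params])
```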


Comments (1)

能否归途做我良人 2025-01-28 23:59:35


Is there any particular reason you're using stax? Quoting from the stax module docstring:

You likely do not mean to import this module! Stax is intended as an example library only. There are a number of other much more fully-featured neural network libraries for JAX, including Flax from Google, and Haiku from DeepMind.

If you're having trouble creating neural networks with stax, you might try using an actively supported neural network library instead.

That said, the reason those entries are empty is that Relu and LogSoftmax take no parameters: stax.serial returns one params entry per layer, and parameterless layers contribute an empty tuple.
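In practice the empty entries are harmless: JAX's pytree utilities only visit leaves, so an update written with tree_map skips them automatically. A minimal SGD sketch under that assumption (the dummy data, loss, and step size are illustrative, not from the question):

```python
import jax
import jax.numpy as jnp
from jax.example_libraries import stax
from jax.example_libraries.stax import Dense, Relu, LogSoftmax

init_random_params, predict = stax.serial(
    Dense(1024), Relu,
    Dense(1024), Relu,
    Dense(10), LogSoftmax)

rng = jax.random.PRNGKey(0)
_, params = init_random_params(rng, (-1, 28 * 28))

def loss(params, x, y):
    # predict returns log-probabilities (LogSoftmax), so this is the NLL
    return -jnp.mean(jnp.sum(predict(params, x) * y, axis=1))

# Dummy batch: 8 flattened 28x28 inputs, all labeled class 0
x = jnp.ones((8, 28 * 28))
y = jax.nn.one_hot(jnp.zeros(8, dtype=jnp.int32), 10)

grads = jax.grad(loss)(params, x, y)
# tree_map pairs leaves of params and grads; the () entries for
# Relu/LogSoftmax contain no leaves and are passed through untouched
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
```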
