Learning optimal parameters to maximise reward

Posted 2024-10-30 21:20:41

I have a set of examples, which are each annotated with feature data. The examples and features describe the settings of an experiment in an arbitrary domain (e.g. number-of-switches, number-of-days-performed, number-of-participants, etc.). Certain features are fixed (i.e. static), while others I can manually set (i.e. variable) in future experiments. Each example also has a "reward" feature, which is a continuous number bounded between 0 and 1, indicating the success of the experiment as determined by an expert.

Based on this example set, and given a set of static features for a future experiment, how would I determine the optimal value to use for a specific variable so as to maximise the reward?

Also, does this process have a formal name? I've done some research, and this sounds similar to regression analysis, but I'm still not sure if it's the same thing.

Comments (1)

☆獨立☆ 2024-11-06 21:20:41

The process is called "design of experiments." There are various techniques that can be used depending on the number of parameters, and whether you are able to do computations between trials or if you have to pick all your treatments in advance.

  • full factorial - try each combination, the brute force method
  • fractional factorial - eliminate some of the combinations in a pattern and use regression to fill in the missing data
  • Plackett-Burman, response surface - more sophisticated designs that trade extra statistical work for fewer experimental runs
  • ...and many others. This is an active area of statistical research.
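As a rough illustration of the first two bullets, here is a minimal Python sketch that enumerates a two-level full factorial design and a half fraction of it. The factor names are borrowed from the question; treating each as a two-level factor (coded -1 for low, +1 for high) and using the generator "last factor = product of the others" are assumptions made for the example, not something the question specifies.

```python
from itertools import product

# Assumption: three factors, each run at a low (-1) or high (+1) level.
factors = ["switches", "days", "participants"]

def full_factorial(n_factors):
    """Every combination of low/high levels: 2**n_factors runs."""
    return [list(run) for run in product((-1, 1), repeat=n_factors)]

def half_fraction(n_factors):
    """2**(n_factors - 1) runs: take a full factorial on the first
    n_factors - 1 factors and set the last factor to the product of
    the others (for three factors, the generator C = AB)."""
    runs = []
    for base in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(list(base) + [last])
    return runs

full = full_factorial(len(factors))
half = half_fraction(len(factors))

print(len(full), "runs in the full design")
print(len(half), "runs in the half fraction")
for run in half:
    print(dict(zip(factors, run)))
```

The half fraction keeps every factor balanced (each level appears equally often) while halving the number of experiments; the price is that some effects can no longer be separated from each other, which is where the regression step comes in.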

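Once you've built a regression model from the data in your experiments, you can find an optimum by applying the usual numerical optimization techniques: hold the static features at their known values and search over the variable ones for the highest predicted reward.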
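To make that last step concrete, here is a small self-contained sketch in plain Python. Everything in it is assumed for illustration: one static feature `s`, one settable variable `x`, a full quadratic model, made-up noise-free data, and a simple grid search as the optimizer. With real data you would more likely use a statistics library and check the fit before trusting the optimum.

```python
def features(s, x):
    """Full quadratic in one static feature s and one settable variable x."""
    return [1.0, s, s * s, x, x * x, s * x]

def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w

def fit(rows, rewards):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, rewards)) for i in range(k)]
    return solve(xtx, xty)

def predict(w, s, x):
    """Predicted reward for static feature s and variable setting x."""
    return sum(wi * fi for wi, fi in zip(w, features(s, x)))

def best_setting(w, s, lo, hi, steps=500):
    """Grid search over x in [lo, hi] with the static feature held at s."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: predict(w, s, x))

# Made-up, noise-free "past experiments": (static s, variable x, reward).
data = [(s, x, 0.9 - 0.02 * (x - (2 + 0.5 * s)) ** 2)
        for s in range(4) for x in range(6)]
rows = [features(s, x) for s, x, _ in data]
rewards = [y for _, _, y in data]

w = fit(rows, rewards)
x_star = best_setting(w, s=2, lo=0.0, hi=5.0)
print(f"suggested x for s=2: {x_star:.2f}, "
      f"predicted reward: {predict(w, 2, x_star):.3f}")
```

The search is restricted to the range of `x` values seen in the data on purpose: a regression model says little about settings far outside what was actually tried.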
