使用遗传编程的蚁群行为
我正在研究能够利用基因编程实现食物觅食行为的进化蚂蚁,正如 Koza 此处。每个时间步长,我都会循环遍历每只蚂蚁,执行其计算机程序(蚁群中的所有蚂蚁都使用相同的程序)。目前,我已经定义了简单的指令,例如 MOVE-ONE-STEP
、TURN-LEFT
、TURN-RIGHT
等。但我也有一个按顺序执行参数的函数PROGN
。我遇到的问题是,由于 PROGN
可以按顺序执行指令,这意味着蚂蚁可以在单个时间步内执行多个操作。与自然不同,我无法并行运行蚂蚁,这意味着一只蚂蚁可能会执行多个动作,操纵环境,而所有其他蚂蚁都在等待轮到他们。
我只是想知道,这是通常的做法,还是有更好的方法? Koza似乎没有提及此事。问题是,我想扩展场景以拥有其他代理(例如敌人),这可能依赖于在单个时间步骤中仅发生一次的事情。
I'm looking at evolving ants capable of food foraging behaviour using genetic programming, as described by Koza here. Each time step, I loop through each ant, executing its computer program (the same program is used by all ants in the colony). Currently, I have defined simple instructions like MOVE-ONE-STEP
, TURN-LEFT
, TURN-RIGHT
, etc. But I also have a function PROGN
that executes arguments in sequence. The problem I am having is that because PROGN
can execute instructions in sequence, it means an ant can do multiple actions in a single time step. Unlike nature, I cannot run the ants in parallel, meaning one ant might go and perform several actions, manipulating the environment whilst all of the other ants are waiting to have their turn.
I'm just wondering, is this how it is normally done, or is there a better way? Koza does not seem to mention anything about it. Thing is, I want to expand the scenario to have other agents (e.g. enemies), which might rely on things occurring only once in a single time step.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我不熟悉 Koza 的工作,但我认为合理的方法是为每只蚂蚁提供自己的指令队列,该队列在时间步长内持续存在。通过这样做,您可以让蚂蚁在每个时间步执行一条指令的
PROGN
函数。例如,蚂蚁时间步长的高级逻辑可以是:I am not familiar with Koza's work, but I think a reasonable approach is to give each ant its own instruction queue that persists across time steps. By doing this, you can get the ants to execute
PROGN
functions one instruction per time step. For instance, the high-level logic for the time step of an ant can be:另一种类似的指令排队方法是预处理指令集并将 PROGN 实例扩展为组件指令集。如果您允许 PROGN 调用其他 PROGN,则必须递归地完成此操作。这样做的缺点是候选程序有点臃肿,但这只是在运行时。另一方面,它简单、快速并且非常容易调试。
例子:
假设 PROGN1 = {inst-p1 inst-p2}
那么候选程序将以 {inst1 PROGN1 inst2} 开始,并在准备好在模拟中进行评估时扩展到 {inst1 inst-p1 inst-p2 inst2}。
Another similar approach to queuing instructions would be to preprocess the set of instructions an expand instances of PROGN to the set of component instructions. This would have to be done recursively if you allow PROGNs to invoke other PROGNs. The downside to this is that the candidate programs get a bit bloated, but this is only at runtime. On the other hand, it is easy, quick, and pretty easy to debug.
Example:
Say PROGN1 = {inst-p1 inst-p2}
Then the candidate program would start off as {inst1 PROGN1 inst2} and would be expanded to {inst1 inst-p1 inst-p2 inst2} when it was ready to be evaluated in simulation.
这完全取决于您特定的 GP 实施。
在我的 GP 内核程序中,作为一个整体,要么重复评估,要么并行评估,即这种情况下的“原子”操作是单个程序评估。
因此,在评估下一个程序之前,群体中的所有个体都会按顺序重复 n 次,或者所有个体只执行一次,然后再执行 n 次。
我使用这种并发级别的虚拟代理取得了相当不错的结果。
绝对可以将其进一步分解,但是此时您将降低算法的可扩展性:
虽然很容易将程序的评估分布在多个 CPU 或内核之间,但做同样的事情几乎毫无价值仅由于所有程序之间所需的同步量而进行每个节点评估。
鉴于现代系统(甚至智能手机)中 CPU/内核数量的快速增加以及 GP 的“CPU 饥饿”,您可能需要重新考虑您的方法 - 您真的想在程序中包含移动/转动指令吗?
为什么不重新设计它以使用在程序评估期间在某些寄存器/变量中存储方向和速度参数的原语?
然后,模拟步骤根据程序存储的指令,采用这些参数来实际移动/转动您的代理。
干杯,
杰伊
It all depends on your particular GP implementation.
In my GP kernel programs are either evaluated repeatedly or in parallel - as a whole, i.e. the 'atomic' operation in this scenario is a single program evaluation.
So all individuals in the population are repeated n times sequentially before evaluating the next program or all individuals are executed just once, then again for n times.
I've had pretty nice results with virtual agents using this level of concurrency.
It is definitely possible to break it down even more, however at that point you'll reduce the scalability of your algorithm:
While it is easy to distribute the evaluation of programs amongst several CPUs or cores it'll be next to worthless doing the same with per-node evaluation just due to the amount of synchronization required between all programs.
Given the rapidly increasing number of CPUs/cores in modern systems (even smartphones) and the 'CPU-hunger' of GP you might want to rethink your approach - do you really want to include move/turn instructions in your programs?
Why not redesign it to use primitives that store away direction and speed parameters in some registers/variables during program evaluation?
The simulation step then takes these parameters to actually move/turn your agents based on the instructions stored away by the programs.
Cheers,
Jay