C++ 中遗传算法的最佳数据结构?

发布于 2024-07-26 22:58:51 字数 390 浏览 5 评论 0原文

我需要实现一个针对我的问题(大学项目)定制的遗传算法,第一个版本将其编码为短矩阵(每条染色体的位数x人口大小)。

这是一个糟糕的设计,因为我声明了一个短路,但仅使用“0”和“1”值......但这只是一个原型,并且按预期工作,现在是时候开发一个新的了, 改良版。 性能在这里很重要,但简单性也很受赞赏。

我研究了一下并想出了:

对于染色体: - 字符串类(如“0100100010”) - 布尔数组 - 向量(向量似乎针对布尔进行了优化) - Bitset(听起来是最自然的)

和人群: - C 数组[] - 矢量 - 队列

我倾向于选择染色体向量和流行数组,但我想听听任何对此主题有经验的人的意见。

提前致谢!

i need to implement a genetic algorithm customized for my problem (college project), and the first version had it coded as an matrix of short ( bits per chromosome x size of population).

That was a bad design, since i am declaring a short but only using the "0" and "1" values... but it was just a prototype and it worked as intended, and now it is time for me to develop a new, improved version. Performance is important here, but simplicity is also appreciated.

I researched around and came up with:

for the chromosome :
- String class (like "0100100010")
- Array of bool
- Vector (vectors appears to be optimized for bool)
- Bitset (sounds the most natural one)

and for the population:
- C Array[]
- Vector
- Queue

I am inclined to pick vector for chromossome and array for pop, but i would like the opinion of anyone with experience on the subject.

Thanks in advance!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

睫毛溺水了 2024-08-02 22:58:51

我猜你想要随机访问群体和基因。 你说性能很重要,我把它理解为执行速度。 因此,您可能最好对染色体使用 vector<>,对基因使用 vector。 使用 vector 的原因是 bitset<>vector 针对内存消耗进行了优化,因此速度较慢。 vector 将以 x8 内存为代价提供更高的速度(假设系统上的 char = byte)。 因此,如果您想要速度,请使用 vector。 如果内存消耗非常重要,则使用 vectorbitsetbitset<> 在这里似乎是一个自然的选择,但是,请记住,它是根据位数进行模板化的,这意味着 a) 基因的数量必须是固定的并且在编译时已知时间(我猜这是一个很大的禁忌),b)如果您使用不同的大小,您最终会得到每个 bitset 的每个 bitset 大小的副本> 您使用的方法(尽管内联可能会否定这一点),即代码膨胀。 总的来说,如果您不需要 vector,我猜 vector 更适合您。

如果您担心 vector 的美观性,您可以 typedef chargene; 然后使用 vector,它看起来更自然。

string 就像 vector 但更麻烦。

I'm guessing you want random access to the population and to the genes. You say performance is important, which I interpret as execution speed. So you're probably best off using a vector<> for the chromosomes and a vector<char> for the genes. The reason for vector<char> is that bitset<> and vector<bool> are optimized for memory consumption, and are therefore slow. vector<char> will give you higher speed at the cost of x8 memory (assuming char = byte on your system). So if you want speed, go with vector<char>. If memory consumption is paramount, then use vector<bool> or bitset<>. bitset<> would seem like a natural choice here, however, bear in mind that it is templated on the number of bits, which means that a) the number of genes must be fixed and known at compile time (which I would guess is a big no-no), and b) if you use different sizes, you end up with one copy per bitset size of each of the bitset methods you use (though inlining might negate this), i.e., code bloat. Overall, I would guess vector<bool> is better for you if you don't want vector<char>.

If you're concerned about the aesthetics of vector<char> you could typedef char gene; and then use vector<gene>, which looks more natural.

A string is just like a vector<char> but more cumbersome.

我最亲爱的 2024-08-02 22:58:51

专门来回答你的问题。 我不太确定你的建议是什么。 你谈论数组和字符串类。 你是在谈论 STL 容器类吗,你可以在其中有一个队列、位集、向量、链表等。我建议你为你的群体提供一个向量(最接近 C 数组的东西),如果你是的话,为你的染色体建议一个位集担心内存容量。 否则,因为您已经在使用 DNA 字符串表示的向量。 (“10110110”)

对于想法和涉足的好工具。 推荐您下载并初步使用该库。 它适用于主要编译器。 适用于 Unix 变体。 有全部源代码。

所有框架的东西都为你完成了,你会学到很多东西。 稍后您可以从头开始编写自己的代码或从这些类继承。 如果需要,您也可以在商业代码中使用它们。

因为它们是对象,所以你可以轻松地改变 DNA 的表示,从整数到实数,到结构,到树,再到位数组等等。

总是需要学习治疗,但这是值得的。

我用它生成数千个神经网络,然后用一个简单的适应函数将它们淘汰,然后真正运行它们。

加利布

http://lancet.mit.edu/ga/

Specifically to answer your question. I am not exactly sure what you are suggestion. You talk about Array and string class. Are you talking about the STL container classes where you can have a queue, bitset, vector, linked list etc. I would suggest a vector for you population (closest thing to a C array there is) and a bitset for you chromosome if you are worried about memory capacity. Else as you are already using a vector of your string representaion of your dna. ("10110110")

For ideas and a good tool to dabble. Recommend you download and initially use this library. It works with the major compilers. Works on unix variants. Has all the source code.

All the framework stuff is done for you and you will learn a lot. Later on you could write your own code from scratch or inherit from these classes. You can also use them in commercial code if you want.

Because they are objects you can change representaion of your DNA easily from integers to reals to structures to trees to bit arrays etc etc.

There is always learning cure involved but it is worth it.

I use it to generate thousands of neural nets then weed them out with a simple fitness function then run them for real.

galib

http://lancet.mit.edu/ga/

赴月观长安 2024-08-02 22:58:51

假设您想自己编写代码(如果您想要一个外部库 kingchris 似乎有一个很好的库),这实际上取决于您需要执行哪种操作。 为了在内存方面获得最大的收益,您可以使用任何整数类型并通过位掩码等设置/操作各个位。现在这种方法在易用性方面可能不是最佳的...上面的字符串示例可以工作好的,但是它与 Shorts 没有显着不同,这里您现在只是用 8 位值而不是 16 位值来表示“0”或“1”。 此外,再次取决于操作,字符串大小写可能会变得笨拙。 因此,如果您可以提供有关算法的更多信息,我们也许可以提供更多反馈。 我自己喜欢将各个位作为整数(位集)的一部分,但如果您不习惯掩码、移位和所有这些好东西,它可能不适合您。

Assuming that you want to code this yourself (if you want an external library kingchris seems to have a good one there) it really depends on what kind of manipulation you need to do. To get the most bang for your buck in terms of memory, you could use any integer type and set/manipulate individual bits via bitmasks etc. Now this approach likely not optimal in terms of ease of use... The string example above would work ok, however again its not significantly different than the shorts, here you are now just representing either '0' or '1' with an 8 bit value as opposed to 16 bit value. Also, again depending on the manipulation, the string case will probably prove unwieldly. So if you could give some more info on the algorithm we could maybe give more feedback. Myself I like the individual bits as part of an integer (a bitset), but if you aren't used to masks, shifts, and all that good stuff it may not be right for you.

面犯桃花 2024-08-02 22:58:51

我建议为每个群体成员编写一个类,这会大大简化事情,因为您可以将所有与成员相关的函数保留在同一个位置,并用实际数据很好地包装起来。

如果您需要一个“布尔数组”,我建议使用一个 int 或多个 int(然后使用掩码和按位操作来访问(修改/翻转)每个位),具体取决于染色体的数量。

我通常对总体使用某种集合类,因为只有一组总体成员不允许您简单地添加到您的总体中。 我建议实现某种动态列表(如果您熟悉 ArrayList 那么这是一个很好的例子)。

我按照上面的方法在遗传算法方面取得了重大成功。 如果你正确地准备你的成员类,它确实可以简化事情,并允许你专注于编码更好的遗传算法,而不是担心你的数据结构。

I suggest writing a class for each member of population, that simplifies things considerably, since you can keep all your member relevant functions in the same place nicely wrapped with the actual data.

If you need a "array of bools" I suggest using an int or several ints (then use mask and bit wise operations to access (modify / flip) each bit) depending on number of your chromosomes.

I usually used some sort of collection class for the population, because just an array of population members doesn't allow you to simply add to your population. I would suggest implementing some sort of dynamic list (if you are familiar with ArrayList then that is a good example).

I had major success with genetic algorithms with the recipe above. If you prepare your member class properly it can really simplify things and allows you to focus on coding better genetic algorithms instead of worrying about your data structures.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文