Is arbitrary data analysis possible in Erlang?
I want to answer questions about data in Erlang: count things, correlate messages, provide arbitrary statistics. I had thought about resorting to Hadoop for this, but is it possible to build a solution in raw Erlang that does fairly arbitrary data analysis, not necessarily via map/reduce? I have seen hints of people doing this, but no explicit blog posts or examples. I know that Powerset's natural language capabilities are written in Erlang. I also know about CouchDB, but I am looking for other solutions.
Comments (1)
Yes.
For general-purpose computation and statistics, Erlang works just fine. It isn't heavily optimized for such work, so it will have trouble keeping up with similar numeric code in, say, MATLAB, Fortran, or any of the major C packages for this kind of work, but for most uses it will do just fine. And of course, if your code parallelizes neatly and you have multiple CPUs available, Erlang will catch up more easily.
(You also mentioned the map/reduce pattern; it is relatively trivial given the Erlang/OTP runtime and libraries.)
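As a rough illustration of the map/reduce point, here is a minimal sketch using only plain processes and the standard `maps`, `lists`, and `string` modules. The module name `mr_sketch` and the word-count task are assumptions made up for this example; it is not taken from any particular library, and a production version would want supervision and timeouts.

```erlang
-module(mr_sketch).
-export([count_words/1]).

%% Hypothetical example: count words across several chunks of text.
%% Map phase: one process per chunk. Reduce phase: merge the counts.
count_words(Chunks) ->
    Parent = self(),
    %% Spawn one worker per chunk; each sends its counts back,
    %% tagged with a unique reference.
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() -> Parent ! {Ref, count_chunk(Chunk)} end),
                Ref
            end || Chunk <- Chunks],
    %% Collect one result per reference and fold them into a single map.
    lists:foldl(fun(Ref, Acc) ->
                        receive {Ref, Counts} -> merge(Counts, Acc) end
                end, #{}, Refs).

%% Count the words in a single chunk (split on spaces, newlines, tabs).
count_chunk(Text) ->
    Words = string:lexemes(Text, " \n\t"),
    lists:foldl(fun(W, Acc) ->
                        maps:update_with(W, fun(N) -> N + 1 end, 1, Acc)
                end, #{}, Words).

%% Add the counts in map A into map B.
merge(A, B) ->
    maps:fold(fun(K, V, Acc) ->
                      maps:update_with(K, fun(N) -> N + V end, V, Acc)
              end, B, A).
```

Calling `mr_sketch:count_words(["a b a", "a"])` should yield a map in which `"a"` maps to 3 and `"b"` maps to 1. Because each chunk runs in its own process, the map phase spreads across the available cores without extra work.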
My colleagues and I have written plenty of "raw" Erlang to do counting, statistics, and so on. We have found it to be more than sufficient for most tasks.
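To give a feel for what such "raw" Erlang can look like, here is a small sketch of counting and basic statistics over lists. The module name `stats_sketch` and its function names are hypothetical, chosen only for this example; it assumes the data fits in memory as a plain list.

```erlang
-module(stats_sketch).
-export([frequencies/1, mean/1, variance/1]).

%% Count how often each term occurs in a list.
frequencies(Items) ->
    lists:foldl(fun(X, Acc) ->
                        maps:update_with(X, fun(N) -> N + 1 end, 1, Acc)
                end, #{}, Items).

%% Arithmetic mean of a non-empty list of numbers.
mean([_ | _] = Xs) ->
    lists:sum(Xs) / length(Xs).

%% Population variance of a non-empty list of numbers.
variance([_ | _] = Xs) ->
    M = mean(Xs),
    lists:sum([(X - M) * (X - M) || X <- Xs]) / length(Xs).
```

For example, `stats_sketch:frequencies([a, b, a])` returns `#{a => 2, b => 1}`, and `stats_sketch:mean([1, 2, 3, 4])` returns `2.5`.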