Clojure: efficiently "transpose" a medium-sized list of maps
I need a very fast and efficient way to "transpose" a list of maps in Clojure.
Let's say I have:
(def monthly-sales [{:month 1 :pc "A" :sales 100}
{:month 2 :pc "B" :sales 200} ... {:month 12 :pc "Z" :sales 100}])
I need to have something like:
|PC|1|2|3|4|5|6|7|8|9|10|11|12|
|A|100
||
|Etc.|
I answer the question below:
Basically I grouped all the values by month (using group-by; note that it can be keyed on more than one key thanks to "apply juxt"), which gives the key for the column. Having done that, I extract the unique values of pc, which become the keys for the rows. The rest should be self-explanatory.
Do you think this is clear, idiomatic Clojure design? Can it be made more efficient and clearer?
Useful Links:
http://pramode.net/clojure/2010/06/01/lazy-sequences-in-clojure/
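The grouping described above could be sketched roughly as follows. This is only an illustration and not the poster's original code: the function name `pivot-by-month`, the extra sample row, and the choice to key the grouping on both `:pc` and `:month` (rather than on `:month` alone) are assumptions made here.

```clojure
;; Hypothetical sample data in the shape shown in the question.
(def monthly-sales
  [{:month 1 :pc "A" :sales 100}
   {:month 2 :pc "B" :sales 200}
   {:month 2 :pc "A" :sales 50}
   {:month 12 :pc "Z" :sales 100}])

(defn pivot-by-month
  "Returns one vector per pc: [pc sales-for-month-1 ... sales-for-month-12].
  Months with no sales for that pc are nil."
  [rows]
  (let [;; juxt builds a composite [pc month] key, so group-by can key on two fields
        by-cell (group-by (apply juxt [:pc :month]) rows)
        ;; the unique pc values become the row keys
        pcs     (sort (distinct (map :pc rows)))]
    (for [pc pcs]
      (into [pc]
            (for [month (range 1 13)]
              ;; sum the sales if several rows share the same pc and month
              (when-let [cell (get by-cell [pc month])]
                (reduce + (map :sales cell))))))))
```

Calling `(pivot-by-month monthly-sales)` yields a lazy seq of row vectors, one per distinct `:pc`, each holding the pc followed by twelve month slots.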
Idiomatic Clojure libraries (like clojure.java.jdbc) will provide these long lists as lazy seqs. That means you just need enough memory to contain a single row plus the usual overhead for loading Clojure and the libraries - provided you get the data from a file or database and write it out to a stream/db/whatever rather than keeping it all in memory.
As for the transform you're asking for, given a seq of rows (maps) called result-set, something like:
Will give you a lazy seq that you can just dump to a file to produce something like the | separated data you want.
Addendum: as for "fast" - unless your storage setup is unusual, this is likely to be significantly faster than your storage I/O - and it's straightforward.
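As an illustration of the lazy approach described in this answer, here is a minimal sketch. It is not the answer's original snippet; the helper names `row->line` and `rows->lines`, the explicit column order, and the sample `result-set` are assumptions introduced here.

```clojure
(require '[clojure.string :as str])

(defn row->line
  "Formats one row map as a |-delimited line, with columns in the order given by ks."
  [ks row]
  (str "|" (str/join "|" (map #(get row %) ks)) "|"))

(defn rows->lines
  "Lazily maps a seq of row maps to |-delimited strings.
  Nothing is realized until the result is consumed."
  [ks result-set]
  (map (partial row->line ks) result-set))

;; Hypothetical stand-in for rows coming from a file or database.
(def result-set
  [{:month 1 :pc "A" :sales 100}
   {:month 2 :pc "B" :sales 200}])

(rows->lines [:pc :month :sales] result-set)
```

Because `map` is lazy, the lines can be written out one at a time, for example with `doseq` over a writer obtained from `clojure.java.io/writer`, without holding the whole output in memory.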
Nothing in this post suggests what end goal you want to achieve by processing this dataset. At least, I don't think the main idea could be putting 1GB of data into an HTML table. As such, no advice can be given on how best to achieve it. Just rearranging the same data isn't going to give any meaningful results, or change the memory or access requirements for the operations you want to do afterwards.
To begin with, what you show as 'base' data looks like it could be the results from a joined query on at least three relational tables (if properly normalized). It might be much more efficient to get information directly from these tables by SQL, already cutting down the amount of information, filtering or sorting before processing within Clojure itself.
If it's not, properly normalizing the data and storing it in a database might be an option, but it all depends on what you want to do with the data in the end.