Creating large data frames
Let's say that I want to generate a large data frame from scratch.
Using the data.frame function is how I would generally create data frames.
However, data frames like the following are extremely error-prone and inefficient to write.
Is there a more efficient way of creating the following data frame?
df <- data.frame(GOOGLE_CAMPAIGN=c(rep("Google - Medicare - US", 928), rep("MedicareBranded", 2983),
rep("Medigap", 805), rep("Medigap Branded", 1914),
rep("Medicare Typos", 1353), rep("Medigap Typos", 635),
rep("Phone - MedicareGeneral", 585),
rep("Phone - MedicareBranded", 2967),
rep("Phone-Medigap", 812),
rep("Auto Broad Match", 27),
rep("Auto Exact Match", 80),
rep("Auto Exact Match", 875)),
GOOGLE_AD_GROUP=c(rep("Medicare", 928), rep("MedicareBranded", 2983),
rep("Medigap", 805), rep("Medigap Branded", 1914),
rep("Medicare Typos", 1353), rep("Medigap Typos", 635),
rep("Phone ads 1-Medicare Terms",585),
rep("Ad Group #1", 2967), rep("Medigap-phone", 812),
rep("Auto Insurance", 27),
rep("Auto General", 80),
rep("Auto Brand", 875)))
Yikes, that is some 'bad' code. How can I generate this 'large' data frame in a more efficient manner?
Comments (2)
If your only source for that information is a piece of paper, then you probably won't get much better than that, but you can at least consolidate all of it into a single `rep` call for each column, which should produce the same data frame.
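As a minimal sketch of that consolidation, assuming the values and counts from the question: `rep` accepts a vector for its `times` argument, so each column needs only one call. The names `campaigns`, `ad_groups`, `lens`, and `df2` are introduced here for illustration.

```r
# Run lengths, in the order the groups appear in the question
lens <- c(928, 2983, 805, 1914, 1353, 635, 585, 2967, 812, 27, 80, 875)

campaigns <- c("Google - Medicare - US", "MedicareBranded", "Medigap",
               "Medigap Branded", "Medicare Typos", "Medigap Typos",
               "Phone - MedicareGeneral", "Phone - MedicareBranded",
               "Phone-Medigap", "Auto Broad Match", "Auto Exact Match",
               "Auto Exact Match")

ad_groups <- c("Medicare", "MedicareBranded", "Medigap", "Medigap Branded",
               "Medicare Typos", "Medigap Typos", "Phone ads 1-Medicare Terms",
               "Ad Group #1", "Medigap-phone", "Auto Insurance",
               "Auto General", "Auto Brand")

# With a vector `times`, rep() repeats element i exactly times[i] times
df2 <- data.frame(GOOGLE_CAMPAIGN = rep(campaigns, times = lens),
                  GOOGLE_AD_GROUP = rep(ad_groups, times = lens))
```

If the original `df` from the question is still in the workspace, `identical(df, df2)` is one way to check that the two match.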
But if this information is already in a data structure in R somehow and you just need to transform it, that could possibly be even easier, but we'd need to know what that structure is.
Manually, create (1) a data frame `dfu` and (2) a vector of lengths `lens`. From these two inputs we can reconstruct `df` (here called `df2`).
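The two inputs and the reconstruction might look like the sketch below. It assumes `dfu` holds one row per run of repeated values from the question, and that `lens` gives the repeat count for each of those rows. The contents shown are inferred from the question's code, not taken from this answer's original listing.

```r
# (1) One row per run of repeated values (assumed contents of `dfu`)
dfu <- data.frame(
  GOOGLE_CAMPAIGN = c("Google - Medicare - US", "MedicareBranded", "Medigap",
                      "Medigap Branded", "Medicare Typos", "Medigap Typos",
                      "Phone - MedicareGeneral", "Phone - MedicareBranded",
                      "Phone-Medigap", "Auto Broad Match", "Auto Exact Match",
                      "Auto Exact Match"),
  GOOGLE_AD_GROUP = c("Medicare", "MedicareBranded", "Medigap",
                      "Medigap Branded", "Medicare Typos", "Medigap Typos",
                      "Phone ads 1-Medicare Terms", "Ad Group #1",
                      "Medigap-phone", "Auto Insurance", "Auto General",
                      "Auto Brand"))

# (2) How many times each row of `dfu` should be repeated
lens <- c(928, 2983, 805, 1914, 1353, 635, 585, 2967, 812, 27, 80, 875)

# Repeat row i of `dfu` lens[i] times by indexing with repeated row numbers
df2 <- dfu[rep(seq_len(nrow(dfu)), times = lens), ]

# Indexing leaves row names such as "1.1", "1.2"; reset them to 1..n
rownames(df2) <- NULL
```

Keeping each campaign next to its ad group in one row of `dfu` makes mismatched pairs easier to spot than in two separate `rep` chains.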