Identifying matching strings in grouped data and creating a new column that specifies the presence or absence of a change
Let's say I have the following dataset:
dat <- data.frame(ID = c("A","A","A","A","A","A","B","B","B","B"),
                  test = rep(c("pre","post"), 5),
                  item = c(rep("item1",2), rep("item2",2), rep("item3",2), rep("item1",2), rep("item2",2)),
                  answer = c("science","science","science","","","science", "some multi word string that is not science", "history", "", "social science"))
I want to identify a specific element of the strings in answer for each grouping of ID and item. I need to identify instances of science while excluding entries/strings such as social science. Although social science includes the word science, I am only interested in instances where science is by itself.
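The difference between a whole-string match and a substring match can be sketched as follows. This is only a minimal illustration in base R: the vector x is made up for the example, and it assumes that "by itself" means the entire answer equals science once case and surrounding whitespace are ignored.

```r
# Hypothetical example vector; the values are taken from the answer column above
x <- c("science", "social science",
       "some multi word string that is not science", "")

# Substring match: TRUE for any string that contains "science"
contains_science <- grepl("science", x, fixed = TRUE)

# Whole-string match: TRUE only when the trimmed, lower-cased
# answer is exactly "science"
is_science <- tolower(trimws(x)) == "science"
```

Only the whole-string comparison separates science from social science, so that is the test a grouped solution would need.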
A new column will be created called change_type, with the following levels:
- both indicates that science was present in both levels of test.
- pre indicates that science was only present in the level of test equal to pre.
- post indicates that science was only present in the level of test equal to post.
Groups where science appears in neither level get NA.
The output will look like this:
res <- data.frame(ID = c("A","A","A","B","B"),
                  item = c("item1","item2","item3","item1","item2"),
                  change_type = c("both","pre","post","NA","NA"))
Comments (1)
We could do it with case_when.
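A minimal sketch of how that might look, assuming the dplyr package (version 1.0.0 or later, for the .groups argument) is available and that a match means the answer is exactly science. The helper columns pre_sci and post_sci are names invented for this illustration, and this is not necessarily the answerer's exact code.

```r
library(dplyr)

dat <- data.frame(
  ID = c("A","A","A","A","A","A","B","B","B","B"),
  test = rep(c("pre","post"), 5),
  item = c(rep("item1",2), rep("item2",2), rep("item3",2),
           rep("item1",2), rep("item2",2)),
  answer = c("science","science","science","","","science",
             "some multi word string that is not science",
             "history","","social science")
)

res <- dat %>%
  group_by(ID, item) %>%
  # One row per ID/item: was the exact string "science" given at pre / at post?
  summarise(
    pre_sci  = any(answer[test == "pre"]  == "science"),
    post_sci = any(answer[test == "post"] == "science"),
    .groups = "drop"
  ) %>%
  # Conditions are checked in order, so "both" must come first
  mutate(change_type = case_when(
    pre_sci & post_sci ~ "both",
    pre_sci            ~ "pre",
    post_sci           ~ "post",
    TRUE               ~ NA_character_
  )) %>%
  select(ID, item, change_type)
```

This sketch returns a real missing value (NA_character_) where the question's expected output shows the string "NA"; if the literal string is wanted, the last branch could be written as TRUE ~ "NA" instead.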