R: Replacing values that are present in another data frame

Posted on 2025-01-09 06:33:21

I have multiple datasets that I would like to visualize in R. Unfortunately, the nomenclature across datasets is inconsistent or uses synonyms (e.g. "apple" is spelled "apple", "Apple" and "APPLE").

I have a data frame that cross-references the nomenclature across datasets:

| Name Dataset A | Name Dataset B | Name Dataset C |
|---|---|---|
| Apple | APPLE | apple |
| Pear | PEAR | NA |
| Melon | NA | melon |

I would like to make things consistent, e.g. by iterating through datasets B and C and replacing their nomenclature with that of dataset A (if available). Would anyone have any recommendations?

Thanks in advance!

Comments (2)

¢好甜 2025-01-16 06:33:21

If you only want to modify the capitalization of some characters, perhaps you can convert the data to a list and then apply a function recursively. You can try something like this:

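df = data.frame(col1 = c("Apple", "Pear", "Melon"))
df1 = data.frame(col1 = c("APPLE", "PEAR", NA))
df2 = data.frame(col1 = c("apple", NA, "melon"))

# Put the data frames in a named list. Listing them explicitly avoids picking up
# other objects whose names contain "df", which ls(pattern = "df") would match.
dflist = mget(c("df", "df1", "df2"))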

Then you can apply a function to every element, e.g., convert all the words to lower case using rapply():

thelist = rapply(dflist, tolower, how = "list")

Output

$df
$df$col1
[1] "apple" "pear"  "melon"


$df1
$df1$col1
[1] "apple" "pear"  NA     


$df2
$df2$col1
[1] "apple" NA      "melon"
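Note that rapply(..., how = "list") returns plain nested lists rather than data frames. If the data frames themselves should be kept, one option is to modify the column inside lapply(). This is only a sketch, assuming the column is called col1 as in the example above; dflist_lower is a name made up for illustration.

```r
df  <- data.frame(col1 = c("Apple", "Pear", "Melon"), stringsAsFactors = FALSE)
df1 <- data.frame(col1 = c("APPLE", "PEAR", NA), stringsAsFactors = FALSE)
df2 <- data.frame(col1 = c("apple", NA, "melon"), stringsAsFactors = FALSE)
dflist <- list(df = df, df1 = df1, df2 = df2)

# Lower-case col1 in every data frame; each element stays a data frame
dflist_lower <- lapply(dflist, function(d) {
  d$col1 <- tolower(d$col1)  # missing values remain NA
  d
})
```

Each element of dflist_lower can then be passed straight to plotting or joining code that expects a data frame.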

Additional string manipulation can be applied to the list, e.g., searching for a pattern and replacing it using gsub() and lapply():

thelist2 = lapply(thelist, "[[", "col1") |> # Extracting "col1"
    lapply(\(x) gsub('apple', 'pink lady', x)) # Replace 'apple' with 'pink lady'

Output

$df
[1] "pink lady" "pear"      "melon"    

$df1
[1] "pink lady" "pear"      NA         

$df2
[1] "pink lady" NA          "melon"  

You can take a similar approach using rapply():

thelist3 = rapply(thelist, \(x) gsub('apple', 'pink lady', x), how = 'list')
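One caveat with the gsub() calls above: the pattern 'apple' is a regular expression that matches anywhere in the string, so longer names containing it would be changed too. A small sketch of the difference, using a made-up vector x that includes "pineapple" purely for illustration:

```r
x <- c("apple", "pineapple", NA)

# Unanchored pattern: also rewrites the "apple" inside "pineapple"
partial <- gsub("apple", "pink lady", x)

# Anchored pattern: only replaces values that are exactly "apple"
whole <- gsub("^apple$", "pink lady", x)
```

Here partial contains "pinepink lady" as its second element, while whole leaves "pineapple" untouched. Missing values stay NA in both cases.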

Depending on the structure of your data, you can also join the data frames and then apply the functions as needed.
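One possible reading of the join idea, sketched with base R's merge() and the cross-reference table from the question. The names name_map, data_b and the value column are assumptions made up for this example, not part of the original data.

```r
# Cross-reference table from the question
name_map <- data.frame(
  name_a = c("Apple", "Pear", "Melon"),
  name_b = c("APPLE", "PEAR", NA),
  name_c = c("apple", NA, "melon"),
  stringsAsFactors = FALSE
)

# Hypothetical dataset B with one measurement column
data_b <- data.frame(name = c("PEAR", "APPLE", "APPLE"),
                     value = c(1.5, 2.0, 3.2),
                     stringsAsFactors = FALSE)

# Keep only lookup rows that have a B spelling, then left-join on it.
# all.x = TRUE keeps rows of data_b that have no match (name_a is NA there).
lookup_b <- name_map[!is.na(name_map$name_b), c("name_a", "name_b")]
joined <- merge(data_b, lookup_b, by.x = "name", by.y = "name_b", all.x = TRUE)
```

The result has the original name, the value, and a new name_a column holding the dataset A spelling. Note that merge() sorts rows by the key by default, so the row order may differ from data_b.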

狼性发作 2025-01-16 06:33:21

If you have names and data like this:

df_names = data.frame(names_for_a = c("apple", "orange"),
                      names_for_b = c("pink lady", "ORANGE"))
df_b = data.frame(index = 1:9, name = rep(c("pink lady", "ORANGE", "ORANGE"), 3))

I'd do something like:

df_b$name = df_names$names_for_a[match(df_b$name, df_names$names_for_b)]
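Be aware that match() returns NA for any name that is missing from the lookup table, so the line above would overwrite such names with NA. Since the question asks for replacement only "if available", a hedged variant that keeps the original value when there is no match could look like this (the "KIWI" entry is invented to show the effect):

```r
df_names <- data.frame(names_for_a = c("apple", "orange"),
                       names_for_b = c("pink lady", "ORANGE"),
                       stringsAsFactors = FALSE)

# Hypothetical data containing a name ("KIWI") that is not in the lookup table
df_b <- data.frame(index = 1:4,
                   name = c("pink lady", "ORANGE", "KIWI", "ORANGE"),
                   stringsAsFactors = FALSE)

idx <- match(df_b$name, df_names$names_for_b)  # NA where there is no match

# Use the dataset A name where a match exists, otherwise keep the original
df_b$name <- ifelse(is.na(idx), df_b$name, df_names$names_for_a[idx])
```

After this, df_b$name is "apple", "orange", "KIWI", "orange".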

This thread may be helpful for what you are doing too: Replace values in a dataframe based on lookup table
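To tie this back to the table in the question, the same match() lookup can be wrapped in a small helper and applied to datasets B and C in a loop. This is only a sketch: recode_names, datasets and map_cols are hypothetical names, and the two toy datasets are invented for illustration.

```r
name_map <- data.frame(
  name_a = c("Apple", "Pear", "Melon"),
  name_b = c("APPLE", "PEAR", NA),
  name_c = c("apple", NA, "melon"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: translate x from the spelling in `from` to the one in `to`,
# leaving a value unchanged when no replacement is available
recode_names <- function(x, from, to) {
  idx <- match(x, from, incomparables = NA)  # never match on NA
  replacement <- to[idx]
  ifelse(is.na(replacement), x, replacement)
}

# Toy stand-ins for datasets B and C
datasets <- list(
  b = data.frame(name = c("APPLE", "PEAR"), stringsAsFactors = FALSE),
  c = data.frame(name = c("melon", "apple", "kiwi"), stringsAsFactors = FALSE)
)
map_cols <- c(b = "name_b", c = "name_c")  # lookup column for each dataset

for (nm in names(datasets)) {
  datasets[[nm]]$name <- recode_names(datasets[[nm]]$name,
                                      from = name_map[[map_cols[[nm]]]],
                                      to   = name_map$name_a)
}
```

Dataset B ends up with "Apple", "Pear" and dataset C with "Melon", "Apple", "kiwi"; the unknown "kiwi" is left as it was.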
