How to create a generic string of letters and numbers for "n" clusters in R to add to a data frame?

Posted on 2025-01-29 11:42:20

I have this:

df<-structure(list(x = c(-0.803739264931451, 0.852850728148773, 0.927179506105653, -0.752626056626365, 0.706846224294882, 1.0346985222527, -0.475845197699957, -0.460301566967151, -0.680301544955355, -1.03196929988978), y = c(-0.853052609097935, 0.367618436999606, -0.274902437566225, -0.511565170496435, 0.81067919693492, 0.394655023166806, 0.989760805249143, -0.858997792847955, -0.66149481321353, -0.0219935446644728), shape = c(1, 1, 2, 2, 2, 2, 3, 3, 4, 4)), row.names = c(NA, 10L), class = "data.frame")

Output:

         x           y shape
-0.8037393 -0.85305261     1
 0.8528507  0.36761844     1
 0.9271795 -0.27490244     2
-0.7526261 -0.51156517     2
 0.7068462  0.81067920     2
 1.0346985  0.39465502     2
-0.4758452  0.98976081     3
-0.4603016 -0.85899779     3
-0.6803015 -0.66149481     4
-1.0319693 -0.02199354     4

Expected output:
How can I create a generic label made of letters and numbers for "n" clusters in R and add it to the data frame, as shown below?

Note: for example, if there were 100 clusters, the label of cluster 100 could be AA1, and so on.

df$label<-   #What is the correct code for this problem?
         x           y shape label
-0.8037393 -0.85305261     1    A1
 0.8528507  0.36761844     1    A2
 0.9271795 -0.27490244     2    B1
-0.7526261 -0.51156517     2    B2
 0.7068462  0.81067920     2    B3
 1.0346985  0.39465502     2    B4
-0.4758452  0.98976081     3    C1
-0.4603016 -0.85899779     3    C2
-0.6803015 -0.66149481     4    D1
-1.0319693 -0.02199354     4    D2

栖迟 2025-02-05 11:42:20

Here is a small function that should do it for you:

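library(dplyr)  # provides if_else(), group_by(), mutate(), cur_group_id(), n()

# g: group number; n: number of rows in that group
f <- function(g, n) {
  # Letter position 1..26; a multiple of 26 maps to 26 ("Z") rather than 0
  letter_index <- if_else(g %% 26 == 0, 26, g %% 26)
  paste0(
    # Repeat the letter once per block of 26 groups: A..Z, AA..ZZ, ...
    paste0(rep(LETTERS[letter_index], times = ceiling(g / 26)), collapse = ""),
    seq_len(n))
}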

Now apply that function to each shape value, using group_by() and mutate():

df %>% 
  group_by(shape) %>% 
  mutate(code = f(cur_group_id(), n()))

Output:

        x       y shape code 
    <dbl>   <dbl> <dbl> <chr>
 1 -0.804 -0.853      1 A1   
 2  0.853  0.368      1 A2   
 3  0.927 -0.275      2 B1   
 4 -0.753 -0.512      2 B2   
 5  0.707  0.811      2 B3   
 6  1.03   0.395      2 B4   
 7 -0.476  0.990      3 C1   
 8 -0.460 -0.859      3 C2   
 9 -0.680 -0.661      4 D1   
10 -1.03  -0.0220     4 D2
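
For readers who prefer not to load dplyr, the same labels can probably be built in base R. This is only a sketch under stated assumptions: cluster_prefix(), group_id and position are illustrative names that do not appear in the answer, the data frame here carries only the shape column, and groups are numbered in the order of the sorted distinct shape values, which is how cur_group_id() numbers them.

```r
# Base-R sketch; cluster_prefix(), group_id and position are hypothetical names.
df <- data.frame(shape = c(1, 1, 2, 2, 2, 2, 3, 3, 4, 4))

# Letter prefix for group number g: the letter cycles through A..Z and is
# repeated once per block of 26 groups (A..Z, AA..ZZ, ...).
cluster_prefix <- function(g) {
  letter_index <- (g - 1) %% 26 + 1
  strrep(LETTERS[letter_index], ceiling(g / 26))
}

# Group number of each row, following the sorted distinct shape values
group_id <- match(df$shape, sort(unique(df$shape)))

# Running position of each row within its shape group
position <- ave(seq_along(df$shape), df$shape, FUN = seq_along)

df$label <- paste0(cluster_prefix(group_id), position)
df
```

Because strrep() is vectorised over both arguments, no explicit grouping step is needed here; ave() supplies the per-group counter that 1:n provides in the dplyr version.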

Explanation:

  • The function f() takes two values: an integer giving the group number (passed by cur_group_id()) and the number of rows in that shape group (passed by n()). Inside the function, the modulo picks the letter from LETTERS (a remainder of 0 is mapped to 26, i.e. "Z"), ceiling(g/26) gives the number of times that letter is repeated, and the resulting prefix is pasted onto the sequence from 1 to n.
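
To see how the prefix behaves once there are more than 26 groups, the two pieces of arithmetic can be evaluated on a few group numbers. This is a sketch for illustration only: the group numbers are arbitrary, and base-R ifelse() and strrep() stand in for the if_else() and rep()/paste0() calls used inside f().

```r
# Group numbers chosen only for illustration
g <- c(1, 26, 27, 52, 53, 100)

# Same arithmetic as inside f(): which letter, and how many copies of it
letter_index <- ifelse(g %% 26 == 0, 26, g %% 26)
repeats <- ceiling(g / 26)

prefix <- strrep(LETTERS[letter_index], repeats)
data.frame(g, letter = LETTERS[letter_index], repeats, prefix)
# prefix: "A" "Z" "AA" "ZZ" "AAA" "VVVV"
```

Under this scheme the first row of group 100 would be labelled VVVV1 rather than the AA1 floated in the question, so a spreadsheet-style sequence (A..Z, AA, AB, ...) would need a different prefix function.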