How to create a generic string of letters and numbers for "n" clusters in R to add to a data frame?

Posted on 2025-01-29 11:42:20

I have this:

df<-structure(list(x = c(-0.803739264931451, 0.852850728148773, 0.927179506105653, -0.752626056626365, 0.706846224294882, 1.0346985222527, -0.475845197699957, -0.460301566967151, -0.680301544955355, -1.03196929988978), y = c(-0.853052609097935, 0.367618436999606, -0.274902437566225, -0.511565170496435, 0.81067919693492, 0.394655023166806, 0.989760805249143, -0.858997792847955, -0.66149481321353, -0.0219935446644728), shape = c(1, 1, 2, 2, 2, 2, 3, 3, 4, 4)), row.names = c(NA, 10L), class = "data.frame")

Output:

x	y	shape
-0.8037393	-0.85305261	1
0.8528507	0.36761844	1
0.9271795	-0.27490244	2
-0.7526261	-0.51156517	2
0.7068462	0.81067920	2
1.0346985	0.39465502	2
-0.4758452	0.98976081	3
-0.4603016	-0.85899779	3
-0.6803015	-0.66149481	4
-1.0319693	-0.02199354	4

Expected output:
How can I create a generic label made of letters and numbers for "n" clusters in R and add it to the data frame, as shown below?

Note: for example, if there were 100 clusters, the label of cluster 100 could be AA1, and so on.

df$label<-   #What is the correct code for this problem?

x	y	shape	label
-0.8037393	-0.85305261	1	A1
0.8528507	0.36761844	1	A2
0.9271795	-0.27490244	2	B1
-0.7526261	-0.51156517	2	B2
0.7068462	0.81067920	2	B3
1.0346985	0.39465502	2	B4
-0.4758452	0.98976081	3	C1
-0.4603016	-0.85899779	3	C2
-0.6803015	-0.66149481	4	D1
-1.0319693	-0.02199354	4	D2

栖迟 2025-02-05 11:42:20

Here is a small function that should do it for you:

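library(dplyr)  # provides if_else(), %>%, group_by(), mutate(), cur_group_id(), n()

# g: group number (1, 2, ...); n: number of rows in that group
f <- function(g, n) {
  # Map g onto 1..26 so that multiples of 26 give "Z" rather than index 0
  letter_index <- if_else(g %% 26 == 0, 26, g %% 26)
  paste0(
    # Repeat the letter once per pass through the alphabet: A..Z, AA..ZZ, ...
    paste0(rep(LETTERS[letter_index], times = ceiling(g / 26)), collapse = ""),
    1:n)
}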

Now apply that function to each shape value, using group_by() and mutate():

df %>% 
  group_by(shape) %>% 
  mutate(code = f(cur_group_id(), n()))

Output:

        x       y shape code 
    <dbl>   <dbl> <dbl> <chr>
 1 -0.804 -0.853      1 A1   
 2  0.853  0.368      1 A2   
 3  0.927 -0.275      2 B1   
 4 -0.753 -0.512      2 B2   
 5  0.707  0.811      2 B3   
 6  1.03   0.395      2 B4   
 7 -0.476  0.990      3 C1   
 8 -0.460 -0.859      3 C2   
 9 -0.680 -0.661      4 D1   
10 -1.03  -0.0220     4 D2

Explanation:

The function f() takes two values: an integer giving the group number (passed by cur_group_id()) and the number of rows in that shape group (passed by n()). Inside the function, the modulo picks the letter (group 26 maps to Z rather than to index 0), ceiling(g/26) gives the number of times that letter is repeated, and the resulting prefix is pasted onto the sequence from 1 to n.
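To see what the prefix logic gives past 26 groups, here is a minimal base R sketch. The helper name prefix() is made up for illustration and is not part of the answer's code. It uses (g - 1) %% 26 + 1, which should give the same index as the if_else() in f(), and strrep() so that it works on a whole vector of group numbers at once. The second half assumes group numbers should follow the sorted shape values, which is how cur_group_id() numbers the groups for this data.

```r
# Hypothetical helper: letter prefix for group number(s) g.
# The letter cycles A..Z and is repeated once per pass through the alphabet.
prefix <- function(g) {
  letter_index <- (g - 1) %% 26 + 1      # 1..26; group 26 -> "Z", group 27 -> "A"
  strrep(LETTERS[letter_index], ceiling(g / 26))
}

prefix(c(1, 26, 27, 52, 53, 100))
# "A" "Z" "AA" "ZZ" "AAA" "VVVV"

# Base R labelling without dplyr, on the shape column from the question
df <- data.frame(shape = c(1, 1, 2, 2, 2, 2, 3, 3, 4, 4))
g <- as.integer(factor(df$shape))            # group number, in sorted order of shape
i <- ave(seq_along(g), g, FUN = seq_along)   # running counter within each group
df$label <- paste0(prefix(g), i)
df$label
# "A1" "A2" "B1" "B2" "B3" "B4" "C1" "C2" "D1" "D2"
```

Note that with this scheme cluster 27 gets the prefix AA, while cluster 100 gets VVVV rather than the AA suggested in the question. If spreadsheet-style labels (A..Z, AA, AB, ...) are wanted instead, the prefix would need a base-26 conversion, which this sketch does not attempt.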
