Normalize each column of a data frame by a specific max value that is not in the data set
Hello, I have a data frame with 20 columns; here is a small reproducible example:
test_df <- data.frame(a = sample(1:20,7), b = sample(1:50,7), c= sample(1:29,7) )
max_values <- c(20,50,29)
I want to normalize each column by the entry of max_values at the corresponding index. Please do not assume that each column's observed maximum equals the maximum I want that column normalized by. It is fine if the result goes above 1 or below 0. The max values are thresholds, and I would like to observe how far my data goes above or below them. We can assume that the min values are ALWAYS going to be 0, so I took them out of the equation:
normalize <- function(x,y) {
return ((x - 0) / (y - 0))
}
lapply(test_df, normalize)
I have written the code above, but I do not know how to set it up so that each iteration uses a different index of max_values.
You might use scale, or divide by a list.
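A minimal sketch of both suggestions, using the question's test_df and max_values. The answer's own code is not shown here, so the exact calls are an assumption; set.seed and the names scaled and divided are added only for illustration.

```r
set.seed(1)  # assumption: fixed seed so the random sample is reproducible
test_df <- data.frame(a = sample(1:20, 7), b = sample(1:50, 7), c = sample(1:29, 7))
max_values <- c(20, 50, 29)

# scale() divides column j by scale[j]; center = FALSE skips mean-centering.
# The result is a numeric matrix, not a data frame.
scaled <- scale(test_df, center = FALSE, scale = max_values)

# Dividing a data frame by a list pairs one list element with each column.
# The result stays a data frame.
divided <- test_df / as.list(max_values)
```

Note that dividing by the bare vector (test_df / max_values) would recycle the vector down the rows rather than across the columns, which is why the list form is used here.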
Try going row by row with apply. As long as max_values and test_df have the columns in the same order, that is all you need. Annoyingly, apply gives you the result with rows and columns switched; t switches them back.
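The row-by-row idea could look roughly like this. It is a sketch under the assumption that each row is divided elementwise by max_values; the answer's original code is not shown here, and set.seed and the name normalized are illustrative.

```r
set.seed(1)  # assumption: fixed seed so the random sample is reproducible
test_df <- data.frame(a = sample(1:20, 7), b = sample(1:50, 7), c = sample(1:29, 7))
max_values <- c(20, 50, 29)

# MARGIN = 1 hands each row to the function as a vector of length 3,
# which lines up elementwise with max_values.
by_row <- apply(test_df, 1, function(r) r / max_values)

# apply() stores each row's result as a column (3 x 7), so transpose
# to get back to 7 rows and 3 columns.
normalized <- t(by_row)
```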
Use mapply if your function has more than one parameter.
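A short sketch of the mapply approach with the question's normalize function. The exact call is an assumption, since the answer's code is not shown here; set.seed and the names normalized and normalized_df are illustrative.

```r
set.seed(1)  # assumption: fixed seed so the random sample is reproducible
test_df <- data.frame(a = sample(1:20, 7), b = sample(1:50, 7), c = sample(1:29, 7))
max_values <- c(20, 50, 29)

normalize <- function(x, y) {
  return((x - 0) / (y - 0))
}

# mapply() walks the columns of test_df and the entries of max_values in
# parallel: column 1 with max_values[1], column 2 with max_values[2], etc.
normalized <- mapply(normalize, test_df, max_values)

# The simplified result is a matrix; convert it if a data frame is needed.
normalized_df <- as.data.frame(normalized)
```

This is the direct fix for the lapply call in the question, which only iterates over one argument and so never supplies y.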