Sorting an associative array with AWK
Here's my array (gawk script):
myArray["peter"] = 32
myArray["bob"] = 5
myArray["john"] = 463
myArray["jack"] = 11
After sorting, I need the following result:
bob 5
jack 11
peter 32
john 463
When I use "asort", the indices are lost. How can I sort by array value without losing the indices? (I need the indices ordered by their values.)
(I need to obtain this result with awk/gawk only, not shell script, perl, etc.)
If my post isn't clear enough, here is another post explaining the same issue: http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_26626841.html
Thanks in advance
Update:
Thanks to you both, but I need to sort by values, not indices (I want the indices ordered according to their values).
In other words, I need this result:
bob 5
jack 11
peter 32
john 463
not:
bob 5
jack 11
john 463
peter 32
(I agree, my example is confusing, the chosen values are pretty bad)
From Catcall's code, I wrote a quick implementation that works, but it's rather ugly (I concatenate keys and values before sorting and split them again during comparison). Here's what it looks like:
# Randomized quicksort over A[left..right]; elements are "key#value" strings
# and are ordered by their numeric value part.
function qsort(A, left, right,    i, last) {
    if (left >= right)
        return
    swap(A, left, left + int((right - left + 1) * rand()))
    last = left
    for (i = left + 1; i <= right; i++)
        if (getPart(A[i], "value") < getPart(A[left], "value"))
            swap(A, ++last, i)
    swap(A, left, last)
    qsort(A, left, last - 1)
    qsort(A, last + 1, right)
}
function swap(A, i, j,    t) {
    t = A[i]; A[i] = A[j]; A[j] = t
}
# Split a "key#value" string; assumes keys never contain "#".
function getPart(str, part) {
    if (part == "key")
        return substr(str, 1, index(str, "#") - 1)
    if (part == "value")
        return substr(str, index(str, "#") + 1) + 0
    return
}
BEGIN { }
{ }
END {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11
    # Fill sortvalues[1..n] so the sort bounds match the print loop.
    # (Filling from 0 and sorting 0..length only worked because the extra
    # empty element compared as 0 and landed in slot 0.)
    n = 0
    for (key in myArray)
        sortvalues[++n] = key "#" myArray[key]
    qsort(sortvalues, 1, n)
    for (i = 1; i <= n; i++)
        print getPart(sortvalues[i], "key"), getPart(sortvalues[i], "value")
}
Of course, I'm interested if you have something cleaner...
Thanks for your time
The following function works in Gawk 3.1.7 and doesn't resort to any of the workarounds described above (no external sort command, no array subscripts, etc.). It's just a basic implementation of the insertion sort algorithm adapted for associative arrays.
You pass it an associative array to sort on the values and an empty array to populate with the corresponding keys.
Here is the implementation:
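A minimal sketch of such a function, not the answerer's original listing: the name sort_keys_by_value and its parameter names are assumptions made for illustration, and values are compared numerically.

```awk
# Hypothetical helper: fill keys[1..n] with the indices of arr,
# ordered by ascending numeric value of arr's elements. Returns n.
function sort_keys_by_value(arr, keys,    k, n, j) {
    n = 0
    for (k in arr) {
        # Insertion step: shift keys with larger values one slot to the
        # right, then drop k into the gap.
        j = n
        while (j >= 1 && arr[keys[j]] + 0 > arr[k] + 0) {
            keys[j + 1] = keys[j]
            j--
        }
        keys[j + 1] = k
        n++
    }
    return n
}

BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    n = sort_keys_by_value(myArray, sorted)
    for (i = 1; i <= n; i++)
        print sorted[i], myArray[sorted[i]]
}
```

Keys with equal values come out in whatever order the for-in loop visits them, so ties are not stable.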
You can "reverse sort" by simply iterating the resulting array in descending order.
Edit:
Sort by values
Oh! To sort the values, it's a bit of a kludge, but you can create a temporary array using a concatenation of the values and the indices of the original array as indices in the new array. Then you can asorti() the temporary array and split the concatenated values back into indices and values. If you can't follow that convoluted description, the code is much easier to understand. It's also very short.
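A rough sketch of that kludge, under some assumptions that are mine rather than the answer's: values are non-negative integers that fit in 12 digits, keys never contain SUBSEP, and the array names tmp, order and parts are made up. The zero padding matters because asorti() compares the indices as strings.

```awk
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    # New index = zero-padded value, SUBSEP, original index.
    for (key in myArray)
        tmp[sprintf("%012d", myArray[key]) SUBSEP key] = key

    # order[1..n] receives tmp's indices in ascending string order.
    n = asorti(tmp, order)

    for (i = 1; i <= n; i++) {
        split(order[i], parts, SUBSEP)   # parts[1] = padded value, parts[2] = key
        print parts[2], parts[1] + 0     # "+ 0" strips the zero padding
    }
}
```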
Edit 2:
If you have GAWK 4, you can traverse the array by order of values without performing an explicit sort:
There are settings for traversing by index or value, ascending or descending and other options. You can also specify a custom function.
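As an illustration of the GAWK 4 traversal setting, here is a small sketch using PROCINFO["sorted_in"] with the predefined "@val_num_asc" order (ascending numeric value); it needs gawk 4.0 or later and is ignored by other awks.

```awk
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    # Ask gawk to visit elements by ascending numeric value in for-in loops.
    PROCINFO["sorted_in"] = "@val_num_asc"

    for (key in myArray)
        print key, myArray[key]
}
```

The setting stays in effect for every later for-in loop until PROCINFO["sorted_in"] is changed or deleted.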
Previous answer:
Sort by indices
If you have an AWK, such as gawk 3.1.2 or greater, which supports asorti():
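A short sketch of the by-index case with the two-argument form of asorti(); the destination array name idx is an assumption.

```awk
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    # idx[1..n] receives the indices of myArray in sorted (string) order;
    # myArray itself is left untouched.
    n = asorti(myArray, idx)

    for (i = 1; i <= n; i++)
        print idx[i], myArray[idx[i]]
}
```

This lists john before peter, which is the by-index order the question's update says it does not want.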
If you don't have asorti():
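One portable fallback, assuming an external sort command is acceptable for this by-index case, is to pipe the pairs through it from inside awk. This is a sketch, not necessarily what the answer originally showed.

```awk
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    cmd = "sort"                       # external sort(1), lexical order on each line
    for (key in myArray)
        print key, myArray[key] | cmd
    close(cmd)                         # flush the pipe and wait for sort to finish
}
```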
Use the Unix sort command with a pipe; this keeps the Awk code simple and follows the Unix philosophy.
Create an input file with values separated by commas:
peter,32
jack,11
john,463
bob,5
Create a sort.awk file with the code
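A sort.awk along these lines might look like the sketch below. The array name value and the exact sort options are assumptions; -t, and -k2,2n ask sort to order numerically on the second comma-separated field.

```awk
# sort.awk (sketch): read "name,value" lines and emit them ordered by value.
BEGIN { FS = "," }

{ value[$1] = $2 }                     # remember each name's value

END {
    cmd = "sort -t, -k2,2n"            # numeric sort on the second field
    for (name in value)
        print name "," value[name] | cmd
    close(cmd)                         # flush the pipe and wait for sort to finish
}
```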
Run the program; it should give you the following output:
$ awk -f sort.awk data
bob,5
jack,11
peter,32
john,463
Use the statement above before iterating over the array. Note that it works in awk version 4.0.1 but not in awk version 3.1.7.
I am not sure which intermediate version introduced it.
And the simple answer...
Use asorti:
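A possible way to do that for the by-value case, assuming gawk 4.0 or later, where asorti() accepts a third argument naming the ordering. The array name keys is made up for this sketch.

```awk
BEGIN {
    myArray["peter"] = 32
    myArray["bob"] = 5
    myArray["john"] = 463
    myArray["jack"] = 11

    # "@val_num_asc" orders the indices by the ascending numeric value
    # of the elements they refer to.
    n = asorti(myArray, keys, "@val_num_asc")

    for (i = 1; i <= n; i++)
        print keys[i], myArray[keys[i]]
}
```

The third argument can also be the name of a user-defined comparison function when none of the predefined orders fits.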