Getting the maximum value in a dictionary

Posted on 2024-10-15 23:40:35

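I'm stuck on the following problem. My dictionary has 10,000 rows, and this is one of them.

Example: A (8) C (4) G (48419) T (2) when printed out

I'd like to get "G" as the answer, because it has the highest value.

I'm currently using Python 2.4, and I have no idea how to solve this since I'm quite new to Python.

Thanks a lot for any help :)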

失退 2024-10-22 23:40:35

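Here's a solution that:

uses a regular expression to find every uppercase letter followed by a number in parentheses,
turns the matched string pairs into (value, key) tuples with a generator expression,
returns the key from the tuple with the highest value.

I also added a main block so that the script can be used as a command-line tool: it reads every line from one file and writes the key with the highest value for each line to an output file. The program uses iterators, so it stays memory-efficient no matter how large the input file is.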

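import re

# One uppercase letter, optional whitespace, then a count in parentheses.
KEYVAL = re.compile(r"([A-Z])\s*\((\d+)\)")

def max_item(row):
    # Build (value, key) tuples so max() compares by value first.
    return max((int(v), k) for k, v in KEYVAL.findall(row))[1]

def max_item_lines(fh):
    # Lazily yield the winning key for each row, one per output line.
    for row in fh:
        yield "%s\n" % max_item(row)

def process_file(infilename, outfilename):
    infile = open(infilename)
    outfile = open(outfilename, "w")
    try:
        outfile.writelines(max_item_lines(infile))
    finally:
        # Close both files even if a row fails to parse.
        outfile.close()
        infile.close()

if __name__ == '__main__':
    import sys
    infilename, outfilename = sys.argv[1:]
    process_file(infilename, outfilename)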

For a single row, you can call:

>>> max_item("A (8) C (4) G (48419) T (2)")
'G'

And to process a complete file:

>>> process_file("inputfile.txt", "outputfile.txt")

If you want an actual Python list holding the highest-valued key of every row, you can use:

>>> map(max_item, open("inputfile.txt"))
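
One edge case worth noting: max() raises ValueError when it is given an empty sequence, so max_item fails on a blank line or on a row with no letter/count pairs. A minimal sketch of a more tolerant variant follows. The name max_item_or_default and its default parameter are hypothetical additions for illustration, not part of the answer above.

```python
import re

# Same pattern as above: an uppercase letter, optional whitespace, a count in parentheses.
KEYVAL = re.compile(r"([A-Z])\s*\((\d+)\)")

def max_item_or_default(row, default=None):
    # Hypothetical helper: return `default` instead of raising when the row has no pairs.
    pairs = [(int(v), k) for k, v in KEYVAL.findall(row)]
    if not pairs:
        return default
    return max(pairs)[1]

print(max_item_or_default("A (8) C (4) G (48419) T (2)"))
print(max_item_or_default(""))
```

Whether a silent default or a raised error is the better behaviour depends on how clean the 10,000 rows are expected to be.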

萧瑟寒风 2024-10-22 23:40:35

max(d.itervalues())

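This is typically faster and lighter on memory than max(d.values()), because itervalues() iterates over the dictionary directly instead of building a list first.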
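
Note that max(d.itervalues()) returns the largest value (48419 for the sample row), not the key that holds it. Assuming the row has already been parsed into a dictionary like the hypothetical d below, a sketch of recovering the key might look like this. The first form relies only on (value, key) tuples and should work on Python 2.4; the key= argument to max() was added in Python 2.5.

```python
# Hypothetical dict for the sample row "A (8) C (4) G (48419) T (2)".
d = {"A": 8, "C": 4, "G": 48419, "T": 2}

# Tuple form: compare (value, key) pairs, then keep the key.
best_key = max((v, k) for k, v in d.items())[1]

# key= form (Python 2.5 and later): compare keys by their values.
best_key_alt = max(d, key=d.get)

print(best_key)
print(best_key_alt)
```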

鸠魁 2024-10-22 23:40:35

Try the following:

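st = "A (8) C (4) G (48419) T (2)" # your start string
a = st.split(")")                                       # ['A (8', ' C (4', ' G (48419', ' T (2', '']
b = [x.replace("(", "").strip() for x in a if x != ""]  # ['A 8', 'C 4', 'G 48419', 'T 2']
c = [x.split(" ") for x in b]                           # [['A', '8'], ['C', '4'], ...]
d = [(int(x[1]), x[0]) for x in c]                      # [(8, 'A'), (4, 'C'), ...]
max(d)[1] # 'G' is your result; max(d) alone is (48419, 'G')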
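
One property of the (value, key) tuple trick is worth knowing: when two letters share the highest count, the tuples are also compared on the letter, so the alphabetically last letter wins. The sketch below illustrates this with made-up counts and shows one way to keep the first letter in the row instead, assuming a Python version where max() accepts key= (2.5 or later).

```python
from operator import itemgetter

# Made-up counts with a tie between A and T.
pairs = [(5, "A"), (1, "C"), (2, "G"), (5, "T")]

# Plain tuple comparison falls through to the letter on a tie.
by_tuple = max(pairs)

# Comparing on the count only keeps the first maximal pair encountered.
by_count = max(pairs, key=itemgetter(0))

print(by_tuple)
print(by_count)
```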

音栖息无 2024-10-22 23:40:35

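Use a regular expression to split the line. Then, for all the matched groups, convert the matched strings to numbers, take the maximum, and work out which letter it belongs to.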

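import re

# Raw string so the backslashes reach the regex engine unchanged.
r = re.compile(r'A \((\d+)\) C \((\d+)\) G \((\d+)\) T \((\d+)\)')
for line in my_file:  # my_file: an open file object, or any iterable of lines
  m = r.match(line)
  if not m:
    continue # or complain about invalid line
  # Pair each count with its group index so max() also tells us which letter won.
  value, n = max((int(value), n) for (n, value) in enumerate(m.groups()))
  print "ACGT"[n], value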
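
The "complain about invalid line" branch can be made concrete by collecting the offending line numbers instead of skipping them silently. The following is only a sketch: the helper name best_bases, its return shape, and the sample lines are hypothetical, and it assumes every valid line lists A, C, G and T in exactly this order and spacing.

```python
import re

# Assumes every valid line lists A, C, G, T in this exact order and spacing.
ROW = re.compile(r"A \((\d+)\) C \((\d+)\) G \((\d+)\) T \((\d+)\)")

def best_bases(lines):
    # Hypothetical helper: returns ([(letter, value), ...], [invalid line numbers]).
    results, invalid = [], []
    for lineno, line in enumerate(lines):
        m = ROW.match(line)
        if not m:
            invalid.append(lineno + 1)  # report 1-based line numbers
            continue
        # Pair each count with its group index, then map the index back to a letter.
        value, n = max((int(v), i) for i, v in enumerate(m.groups()))
        results.append(("ACGT"[n], value))
    return results, invalid

sample = ["A (8) C (4) G (48419) T (2)", "garbage", "A (9) C (1) G (3) T (7)"]
print(best_bases(sample))
```

Passing an open file object instead of the sample list works the same way, since the function only iterates over its argument.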

晌融 2024-10-22 23:40:35

row = "A (8) C (4) G (48419) T (2)"

lst = row.replace("(",'').replace(")",'').split() # ['A', '8', 'C', '4', 'G', '48419', 'T', '2']

dd = dict(zip(lst[0::2],map(int,lst[1::2]))) # {'A': 8, 'C': 4, 'T': 2, 'G': 48419} 

max(map(lambda k:[dd[k],k], dd))[1] # 'G'
