Getting the maximum value in a dictionary

Posted 2024-10-15 23:40:35

I'm having a problem with this. I have 10,000 rows in my dictionary, and this is one of them.

Example, when printed out: A (8) C (4) G (48419) T (2)

I'd like to get 'G' as an answer, since it has the highest value.

I'm currently using Python 2.4 and have no idea how to solve this, as I'm quite new to Python.

Thanks a lot for any help given :)

Comments (5)

失退 2024-10-22 23:40:35

Here's a solution that

  1. uses a regexp to scan all occurrences of an uppercase letter followed by a number in brackets
  2. transforms the string pairs from the regexp with a generator expression into (value,key) tuples
  3. returns the key from the tuple that has the highest value

I also added a __main__ block so that the script can be used as a command-line tool that reads all lines from one file and then writes the key with the highest value for each line to an output file. The program uses iterators, so it stays memory-efficient no matter how large the input file is.

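import re

# One uppercase letter, optional whitespace, then a number in parentheses.
KEYVAL = re.compile(r"([A-Z])\s*\((\d+)\)")

def max_item(row):
    # Build (value, key) tuples so max() compares by value; return the key.
    # Raises ValueError if the row contains no "X (n)" pairs.
    return max((int(v), k) for k, v in KEYVAL.findall(row))[1]

def max_item_lines(fh):
    # Lazily yield one output line per input row.
    for row in fh:
        yield "%s\n" % max_item(row)

def process_file(infilename, outfilename):
    infile = open(infilename)
    outfile = open(outfilename, "w")
    try:
        outfile.writelines(max_item_lines(infile))
    finally:
        # try/finally instead of "with", which Python 2.4 does not have.
        outfile.close()
        infile.close()

if __name__ == '__main__':
    import sys
    infilename, outfilename = sys.argv[1:]
    process_file(infilename, outfilename)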

For a single row, you can call:

>>> max_item("A (8) C (4) G (48419) T (2)")
'G'

And to process a complete file:

>>> process_file("inputfile.txt", "outputfile.txt")

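If you want an actual Python list holding the key with the maximum value for every row, you can use the following (in Python 2, map returns a list; in Python 3, wrap the call in list()):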

>>> map(max_item, open("inputfile.txt"))
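
One detail of the (value, key) tuple trick that is easy to miss: when two letters share the highest count, the tuples are compared on the letter next, so the alphabetically later letter wins. The sketch below shows that behavior with the same pattern as above, next to a hypothetical variant, max_item_first_wins (not part of the answer), that keeps the first letter seen instead. It is written for Python 3.

```python
import re

# Same pattern as in the answer: one uppercase letter, then a number in parentheses.
KEYVAL = re.compile(r"([A-Z])\s*\((\d+)\)")

def max_item(row):
    # (count, letter) tuples compare by count first, then by letter.
    return max((int(v), k) for k, v in KEYVAL.findall(row))[1]

def max_item_first_wins(row):
    # Hypothetical variant: on a tie, keep the letter that appears first in the row.
    # Returns None when the row contains no "X (n)" pairs.
    best_key, best_val = None, -1
    for k, v in KEYVAL.findall(row):
        if int(v) > best_val:
            best_key, best_val = k, int(v)
    return best_key

print(max_item("A (7) C (7) G (3) T (2)"))             # C
print(max_item_first_wins("A (7) C (7) G (3) T (2)"))  # A
```

Which tie rule is right depends on the data; the question does not say, so treat both as options rather than a recommendation.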
萧瑟寒风 2024-10-22 23:40:35
max(d.itervalues())

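This returns the largest value (not its key). In Python 2 it iterates over the values directly instead of building the temporary list that d.values() creates, so it uses less memory and is usually somewhat faster on a large dictionary. In Python 3, itervalues() no longer exists and d.values() is already a lazy view.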
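
Since the question asks for the key ('G') and not the value, here is a minimal sketch of two ways to get it. The dictionary d is a hypothetical one built from the sample row. The key= argument of max() was added in Python 2.5, so on Python 2.4 only the tuple form applies; the code below is written for Python 3.

```python
# Hypothetical dictionary built from the sample row in the question.
d = {"A": 8, "C": 4, "G": 48419, "T": 2}

largest_value = max(d.values())  # the value only, not its key

# Python 2.5+ and Python 3: let max() compare the keys by their values.
key_by_get = max(d, key=d.get)

# Also valid on Python 2.4: compare (value, key) tuples and take the key.
key_by_tuple = max((v, k) for k, v in d.items())[1]

print(largest_value, key_by_get, key_by_tuple)  # 48419 G G
```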

鸠魁 2024-10-22 23:40:35

Try the following:

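st = "A (8) C (4) G (48419) T (2)"  # your start string
a = st.split(")")                                          # ['A (8', ' C (4', ' G (48419', ' T (2', '']
b = [x.replace("(", "").strip() for x in a if x.strip()]   # ['A 8', 'C 4', 'G 48419', 'T 2']
c = [x.split() for x in b]                                 # [['A', '8'], ['C', '4'], ...]
d = [(int(x[1]), x[0]) for x in c]                         # [(8, 'A'), (4, 'C'), ...]
max(d)[1]  # 'G'; max(d) alone gives the tuple (48419, 'G')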
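
The question mentions 10,000 rows, so the same split-based steps can be wrapped in a function and applied to each row. This is only a sketch: max_letter and the rows list are hypothetical names and sample data, and it is written for Python 3.

```python
def max_letter(row):
    # Split on ")" and skip empty or whitespace-only pieces (e.g. a trailing newline).
    pieces = [p.replace("(", "").split() for p in row.split(")") if p.strip()]
    # Each piece is [letter, count]; compare by count and return the letter.
    return max((int(count), letter) for letter, count in pieces)[1]

# Hypothetical sample rows, with newlines as they would come from a file.
rows = [
    "A (8) C (4) G (48419) T (2)\n",
    "A (120) C (4) G (19) T (2)\n",
]
winners = [max_letter(row) for row in rows]
print(winners)  # ['G', 'A']
```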
音栖息无 2024-10-22 23:40:35

Use a regular expression to parse the line. Then convert the matched groups to numbers, take the maximum, and look up the corresponding letter.

import re
# Raw string so the backslashes reach the regex engine unchanged.
r = re.compile(r'A \((\d+)\) C \((\d+)\) G \((\d+)\) T \((\d+)\)')
for line in my_file:  # my_file: an open file (or any iterable of lines)
  m = r.match(line)
  if not m:
    continue # or complain about invalid line
  # Pair each count with its group index so max() also tells us which letter won.
  value, n = max((int(value), n) for (n, value) in enumerate(m.groups()))
  print("%s %d" % ("ACGT"[n], value))  # parenthesized form works in Python 2 and 3
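
The loop above depends on a my_file object that is not defined in the answer. As a self-contained sketch, the version below collects the results in a list and uses io.StringIO with made-up sample lines as a stand-in for an open file; best_per_line and the sample data are assumptions for illustration. It is written for Python 3.

```python
import io
import re

# Same fixed-order pattern as in the answer, written as a raw string.
ROW = re.compile(r"A \((\d+)\) C \((\d+)\) G \((\d+)\) T \((\d+)\)")

def best_per_line(lines):
    results = []
    for line in lines:
        m = ROW.match(line)
        if not m:
            continue  # skip lines that do not match the expected layout
        # Pair each count with its group index to recover the letter afterwards.
        value, n = max((int(v), i) for i, v in enumerate(m.groups()))
        results.append(("ACGT"[n], value))
    return results

# Hypothetical sample data; io.StringIO stands in for an open file.
sample = io.StringIO(
    "A (8) C (4) G (48419) T (2)\n"
    "not a data line\n"
    "A (9) C (1) G (0) T (3)\n"
)
print(best_per_line(sample))  # [('G', 48419), ('A', 9)]
```

Note that this pattern only matches rows that list A, C, G and T in exactly that order, which is stricter than the generic letter-and-count pattern.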
晌融 2024-10-22 23:40:35
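row = "A (8) C (4) G (48419) T (2)"

# Strip the parentheses and split on whitespace.
lst = row.replace("(", "").replace(")", "").split()  # ['A', '8', 'C', '4', 'G', '48419', 'T', '2']

# Pair every letter (even positions) with its count (odd positions).
dd = dict(zip(lst[0::2], map(int, lst[1::2])))  # {'A': 8, 'C': 4, 'G': 48419, 'T': 2}

# Compare [count, letter] pairs and return the letter of the largest one.
max(map(lambda k: [dd[k], k], dd))[1]  # 'G'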
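
If both the letter and its count are wanted, one possible alternative to the lambda is operator.itemgetter on the dictionary's items. This is a sketch for Python 3 (the key= argument of max() is not available on Python 2.4); the names tokens and counts are illustrative.

```python
from operator import itemgetter

row = "A (8) C (4) G (48419) T (2)"
tokens = row.replace("(", "").replace(")", "").split()
counts = dict(zip(tokens[0::2], map(int, tokens[1::2])))

# items() yields (letter, count) pairs; itemgetter(1) compares them by count.
letter, count = max(counts.items(), key=itemgetter(1))
print(letter, count)  # G 48419
```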