Put bar at the end of every line that includes foo

Posted on 2024-08-06 06:15:45

I have a list with a large number of lines, each taking the subject-verb-object form, e.g.:

Jane likes Fred
Chris dislikes Joe
Nate knows Jill

To plot a network graph that shows the different relationships between the nodes as directed, color-coded edges, I need to replace each verb with an arrow and put a color code at the end of each line. Somewhat simplified:

Jane -> Fred red;
Chris -> Joe blue;
Nate -> Jill black;

There's only a small number of verbs, so replacing them with an arrow is just a matter of a few search and replace commands. Before doing that, however, I will need to put a color code at the end of every line that corresponds to the line's verb. I'd like to do this using Python.

These are my baby steps in programming, so please be explicit and include the code that reads in the text file.

Thanks for your help!

Comments (7)

别想她 2024-08-13 06:15:45

It sounds like you will want to research dictionaries and string formatting. In general, if you need help programming, just break down any problem you have into extremely small, discrete chunks, search those chunks independently, and then you should be able to formulate it all into a larger answer. Stack Overflow is a great resource for this type of searching.

Also, if you have any general curiosity about Python, search or browse the official Python documentation. If you find yourself constantly not knowing where to begin, read the Python tutorial or find a book to work through. Investing a week or two in a good foundational knowledge of what you are doing will pay off over and over again as you complete work.

verb_color_map = {
    'likes': 'red',
    'dislikes': 'blue',
    'knows': 'black',
}

with open('infile.txt') as infile: # assuming you've stored your data in 'infile.txt'
    for line in infile:
        # Python uses the name object, so I use object_
        subject, verb, object_ = line.split()
        print("%s -> %s %s;" % (subject, object_, verb_color_map[verb]))
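
As a hedged variation on the loop above (Python 3 syntax; the function name `format_edge` and the fallback color are assumptions made for illustration, not part of the original answer), `dict.get` avoids a `KeyError` when a line contains a verb that is missing from the map, and lines without exactly three words are skipped:

```python
verb_color_map = {
    'likes': 'red',
    'dislikes': 'blue',
    'knows': 'black',
}

def format_edge(line, color_map, default_color='gray'):
    """Turn 'Jane likes Fred' into 'Jane -> Fred red;'.

    Returns None for lines that do not have exactly three words.
    default_color is a hypothetical fallback for unknown verbs.
    """
    words = line.split()
    if len(words) != 3:
        return None
    subject, verb, object_ = words
    color = color_map.get(verb, default_color)
    return '{} -> {} {};'.format(subject, object_, color)

if __name__ == '__main__':
    # sample lines standing in for the contents of the input file
    sample_lines = ['Jane likes Fred\n', '\n', 'Nate admires Jill\n']
    for line in sample_lines:
        edge = format_edge(line, verb_color_map)
        if edge is not None:
            print(edge)
```

Whether an unknown verb should get a fallback color or stop the script with an error depends on how clean the input is; the fallback is just one option.
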
稀香 2024-08-13 06:15:45

Simple enough; assuming the list of verbs is fixed and small, this is easy to do with a dictionary and a for loop:

VERBS = {
    "likes": "red"
  , "dislikes": "blue"
  , "knows": "black"
  }

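def replace_verb(line):
    words = line.split()  # split() also drops the trailing newline
    for i, word in enumerate(words):
        # compare whole words, so "likes" never matches inside "dislikes"
        if word in VERBS:
            words[i] = "->"
            return "%s %s;" % (" ".join(words), VERBS[word])
    return line.strip()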

def main ():
    filename = "my_file.txt"
    with open (filename, "r") as fp:
        for line in fp:
            print(replace_verb(line))

# Allow the module to be executed directly on the command line
if __name__ == "__main__":
    main ()
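
A point worth keeping in mind when matching verbs: a plain substring test also finds "likes" inside "dislikes", so the result can depend on the order in which the dictionary is traversed. A minimal sketch in Python 3 (`find_verb` is a hypothetical helper name) that compares whole words instead:

```python
VERBS = {"likes": "red", "dislikes": "blue", "knows": "black"}

def find_verb(line, verbs=VERBS):
    """Return the first whole word of line that is a known verb, or None."""
    for word in line.split():
        if word in verbs:
            return word
    return None

line = "Chris dislikes Joe"
print("likes" in line)   # True: a substring match, misleading here
print(find_verb(line))   # dislikes
```
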
魂牵梦绕锁你心扉 2024-08-13 06:15:45
verbs = {"dislikes": "blue", "knows": "black", "likes": "red"}
for s in open("/tmp/infile"):
    words = s.split()
    for verb in verbs:
        # compare whole words, so "likes" never matches inside "dislikes"
        if verb in words:
            words[words.index(verb)] = "->"
            print(" ".join(words) + " " + verbs[verb] + ";")
            break

Edit: better to iterate with "for s in open(...)".

情话墙 2024-08-13 06:15:45

Are you sure this isn't homework? :) If so, it's okay to fess up. Without going into too much detail, think about the tasks you're trying to do:

For each line:

  1. read it
  2. split it into words (on whitespace - .split() )
  3. convert the middle word into a color (based on a mapping; cf. Python's dict())
  4. print the first word, arrow, third word and the color
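
The four steps above can be sketched as follows. This is a minimal Python 3 illustration, not the answerer's code: the names `COLOR_BY_VERB` and `convert` are made up, and `io.StringIO` stands in for an opened text file.

```python
import io

COLOR_BY_VERB = {"likes": "red", "dislikes": "blue", "knows": "black"}

def convert(lines, color_by_verb=COLOR_BY_VERB):
    """Yield one 'subject -> object color;' string per three-word line."""
    for line in lines:                      # 1. read each line
        words = line.split()                # 2. split on whitespace
        if len(words) != 3:
            continue                        # skip blank or unexpected lines
        first, middle, third = words
        color = color_by_verb[middle]       # 3. middle word -> color
        yield "%s -> %s %s;" % (first, third, color)   # 4. build the output

# io.StringIO stands in for an opened text file, e.g. open("infile.txt")
fake_file = io.StringIO("Jane likes Fred\nChris dislikes Joe\n\nNate knows Jill\n")
for edge in convert(fake_file):
    print(edge)
```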

Code using NetworkX (networkx.lanl.gov/)

'''
plot relationships in a social network
'''

import networkx
## make a fake file 'ex.txt' in this directory
## then write fake relationships to it.
# file() and "print >>" exist only in Python 2; open() and write() work in both
with open('ex.txt', 'w') as example_relationships:
    example_relationships.write('''\
Jane Doe likes Fred
Chris dislikes Joe
Nate knows Jill
''')

rel_colors = {
    'likes':  'blue',
    'dislikes' : 'black',
    'knows'   : 'green',
}

def split_on_verb(sentence):
    ''' we know the verb is the only lower cased word

    >>> split_on_verb("Jane Doe likes Fred")
    ('Jane Doe', 'Fred', 'likes')

    '''
    words = sentence.strip().split()  # take off any outside whitespace, then split
                                       # on whitespace
    if not words:
        return None  # if there aren't any words, just return nothing

    verbs = [x for x in words if x.islower()]
    verb = verbs[0]  # we want the '1st' one (python numbers from 0,1,2...)
    verb_index = words.index(verb) # where is the verb?
    subject = ' '.join(words[:verb_index])
    obj =  ' '.join(words[(verb_index+1):])  # 'object' is already used in python
    return (subject, obj, verb)


def graph_from_relationships(fh,color_dict):
    '''
    fh:  a filehandle, i.e., an opened file, from which we can read lines
        and loop over
    '''
    G = networkx.DiGraph()

    for line in fh:
        if not line.strip():  continue # move on to the next line,
                                         # if our line is empty-ish
        (subj,obj,verb) = split_on_verb(line)
        color = color_dict[verb]
        # cf: python 'string templates', there are other solutions here
        # this is the 
        print("'%s' -> '%s' [color='%s'];" % (subj, obj, color))
        G.add_edge(subj, obj, color=color)  # store the color as an edge attribute
        # 

    return G

G = graph_from_relationships(open('ex.txt'), rel_colors)
print(G.edges())
# from here you can use the various networkx plotting tools on G, as you're inclined.
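
If the goal is a file that Graphviz can render, the edge lines printed above are close to DOT syntax. The following is a minimal Python 3 sketch under that assumption; the helper name `to_dot` and the graph name are made up for illustration, and names are wrapped in double quotes because that is how DOT quotes identifiers containing spaces.

```python
def to_dot(edges, graph_name="relationships"):
    """Build DOT source from (subject, object, color) tuples."""
    lines = ["digraph %s {" % graph_name]
    for subj, obj, color in edges:
        lines.append('    "%s" -> "%s" [color="%s"];' % (subj, obj, color))
    lines.append("}")
    return "\n".join(lines)

# sample edges, using the colors from rel_colors above
edges = [("Jane Doe", "Fred", "blue"), ("Chris", "Joe", "black")]
print(to_dot(edges))
```

Saving that text to a file and running Graphviz's `dot` on it (for example `dot -Tpng` with an input and output file of your choosing) would then produce the picture.
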
甜心 2024-08-13 06:15:45

Python 2.5:

import sys
from collections import defaultdict

codes = defaultdict(lambda: ("---", "Missing action!"))
codes["likes"] =    ("-->", "red")
codes["dislikes"] = ("-/>", "green")
codes["loves"] =    ("==>", "blue")

for line in sys.stdin:
    subject, verb, object_ = line.strip().split(" ")
    arrow, color = codes[verb]
    print("%s %s %s %s ;" % (subject, arrow, object_, color))
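
A short sketch of how the `defaultdict` fallback above behaves (Python 3 syntax; the `describe` helper is hypothetical): a verb that was never registered yields the default tuple instead of raising `KeyError`, and the lookup also stores that default under the new key.

```python
from collections import defaultdict

codes = defaultdict(lambda: ("---", "Missing action!"))
codes["likes"] = ("-->", "red")
codes["dislikes"] = ("-/>", "green")

def describe(line):
    """Format one 'subject verb object' line using the codes table."""
    subject, verb, object_ = line.split()
    arrow, color = codes[verb]
    return "%s %s %s %s ;" % (subject, arrow, object_, color)

print(describe("Jane likes Fred"))   # Jane --> Fred red ;
print(describe("Nate knows Jill"))   # Nate --- Jill Missing action! ;
print("knows" in codes)              # True: the failed lookup inserted the key
```

Because the script above reads `sys.stdin`, it would be run with the input redirected, for example `python script.py < infile.txt` (both file names are placeholders).
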
合约呢 2024-08-13 06:15:45

In addition to the question, Karasu also said (in a comment on one answer): "In the actual input both subjects and objects vary unpredictably between one and two words."

Okay, here's how I would solve this.

color_map = \
{
    "likes" : "red",
    "dislikes" : "blue",
    "knows" : "black",
}

def is_verb(word):
    return word in color_map

def make_noun(lst):
    if not lst:
        return "--NONE--"
    elif len(lst) == 1:
        return lst[0]
    else:
        return "_".join(lst)


for line in open("filename").readlines():
    words = line.split()
    # subject could be one or two words
    if is_verb(words[1]):
        # subject was one word
        s = words[0]
        v = words[1]
        o = make_noun(words[2:])
    else:
        # subject was two words
        assert is_verb(words[2])
        s = make_noun(words[0:2])
        v = words[2]
        o = make_noun(words[3:])
    color = color_map[v]
    print("%s -> %s %s;" % (s, o, color))

Some notes:

0) We don't really need "with" for this problem, and writing it this way makes the program more portable to older versions of Python. This should work on Python 2.2 and newer, I think (I only tested on Python 2.6).

1) You can change make_noun() to have whatever strategy you deem useful for handling multiple words. I showed just chaining them together with underscores, but you could have a dictionary with adjectives and throw those out, have a dictionary of nouns and choose those, or whatever.

2) You could also use regular expressions for fuzzier matching. Instead of simply using a dictionary for color_map you could have a list of tuples, with a regular expression paired with the replacement color, and then when the regular expression matches, replace the color.
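
Along the lines of note 1, here is one possible generalization, sketched in Python 3 with a hypothetical `split_line` helper: instead of checking positions 1 and 2 explicitly, it searches for the first word that is a known verb, so subject and object can each be any number of words.

```python
color_map = {"likes": "red", "dislikes": "blue", "knows": "black"}

def split_line(line):
    """Return (subject, verb, object) or None if no known verb is found.

    Multi-word names are joined with underscores, as make_noun() does.
    """
    words = line.split()
    for i, word in enumerate(words):
        if word in color_map:
            subject = "_".join(words[:i])
            obj = "_".join(words[i + 1:])
            return subject, word, obj
    return None

for line in ["Jane Doe likes Fred", "Nate knows Jill Smith", "no verb here"]:
    parts = split_line(line)
    if parts is None:
        print("skipped: %s" % line)
    else:
        s, v, o = parts
        print("%s -> %s %s;" % (s, o, color_map[v]))
```

This assumes that no name is itself spelled like one of the verbs; if that can happen, the positional checks above are the safer choice.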

不气馁 2024-08-13 06:15:45

Here is an improved version of my previous answer. This one uses regular expression matching to make a fuzzy match on the verb. These all work:

Steve loves Denise
Bears love honey
Maria interested Anders
Maria interests Anders

The regular expression pattern "loves?" matches "love" plus an optional 's'. The pattern "interest.*" matches "interest" plus anything. Patterns with multiple alternatives separated by vertical bars match if any one of the alternatives matches.

import re

re_map = \
[
    ("likes?|loves?|interest.*", "red"),
    ("dislikes?|hates?", "blue"),
    ("knows?|tolerates?|ignores?", "black"),
]

# compile the regular expressions one time, then use many times
pat_map = [(re.compile(s), color) for s, color in re_map]

# We don't use is_verb() in this version, but here it is.
# A word is a verb if any of the patterns match.
def is_verb(word):
    return any(pat.match(word) for pat, color in pat_map)

# Return color from matched verb, or None if no match.
# This detects whether a word is a verb, and looks up the color, at the same time.
def color_from_verb(word):
    for pat, color in pat_map:
        if pat.match(word):
            return color
    return None

def make_noun(lst):
    if not lst:
        return "--NONE--"
    elif len(lst) == 1:
        return lst[0]
    else:
        return "_".join(lst)


for line in open("filename"):
    words = line.split()
    # subject could be one or two words
    color = color_from_verb(words[1])
    if color:
        # subject was one word
        s = words[0]
        o = make_noun(words[2:])
    else:
        # subject was two words
        color = color_from_verb(words[2])  # the verb is the third word here
        assert color
        s = make_noun(words[0:2])
        o = make_noun(words[3:])
    print("%s -> %s %s;" % (s, o, color))
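
A detail of the matching used above that may matter in practice: `re.match` only anchors at the start of the word, so a pattern such as "loves?" also accepts longer words that merely begin with "love". The sketch below (Python 3; `re.fullmatch` needs Python 3.4 or newer) contrasts the two calls on a few sample words chosen for illustration:

```python
import re

pattern = re.compile(r"likes?|loves?|interest.*")

for word in ["loves", "love", "lovely", "interested", "dislikes"]:
    prefix_hit = pattern.match(word) is not None      # anchored at the start only
    whole_hit = pattern.fullmatch(word) is not None   # must cover the whole word
    print("%-10s match=%-5s fullmatch=%s" % (word, prefix_hit, whole_hit))
```

With `match`, "lovely" counts as a verb; with `fullmatch` it does not, while "dislikes" is rejected by both because the pattern has to start matching at the first character.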

I hope it is clear how to take this answer and extend it. You can easily add more patterns to match more verbs. You could add logic to detect "is" and "in" and discard them, so that "Anders is interested in Maria" would match. And so on.
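
The last suggestion could look roughly like this. It is only a sketch in Python 3: the `FILLER_WORDS` set and the `find_color` and `convert` helpers are assumptions made for illustration, and the pattern list is copied from the code above.

```python
import re

re_map = [
    ("likes?|loves?|interest.*", "red"),
    ("dislikes?|hates?", "blue"),
    ("knows?|tolerates?|ignores?", "black"),
]
pat_map = [(re.compile(s), color) for s, color in re_map]

# hypothetical set of words to drop before looking for the verb
FILLER_WORDS = {"is", "in", "are"}

def find_color(word):
    """Return the color of the first pattern matching word, else None."""
    for pat, color in pat_map:
        if pat.match(word):
            return color
    return None

def convert(line):
    """Return 'subject -> object color;' or None if no verb is recognized."""
    words = [w for w in line.split() if w not in FILLER_WORDS]
    for i, word in enumerate(words):
        color = find_color(word)
        # the verb needs at least one word on each side
        if color and 0 < i < len(words) - 1:
            subject = "_".join(words[:i])
            obj = "_".join(words[i + 1:])
            return "%s -> %s %s;" % (subject, obj, color)
    return None

print(convert("Anders is interested in Maria"))
print(convert("Maria interests Anders"))
```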

If you have any questions, I'd be happy to explain this further. Good luck.
