Venn diagram from lists of sentences

Posted 2024-08-06 07:02:44

I have many sentences in Excel, one per row in a column, and I have three or more columns like this. Some sentences appear in more than one column. Is it possible to write a script that builds a Venn diagram and finds the sentences common to all of the columns?

Example: these are the sentences in one column. The other columns are similar.

Blood lymphocytes from cancer

Blood lymphocytes from patients

Ovarian tumor_Grade III

Peritoneum tumor_Grade IV

Hormone resistant PCA

Is it possible to write this script in Python?

Comments (2)

掩饰不了的爱 2024-08-13 07:02:44

This is my interpretation of the question...

Given the data file z.csv (export your data from Excel to a CSV file):

"Blood lymphocytes from cancer","Blood lymphocytes from sausages","Ovarian tumor_Grade III"
"Blood lymphocytes from patients","Ovarian tumor_Grade III","Peritoneum tumor_Grade IV"
"Ovarian tumor_Grade III","Peritoneum tumor_Grade IV","Hormone resistant PCA"
"Peritoneum tumor_Grade XV","Hormone resistant PCA","Blood lymphocytes from cancer"
"Hormone resistant PCA",,"Blood lymphocytes from patients"

This program finds the sentences common to all the columns

import csv

# A list of 3 sets of sentences, one set per column
results = [set(), set(), set()]

# Read the csv file into the 3 sets
with open("z.csv", newline="") as f:
    for row in csv.reader(f):
        for i, data in enumerate(row):
            if data:  # skip empty cells
                results[i].add(data)

# Work out the sentences common to all columns
intersection = results[0]
for result in results[1:]:
    intersection = intersection.intersection(result)

# Sets are unordered, so sort for a stable listing
print("Common to all columns :-")
for data in sorted(intersection):
    print(data)

And it prints this answer

Common to all columns :-
Hormone resistant PCA
Ovarian tumor_Grade III

I'm not 100% sure that is what you are looking for, but hopefully it gets you started!

It could easily be generalised to as many columns as you like, but I didn't want to make it more complicated.
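One way the generalisation mentioned above might look is sketched below. The helper names column_sets and common_to_all are made up for this illustration, and the rows are written inline instead of being read from z.csv; in practice they would come from csv.reader. Short rows are padded with empty cells, and empty cells are ignored.

```python
from itertools import zip_longest


def column_sets(rows):
    """Return one set of non-empty sentences per column, for any number of columns."""
    # zip_longest transposes rows into columns, padding short rows with "".
    columns = zip_longest(*rows, fillvalue="")
    return [{cell.strip() for cell in column if cell.strip()} for column in columns]


def common_to_all(sets):
    """Return the sentences that appear in every set (empty if there are no sets)."""
    return set.intersection(*sets) if sets else set()


# Same data as z.csv above, written inline so the sketch runs on its own.
rows = [
    ["Blood lymphocytes from cancer", "Blood lymphocytes from sausages", "Ovarian tumor_Grade III"],
    ["Blood lymphocytes from patients", "Ovarian tumor_Grade III", "Peritoneum tumor_Grade IV"],
    ["Ovarian tumor_Grade III", "Peritoneum tumor_Grade IV", "Hormone resistant PCA"],
    ["Peritoneum tumor_Grade XV", "Hormone resistant PCA", "Blood lymphocytes from cancer"],
    ["Hormone resistant PCA", "", "Blood lymphocytes from patients"],
]

common = common_to_all(column_sets(rows))
for sentence in sorted(common):
    print(sentence)
```

Because nothing here depends on the number of columns, a fourth or fifth column in the spreadsheet should need no code changes.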

榆西 2024-08-13 07:02:44

Your question is not fully clear, so I might be misunderstanding what you're looking for.

A Venn diagram is just a few simple set operations, and Python has these built into its set type. Basically, take your groups of items and apply set operations to them (e.g. use intersection to find the common items).
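As a rough illustration of the set operations described above, the sketch below works out which sentences fall into each region of a three-set Venn diagram. The column names A, B and C, the way the sentences are assigned to them, and the helper venn_regions are all assumptions made for this example. Actually drawing the diagram would need a plotting package, which is not shown here.

```python
from collections import defaultdict


def venn_regions(named_sets):
    """Map each combination of set names to the items found in exactly those sets."""
    regions = defaultdict(set)
    for item in set().union(*named_sets.values()):
        # The region an item belongs to is the group of sets that contain it.
        members = frozenset(name for name, items in named_sets.items() if item in items)
        regions[members].add(item)
    return dict(regions)


# Hypothetical assignment of the question's sentences to three columns.
columns = {
    "A": {"Blood lymphocytes from cancer", "Ovarian tumor_Grade III", "Hormone resistant PCA"},
    "B": {"Blood lymphocytes from patients", "Ovarian tumor_Grade III", "Hormone resistant PCA"},
    "C": {"Peritoneum tumor_Grade IV", "Ovarian tumor_Grade III"},
}

# Plain set operators cover the simple cases.
in_all = columns["A"] & columns["B"] & columns["C"]   # intersection
only_in_a = columns["A"] - columns["B"] - columns["C"]  # difference

regions = venn_regions(columns)
for names, items in sorted(regions.items(), key=lambda kv: sorted(kv[0])):
    print("+".join(sorted(names)), "->", sorted(items))
```

The region keyed by all three names is the centre of the diagram, which is the "common to all" answer the question asks for.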

To read in the data, your best bet is probably to save the file in CSV format and just parse it with the string split method.
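One caveat on parsing with split: it works as long as no sentence contains a comma, but Excel wraps such cells in quotes when exporting, and a plain split would then cut the sentence in two. The sketch below compares split with the standard library's csv module on a made-up line (the comma inside the first sentence is an assumption for the demonstration and is not from the question's data).

```python
import csv
import io

# Hypothetical exported line where one sentence contains a comma.
line = '"Ovarian tumor, Grade III","Hormone resistant PCA"\n'

# Plain split breaks the quoted sentence apart and keeps the quote characters.
naive = line.strip().split(",")

# csv.reader understands the quoting and returns the two sentences intact.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(parsed), parsed)
```

If the sentences are known to be free of commas and quotes, split gives the same result and is simpler.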
