
Creating word lists from medical journals

Posted on 2025-02-06 08:43:13

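I have been asked to compile a crossword for a surgeons' publication, which comes out quarterly. I need to make it medically oriented, preferably using words from different specialties. For example, some would come from orthopaedics, some from cardiac surgery, some from human anatomy, and so on. I can get surgical journals online.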

I want to create a word list for each specialty and use the lists in the compiler. I will be using Crossword Compiler.

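I can use journal articles on the web, or I can download PDFs. I am a surgeon and use pandas for data analysis, but my Python skills are fairly basic, so I need relatively simple solutions. How can I create a specific word list for each surgical specialty?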

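The words do not need to be very specific. For example, I thought I could scrape the words from a journal volume, compare them against a list of common words, and remove the common ones, which would leave me with a list of technical words. It may take some trial and error. I have not used Beautiful Soup before, but I am willing to try it.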

Alternatively, I could skip the Beautiful Soup step, use EndNote to download a few hundred journal articles, and export them to TXT.

It is the extraction and the list-making that I think I am mainly struggling to conceptualise.


叹沉浮 2025-02-13 08:43:13

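I created this program, which you can use to parse a .txt file and find the most common words. I also included a block of code that will help you convert a .pdf file to .txt. I hope this approach helps, and good luck with your crossword for the surgeons' publication!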

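'''
Find the most common words in a txt file
'''

import collections
# The re module provides regular expression matching operations
import re

'''
Use this if you would like to convert a PDF to a txt file
(uncomment the lines below; requires the PyPDF2 package)
'''
# import PyPDF2
# pdffileobj = open('textFileName.pdf', 'rb')
# # PyPDF2 3.x names; older releases used PdfFileReader, numPages, getPage and extractText
# pdfreader = PyPDF2.PdfReader(pdffileobj)
# # Collect the text of every page, not only the last one
# text = '\n'.join((page.extract_text() or '') for page in pdfreader.pages)
# pdffileobj.close()

# # "a" appends, so running this for several PDFs builds up one txt file
# with open(r"(folder path)\\textFileName.txt", "a", encoding="utf-8") as file1:
#     file1.write(text)

# Read the whole file, lowercase it, and split it into runs of word characters
with open('textFileName.txt', encoding='utf-8') as txtfile:
    words = re.findall(r'\w+', txtfile.read().lower())

# Count the words and print the 10 most frequent as (word, count) pairs
most_common = collections.Counter(words).most_common(10)
print(most_common)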
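
The raw counts above will mostly be dominated by everyday words ("the", "of", "and"), so the step described in the question, removing anything that appears in a common-word list, still has to be added. Below is a minimal sketch of that idea, not a definitive implementation. The names `COMMON_WORDS`, `technical_words` and `build_lists` are hypothetical, the tiny common-word set is only a stand-in for a real list you would load from a file, and the sample sentences are made up for illustration.

```python
import collections
import re

# Assumption: a tiny stand-in for a real common-word list; in practice you
# would load a much larger list from a file of your choosing.
COMMON_WORDS = {"the", "of", "and", "a", "in", "to", "was", "with",
                "for", "is", "were", "patients"}


def technical_words(text, common_words=COMMON_WORDS, min_length=4, top_n=20):
    """Return the most frequent words in `text` that are not in `common_words`."""
    # Letters only, so digits and underscores (which \w would match) are skipped
    words = re.findall(r"[a-z]+", text.lower())
    counts = collections.Counter(
        w for w in words if len(w) >= min_length and w not in common_words
    )
    return [word for word, _ in counts.most_common(top_n)]


def build_lists(texts_by_specialty, **kwargs):
    """Map each specialty name to its own technical word list."""
    return {name: technical_words(text, **kwargs)
            for name, text in texts_by_specialty.items()}


if __name__ == "__main__":
    # Hypothetical sample text; replace with the contents of your exported .txt files
    samples = {
        "orthopaedics": "The femur fracture was fixed with a plate. Femur fractures heal.",
        "cardiac": "The aorta and the mitral valve were repaired. The valve was replaced.",
    }
    for specialty, word_list in build_lists(samples).items():
        print(specialty, word_list)
```

With one exported .txt file per specialty, you could read each file into a string and pass a dictionary of those strings to `build_lists`. Raising `min_length` or enlarging the common-word list is probably where the trial and error mentioned in the question comes in.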

