English dictionary as a txt or XML file, with synonym support

Posted on 2024-08-29 13:08:49

Comments (4)

蓝梦月影 2024-09-05 13:08:49

WordNet is what you want. It's big, containing over a hundred thousand entries, and it's freely available.

However, it's not stored as XML. To access the data, you'll want to use one of the existing WordNet APIs for your language of choice.

Using the APIs is generally pretty straightforward, so I don't think you have to worry much about "learning (a) complex API". For example, borrowing from the WordNet HOWTO for the Python-based Natural Language Toolkit (NLTK):

 >>> # Requires the WordNet corpus: run nltk.download('wordnet') once
 >>> from nltk.corpus import wordnet
 >>> 
 >>> # Get all synsets for 'dog'
 >>> # This is essentially all senses of the word in the db
 >>> wordnet.synsets('dog')
 [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), 
  Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), 
  Synset('andiron.n.01'), Synset('chase.v.01')]
 
 >>> # Get the definition and usage for the first synset
 >>> # (in NLTK 3.x, definition, examples and lemmas are methods)
 >>> wordnet.synset('dog.n.01').definition()
 'a member of the genus Canis (probably descended from the common 
 wolf) that has been domesticated by man since prehistoric times; 
 occurs in many breeds'
 >>> wordnet.synset('dog.n.01').examples()
 ['the dog barked all night']

 >>> # Get antonyms for 'good'
 >>> wordnet.synset('good.a.01').lemmas()[0].antonyms()
 [Lemma('bad.a.01.bad')]

 >>> # Get synonyms for the first noun sense of 'dog'
 >>> wordnet.synset('dog.n.01').lemmas()
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), 
 Lemma('dog.n.01.Canis_familiaris')]

 >>> # Get synonyms for all senses of 'dog'
 >>> for synset in wordnet.synsets('dog'): print(synset.lemmas())
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), 
 Lemma('dog.n.01.Canis_familiaris')]
 ...
 [Lemma('frank.n.02.frank'), Lemma('frank.n.02.frankfurter'), 
 ...
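
If you only want plain synonym strings rather than Lemma objects, the lookups above can be wrapped in a small helper. This is a minimal sketch, assuming NLTK 3.x (where `lemmas()` and `name()` are methods); `collect_synonyms` and `synonyms_for` are hypothetical names for illustration, not part of NLTK.

```python
def collect_synonyms(synsets):
    """Return unique lemma names from synset-like objects, in first-seen order."""
    seen = {}
    for synset in synsets:
        for lemma in synset.lemmas():
            # WordNet joins multi-word lemmas with underscores; show them as spaces
            seen.setdefault(lemma.name().replace('_', ' '), None)
    return list(seen)


def synonyms_for(word):
    """Look up `word` in WordNet (needs nltk and its 'wordnet' corpus installed)."""
    # Imported lazily so collect_synonyms can be used and tested without NLTK
    from nltk.corpus import wordnet
    return collect_synonyms(wordnet.synsets(word))
```

Calling `synonyms_for('dog')` should then give one flat list covering every sense of the word, with the word itself included.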

While there is an American English bias in WordNet, it supports British spellings and usage. For example, you can look up 'colour' and one of the synsets for 'lift' is 'elevator.n.01'.

Notes on XML

If having the data represented as XML is essential, you could use one of the APIs to read the WordNet database and convert it into XML yourself; see, for example, "Thinking XML: Querying WordNet as XML".
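
As a rough illustration of that conversion, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The element names (`wordnet`, `synset`, `definition`, `lemma`) and the helpers `synsets_to_xml` and `records_for` are assumptions made up for this example, not an established WordNet XML schema; `records_for` assumes NLTK 3.x.

```python
import xml.etree.ElementTree as ET


def synsets_to_xml(records):
    """Build an XML string from (synset_name, definition, lemma_names) tuples."""
    root = ET.Element('wordnet')  # hypothetical root element name
    for name, definition, lemma_names in records:
        node = ET.SubElement(root, 'synset', name=name)
        ET.SubElement(node, 'definition').text = definition
        for lemma_name in lemma_names:
            ET.SubElement(node, 'lemma').text = lemma_name
    return ET.tostring(root, encoding='unicode')


def records_for(word):
    """Yield records for `word` from WordNet (needs nltk and its 'wordnet' corpus)."""
    # Imported lazily so synsets_to_xml works without NLTK installed
    from nltk.corpus import wordnet
    for synset in wordnet.synsets(word):
        lemma_names = [lemma.name() for lemma in synset.lemmas()]
        yield synset.name(), synset.definition(), lemma_names
```

With NLTK and the corpus installed, `print(synsets_to_xml(records_for('dog')))` would emit one `synset` element per sense of the word.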

别靠近我心 2024-09-05 13:08:49

I know this question is quite old, but I had trouble finding this as a txt file myself. If anyone is looking for a synonyms and antonyms database as a txt file, the simplest yet very detailed one to try is:
https://ia801407.us.archive.org/10/items/synonymsantonyms00ordwiala/synonymsantonyms00ordwiala_djvu.txt

和我恋爱吧 2024-09-05 13:08:49

I have used Roget's thesaurus in the past. It has the synonymy information in plain text files. There is also some Java code to help you parse the text.

These pages provide links to a bunch of thesauri and lexical resources, some of which are freely downloadable.

http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html

http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/lex.html

孤星 2024-09-05 13:08:49

Try WordNet.
