Terminal crashes when launching a fastText-based Python script for French spelling correction

Posted on 2025-02-07 06:05:45

Background

I am trying to run, from the command line, a Python script for French spelling correction based on fastText (taken from a tutorial).

What I have done

  • Downloaded the fastText models (bin and text).
  • Built fastText as a command-line tool:
$ git clone https://github.com/facebookresearch/fastText.git
$ cd fastText
$ make
  • Tried to run the code below.

Code

script.py

import io
import fasttext

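def load_vectors(fname):
    # Read a fastText .vec file into a {word: vector} dict.
    with io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore') as fin:
        n, d = map(int, fin.readline().split())  # header: vocabulary size, dimension
        data = {}
        for line in fin:
            tokens = line.rstrip().split(' ')
            # list() materializes the vector; a bare map() is a one-shot iterator in Python 3.
            data[tokens[0]] = list(map(float, tokens[1:]))
    return data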

def spelltest(tests, model, vocab):
    "Run correction(wrong) on all (right, wrong) pairs; report results."
    import time
    start = time.perf_counter()  # time.clock() was removed in Python 3.8
    good = 0
    n = len(tests)
    for right, wrong in tests:
        w = wrong
        if w in vocab:
            print('word: {} exists in the vocabulary. No correction required'.format(w))
        else:
            w_old = w
            # get_nearest_neighbors returns (similarity, word) pairs; keep the top word.
            w = model.get_nearest_neighbors(w, k=1)[0][1]
            print("found replacement: {} for word: {}".format(w, w_old))
        good += (w == right)
    dt = time.perf_counter() - start
    print('{:.0%} of {} correct at {:.0f} words per second '
          .format(good / n, n, n / dt))

def Testset(lines):
    "Parse 'right: wrong1 wrong2' lines into [('right', 'wrong1'), ('right', 'wrong2')] pairs."
    return [(right, wrong)
            for (right, wrongs) in (line.split(':') for line in lines)
            for wrong in wrongs.split()]

if __name__ == "__main__":
    model = fasttext.load_model("cc.fr.300.bin")
    vocab = load_vectors("cc.fr.300.vec")

    # Close the test file once it has been parsed.
    with open('Memoires_secrets_09.txt', encoding='utf-8') as f:
        tests = Testset(f)
    spelltest(tests, model, vocab)
    #spelltest(Testset(open('spell-testset2.txt')), model, vocab)
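Testset expects every input line to look like "right: wrong1 wrong2", and a line without exactly one colon makes the tuple unpacking raise ValueError. Whether Memoires_secrets_09.txt follows that layout is not stated in the post. The sketch below shows the expected format on hand-written sample data; the name parse_testset and the choice to skip non-matching lines are assumptions made for illustration, not part of the original script.

```python
def parse_testset(lines):
    """Parse 'right: wrong1 wrong2' lines into (right, wrong) pairs.

    Lines without exactly one ':' are skipped instead of raising
    ValueError (a choice made for this sketch, not in the original).
    """
    pairs = []
    for line in lines:
        if line.count(':') != 1:
            continue  # blank line or free text, not a test entry
        right, wrongs = line.split(':')
        pairs.extend((right.strip(), wrong) for wrong in wrongs.split())
    return pairs


# Hand-written sample input (hypothetical data).
sample = [
    "bonjour: bonjur bonjoor\n",
    "\n",
    "Ceci est une phrase ordinaire.\n",
    "merci: mercie\n",
]
pairs = parse_testset(sample)
print(pairs)
```

Passing an open file object instead of the sample list works the same way, since both yield one line per iteration.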

Error

The terminal throws a warning and crashes when running script.py.

Warning : `load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar.
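The post does not include any output after the warning, so the cause of the crash is not shown. One possibility, stated here as an assumption, is memory pressure: the script loads the full binary model and also reads every 300-dimensional vector from the .vec file, although spelltest only checks whether a word is present. A minimal sketch of a lighter loader that keeps only the words follows; load_vocab is a hypothetical helper, and the tiny file it reads is made-up stand-in data.

```python
import os
import tempfile


def load_vocab(fname):
    """Return the set of words in a fastText .vec file, ignoring the vectors."""
    with open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore') as fin:
        fin.readline()  # header line: "<number of words> <dimension>"
        # Keep only the first token of each line (the word itself).
        return {line.split(' ', 1)[0] for line in fin}


# Tiny hand-written stand-in for cc.fr.300.vec (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.vec")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("2 3\nbonjour 0.1 0.2 0.3\nmerci 0.4 0.5 0.6\n")
    vocab = load_vocab(path)

print(sorted(vocab))
```

The resulting set supports the same `w in vocab` test used in spelltest. The object returned by fasttext.load_model also exposes its vocabulary as model.words, which may make reading the .vec file unnecessary.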

Environment

Ubuntu 22.04 LTS
Python 3.10.4
