Error in a Python file, but not in a Jupyter notebook
I have a weird one that I can't seem to get to the bottom of. I am testing a new lemmatizer for an NLP project. It works great in the test Jupyter notebook I was using, but as soon as I copy it over to a .py file for production, it raises a StopIteration (surfaced as `RuntimeError: generator raised StopIteration`). Any tips or suggestions on where to look? I have spent far too long trying to produce workarounds, all to no avail. I am using the exact same test dataset for both, so it is not a difference in the DataFrames; both use the same environment, and all of the code is exactly the same.
Thanks in advance!
Here is the function:
def prepareStringTEST(x):
    error = 'Error'
    x = re.sub(r"[^0-9a-z]", " ", x)
    if len(x)==0:
        return ''
    return " ".join([lemma(wd) for wd in x.split()])
and here is how it is being called:
df['text_cleaned_test'] = df['text'].apply(lambda x: prepareStringTEST(x))
Here is the error message:
Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 609, in _read
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "z:\CEC Python\NLP\clean_raw_text_new.py", line 138, in <module>
    df['text_cleaned_test'] = df['text'].apply(lambda x: prepareStringTEST(x))
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\series.py", line 4138, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2467, in pandas._libs.lib.map_infer
  File "z:\CEC Python\NLP\clean_raw_text_new.py", line 138, in <lambda>
    df['text_cleaned_test'] = df['text'].apply(lambda x: prepareStringTEST(x))
  File "z:\CEC Python\NLP\clean_raw_text_new.py", line 75, in prepareStringTEST
    return " ".join([lemma(wd) for wd in x.split()])
  File "z:\CEC Python\NLP\clean_raw_text_new.py", line 75, in <listcomp>
    return " ".join([lemma(wd) for wd in x.split()])
  File "C:\Users\xxx\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 2172, in lemma
    self.load()
  File "C:\Users\xxx\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 2127, in load
    for v in _read(self._path):
RuntimeError: generator raised StopIteration
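A note on reading this traceback: since Python 3.7 (PEP 479), a `StopIteration` raised inside a generator body is converted into `RuntimeError: generator raised StopIteration`, with the original `StopIteration` attached as the cause. The sketch below reproduces that behaviour in isolation. The names `read_lines` and `load` are hypothetical stand-ins chosen for illustration, not pattern's actual code.

```python
def read_lines(lines):
    """Hypothetical generator that ends with the legacy 'raise StopIteration' idiom."""
    for line in lines:
        yield line
    # Before PEP 479 this silently ended the generator.
    # On Python 3.7+ it is converted into a RuntimeError instead.
    raise StopIteration


def load(lines):
    """Consume the generator, returning what was read and any RuntimeError raised."""
    loaded = []
    try:
        for line in read_lines(lines):
            loaded.append(line)
    except RuntimeError as exc:
        return loaded, exc
    return loaded, None


loaded, exc = load(["be", "have"])
print(loaded)                        # items yielded before the error surfaced
print(type(exc).__name__, "-", exc)  # the converted exception and its message
print(type(exc.__cause__).__name__)  # the original exception, kept as the cause
```

In this sketch every item is yielded before the error appears, so the error only shows up once the generator is exhausted.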
Here is some code to test:
import re

# Assumed import: the traceback shows lemma() coming from the pattern package.
from pattern.en import lemma


def prepareStringTEST(x):
    error = 'Error'
    # Replace everything except digits and lowercase letters with a space.
    x = re.sub(r"[^0-9a-z]", " ", x)
    if len(x)==0:
        return ''
    return " ".join([lemma(wd) for wd in x.split()])


string = '''
Peter Navarro, who as a White House adviser to President Donald J. Trump worked to keep Mr. Trump in office after his defeat in the 2020 election, disclosed on Monday that he has been summoned to testify on Thursday to a federal grand jury and to provide prosecutors with any records he has related to the attack on the Capitol last year, including “any communications” with Mr. Trump.
The subpoena to Mr. Navarro — which he said the F.B.I. served at his house last week — seeks his testimony about materials related to the buildup to the Jan. 6 attack on the Capitol, and signals that the Justice Department investigation may be progressing to include activities of people in the White House.
Mr. Navarro revealed the existence of the subpoena in a draft of a lawsuit he said he is preparing to file against the House committee investigating the Jan. 6 attack, Speaker Nancy Pelosi and Matthew M. Graves, the U.S. attorney for the District of Columbia.
'''
# Lowercase first, as in the traceback; the regex only keeps 0-9 and a-z.
print(prepareStringTEST(string.lower()))
Here are my results in Jupyter (in VS Code):
peter navarro who a a white house adviser to president donald j trump work to keep mr trump in office after hi defeat in the 2020 election disclose on monday that he have be summons to testify on thursday to a federal grand jury and to provide prosecutor with any record he have relate to the attack on the capitol last year include any communication with mr trump the subpoena to mr navarro which he say the f b i serve at hi house last week seek hi testimony about material relate to the buildup to the jan 6 attack on the capitol and signal that the justice department investigation may be progress to include activity of people in the white house mr navarro reveal the existence of the subpoena in a draft of a lawsuit he say he be prepare to file against the house committee investigate the jan 6 attack speaker nancy pelosi and matthew m grave the u attorney for the district of columbia
Here are my results running the exact same code in a .py file (in VS Code):
PS Z:\CEC Python> & "C:/Program Files/Python39/python.exe" "z:/CEC Python/NLP/clean_raw_test_new.py"
Traceback (most recent call last):
  File "C:\Users\mkzou183\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 609, in _read
    raise StopIteration
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "z:\CEC Python\NLP\clean_raw_test_new.py", line 31, in <module>
    print(prepareStringTEST(string.lower()))
  File "z:\CEC Python\NLP\clean_raw_test_new.py", line 22, in prepareStringTEST
    return " ".join([lemma(wd) for wd in x.split()])
  File "z:\CEC Python\NLP\clean_raw_test_new.py", line 22, in <listcomp>
    return " ".join([lemma(wd) for wd in x.split()])
  File "C:\Users\mkzou183\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 2172, in lemma
    self.load()
  File "C:\Users\mkzou183\AppData\Roaming\Python\Python39\site-packages\pattern\text\__init__.py", line 2127, in load
    for v in _read(self._path):
RuntimeError: generator raised StopIteration
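One workaround that is commonly reported for this error is a single throwaway call wrapped in `try`/`except RuntimeError` before the real work starts. It rests on an assumption that is not verified here: that the first failing call still leaves the lemmatizer's data loaded, so later calls succeed. If that holds, it might also explain why a notebook kernel that already hit the error once appears to work. In the sketch below, `warm_up` is a hypothetical helper and `FlakyLemmatizer` is a hypothetical stand-in for `lemma`, so the example runs without pattern installed.

```python
def warm_up(func, probe="test"):
    """Call func once, swallowing a first-call RuntimeError.

    Returns True if the warm-up call raised, False otherwise.
    """
    try:
        func(probe)
    except RuntimeError:
        return True
    return False


class FlakyLemmatizer:
    """Hypothetical stand-in: fails on the first call, works afterwards."""

    def __init__(self):
        self._loaded = False

    def __call__(self, word):
        if not self._loaded:
            # Assumption being modelled: state is loaded before the error surfaces.
            self._loaded = True
            raise RuntimeError("generator raised StopIteration")
        return word.lower()


lemma = FlakyLemmatizer()          # stand-in only, not pattern's lemma()
first_call_failed = warm_up(lemma)
result = " ".join(lemma(wd) for wd in "Peter Navarro".split())
print(first_call_failed, result)
```

In the real script the equivalent would be calling `warm_up(lemma)` once, right after importing `lemma`, before `df['text'].apply(...)` runs. Whether this is sufficient depends on the installed pattern version.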