Returning a generator in a with statement
I wanted to create a wrapper function over pandas.read_csv to change the default separator and format the file in a specific way. This is the code I had:
def custom_read(path, sep="|", **kwargs):
    if not kwargs.get("chunksize", False):
        df_ = pd.read_csv(path, sep=sep, **kwargs)
        return format_df(df_, path)
    else:
        with pd.read_csv(path, sep=sep, **kwargs) as reader:
            return (format_df(chunk, path) for chunk in reader)
It turns out that this segfaults when used like so:
L = [chunk.iloc[:10, :] for chunk in custom_read(my_file)]
From what I understood from the backtrace, the generator is created, then the file is closed, and the segfault happens when the generator tries to read from the now-closed file.
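The same sequence of events can be reproduced with a plain file object, where it surfaces as a ValueError rather than a segfault. This is only a minimal sketch using the standard library; broken_read and demo_path are illustrative names, not part of the code above.

```python
import os
import tempfile


def broken_read(path):
    # The generator expression is only created here, not consumed.
    # Returning leaves the with block, which closes the file first.
    with open(path) as f:
        return (line for line in f)


# Write a small throwaway file to read from.
fd, demo_path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as tmp:
    tmp.write("a\nb\n")

gen = broken_read(demo_path)
try:
    next(gen)  # the first read happens after the file was closed
    outcome = "read ok"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

os.remove(demo_path)
print(outcome)
```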
I could avoid the segfault with a minor refactoring:
def custom_read(path, sep="|", **kwargs):
    if not kwargs.get("chunksize", False):
        df_ = pd.read_csv(path, sep=sep, **kwargs)
        return format_df(df_, path)
    else:
        reader = pd.read_csv(path, sep=sep, **kwargs)
        return (format_df(chunk, path) for chunk in reader)
I couldn't find anything on the particular use case of generators in with clauses. Is it something to avoid? Is this supposed not to work, or is this a bug of some kind?
Is there a way to avoid this error but still use the encouraged with statement?
You could use a generator which keeps the file open. See the following example:
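A minimal sketch of such a generator, assuming a plain text file; read_lines is an illustrative name. Because the with block sits inside the generator function, the file is only closed once the generator is exhausted or closed.

```python
def read_lines(path):
    # The with block is part of the generator body, so the file stays
    # open between yields and is closed when iteration finishes.
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")
```

If the caller stops iterating early, the file is closed when the generator is closed or garbage collected, since that runs the with block's cleanup.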
Edit:
In your pandas.read_csv use case, this would look like
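A sketch of how that could look, under a few assumptions: format_df below is only a stand-in for the helper from the question, _iter_chunks is a hypothetical helper name, and the pandas version is recent enough (1.2 or later) that the chunked reader can be used as a context manager. The yield lives in a separate function because putting it directly in custom_read would turn the non-chunked branch into a generator too.

```python
import pandas as pd


def format_df(df, path):
    # Stand-in for the question's format_df; the real one is not shown.
    return df


def _iter_chunks(path, sep, **kwargs):
    # The reader stays open for as long as this generator is being consumed.
    with pd.read_csv(path, sep=sep, **kwargs) as reader:
        for chunk in reader:
            yield format_df(chunk, path)


def custom_read(path, sep="|", **kwargs):
    if not kwargs.get("chunksize", False):
        # No chunking requested: read everything and return one DataFrame.
        return format_df(pd.read_csv(path, sep=sep, **kwargs), path)
    # Chunking requested: hand back a generator that owns the reader.
    return _iter_chunks(path, sep, **kwargs)
```

The list comprehension from the question, with a chunksize passed to custom_read, should then iterate without touching a closed reader.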