Avoid reading the same file multiple times with Telegraf and the file input plugin
I need to read CSV files inside a folder. A new CSV file is generated every time a user submits a form. I'm using the "file" input plugin to read the data and send it to InfluxDB. These steps are working fine.
The problem is that the same file is read again on every data collection interval, so each file ends up being read multiple times. I was thinking of a solution where I could move a file to a different folder once it has been read, but I couldn't do that with Telegraf's "exec" output plugin.
P.S. I can't change the way the CSV files are generated.
Any ideas on how to avoid reading the same CSV file multiple times?
Comments (1)
As you discovered, the file input plugin reads each file in full at every collection interval.
My suggestion is to use the directory monitor input plugin (directory_monitor) instead. It reads the files in a directory, monitors the directory for new files, and parses only the ones that have not already been picked up. The plugin also has some configuration settings that make it easier to control when new files are read.
Another option is the tail input plugin, which tails a file and reads only the new updates to that file as they arrive. However, I think the directory monitor is more likely what you are after for your scenario.
Thanks!
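As a rough illustration of that suggestion, a minimal directory_monitor configuration might look like the sketch below. The two directory paths are hypothetical placeholders, and the CSV settings assume each file starts with a single header row; adjust both to match the real files. Processed files are moved to finished_directory, which is what keeps them from being read again.

```toml
# Sketch only: both paths below are hypothetical placeholders.
[[inputs.directory_monitor]]
  ## Directory where the form submissions drop their CSV files.
  directory = "/path/to/incoming"

  ## Files are moved here once they have been parsed,
  ## so they are not picked up a second time.
  finished_directory = "/path/to/processed"

  ## How long a file must sit in the directory before it is read.
  ## Raise this if files are written slowly, to avoid reading partial files.
  directory_duration_threshold = "1s"

  ## Only pick up files whose names end in .csv (regular expression).
  files_to_monitor = ["^.*[.]csv$"]

  ## Parse the files as CSV. Assumption: one header row with column names.
  data_format = "csv"
  csv_header_row_count = 1
```

The Telegraf process needs write access to both directories, because it moves each file after parsing it.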