File path filter for the SFTP plugin in Data Fusion
I'm using the SFTP plugin in GC Data Fusion to download financial reports from a payment gateway operator; the reports are stored as CSV files on a server.
I'm trying to avoid transferring every file, to save on data processing costs, and it turns out the plugin offers a 'File Path Filter' property.
The documentation has no specific example of how to use this property, and nothing I've tried works.
Files are stored and named like the following:
SettlementInvoice GS Companyname Limited 190024000478 20220315.csv
So my assumption was that I could use:
/20220315/g
where I could then replace 20220315 with a macro that holds the current date from another source (the plugin supports macros).
However, testing shows that it doesn't work: the plugin ignores whatever I enter in the filter. This is probably a syntax problem, but I can't find any example of how to use this option properly. I would appreciate any suggestions.
Comments (1)
I was able to find a pattern that can be used to filter the files. The pattern was constructed using the rules I mentioned in the comments.
I tested the pattern with a Java program and then deployed a sample Data Fusion pipeline using the SFTP actions plugin from the CDAP Hub. The pipeline copies files from an SFTP source to GCS.
The regex used is
[a-zA-Z0-9 ]*${file.name}[.a-z]*
and it filtered the files based on the date provided in the macro. The Data Fusion pipeline JSON can be found below. To test this pipeline, you can import the JSON file and edit the file system properties as needed. The Data Fusion instance version is 6.4.1.
Note: According to this JIRA, it is suggested to use action plugins instead of source plugins, since the SFTP/FTP source plugins are set to be removed in the future.
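For readers who want to check a filter pattern locally before deploying, here is a minimal Java sketch. It is not the program used in this answer. It assumes the filter is applied as a Java regular expression that must match the whole file name, and that the ${file.name} macro has already been replaced by a date string such as 20220315. The class FilterPatternCheck and the methods resolveMacro and filter are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class FilterPatternCheck {

    // Hypothetical stand-in for macro substitution: replaces the literal
    // text ${file.name} with the supplied value before the regex is compiled.
    static String resolveMacro(String template, String value) {
        return template.replace("${file.name}", value);
    }

    // Keeps only the names that the regex matches in full.
    // Assumption: the plugin compares the pattern against the entire name.
    static List<String> filter(String regex, List<String> names) {
        Pattern pattern = Pattern.compile(regex);
        return names.stream()
                .filter(name -> pattern.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "SettlementInvoice GS Companyname Limited 190024000478 20220315.csv",
                "SettlementInvoice GS Companyname Limited 190024000478 20220314.csv");

        // Pattern from the answer, with the macro resolved to a sample date.
        String regex = resolveMacro("[a-zA-Z0-9 ]*${file.name}[.a-z]*", "20220315");
        System.out.println(filter(regex, names)); // only the 20220315 file

        // In Java regex the slashes and the trailing g are ordinary characters,
        // so this pattern only matches the literal text "/20220315/g".
        System.out.println(filter("/20220315/g", names)); // prints []
    }
}
```

Under these assumptions, the pattern from the question selects nothing because Java has no /.../g delimiter syntax, while the pattern above covers the text before the date, the date itself, and the file extension.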