Pentaho Spoon - Validating the format of a fixed-width input file
I'm trying to process a fixed-width input file in Pentaho and validate its format. The file will contain a mixture of strings, numbers and dates. However, when a number field contains an invalid character (which I expected to throw an error), the step just reads the leading part of the number and ignores the bad character.
I can recreate this issue with a very simple input file containing a single field.
I specify the expected number format, along with the start position and length.
On running the transformation, I expected the 'Q' to cause an error. Instead, the step just reads the first two digits, "67", and pads the rest to match the specified format.
If the input file is formatted correctly, it runs perfectly well, but I need it to throw an error otherwise. Any suggestions would be awesome. Thanks!
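For anyone trying to reproduce this outside Spoon: the behaviour looks like what Java's `java.text.DecimalFormat` does by default, where `parse` stops at the first character it cannot use and returns what it has read so far. Whether the input step goes through exactly this code path is an assumption on my part. The sketch below uses a made-up value `67Q5` and a made-up mask `0000` (the real file and mask are not shown here), and the class and method names are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical demo class; not part of Pentaho.
class LenientParseDemo {

    private static DecimalFormat format(String mask) {
        // Fixed symbols so the result does not depend on the machine's locale.
        return new DecimalFormat(mask, DecimalFormatSymbols.getInstance(Locale.ROOT));
    }

    // Lenient: returns whatever was parsed before the first unusable character,
    // or null if nothing could be parsed at all.
    public static Number parseLenient(String text, String mask) {
        return format(mask).parse(text, new ParsePosition(0));
    }

    // Strict: rejects the value unless the parser consumed the whole string.
    public static Number parseStrict(String text, String mask) {
        ParsePosition pos = new ParsePosition(0);
        Number value = format(mask).parse(text, pos);
        if (value == null || pos.getIndex() != text.length()) {
            throw new IllegalArgumentException(
                "Invalid number \"" + text + "\", parsing stopped at index " + pos.getIndex());
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("67Q5", "0000")); // the 'Q' and everything after it is dropped
        System.out.println(parseStrict("6705", "0000"));  // fully numeric, accepted
        try {
            parseStrict("67Q5", "0000");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The only difference between the two methods is the check that the parse position reached the end of the string, which is the kind of check I was hoping the step would apply.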
Comments (1)
Just an FYI in case someone stumbles across this question after hitting the same issue as I did.
I was able to construct a workaround by reading all values in the "Text File Input" step as strings, then using a "Data Validator" step with regex evaluation to ensure the numbers were correctly formatted, and only then parsing them to a number type with a following "Select Values" step.
It takes a bit longer to set this up for every field, but it was the most robust solution I could come up with.
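In case it helps to see the logic outside of Spoon, here is a rough Java sketch of the same three stages: take the field as a plain string, check it against a regex, and only then convert it. This is not what the Pentaho steps run internally. The class name, the regex and the sample line are all made up for illustration, so adjust the pattern to whatever number format your file actually uses.

```java
import java.math.BigDecimal;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the read-as-string, validate, then convert flow.
class FixedWidthFieldValidator {

    // Assumed rule: optional sign, at least one digit, optional decimal part.
    private static final Pattern NUMERIC = Pattern.compile("[+-]?\\d+(\\.\\d+)?");

    // Stage 1: cut the field out of the fixed-width line as a plain string.
    // 'start' is zero-based; a line that is too short throws StringIndexOutOfBoundsException.
    public static String slice(String line, int start, int length) {
        return line.substring(start, start + length).trim();
    }

    // Stage 2: validate against the regex. Stage 3: convert only if it matched.
    public static BigDecimal validateAndParse(String raw) {
        if (!NUMERIC.matcher(raw).matches()) {
            throw new IllegalArgumentException("Field is not a valid number: \"" + raw + "\"");
        }
        return new BigDecimal(raw);
    }

    public static void main(String[] args) {
        String goodLine = "AB0067CD"; // made-up record: number in columns 2..5
        String badLine = "AB67Q5CD";  // same layout with a stray 'Q'

        System.out.println(validateAndParse(slice(goodLine, 2, 4)));
        try {
            validateAndParse(slice(badLine, 2, 4));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the regex has to match the whole field, a value like `67Q5` is rejected up front instead of being silently cut down to `67`.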
Thanks