Reading EDI format files
I'm new to EDI, and I have a question.
I have read that you can get most of what you need to know about an EDI format by looking at the last 3 characters of the ISA line. That would be fine if every EDI file used line breaks to separate entities, but I have found that many are single-line files with any number of characters used as breaks. I have noticed that the VERY last character in every EDI file I've parsed is the break character. I've looked at a few hundred and have found no exceptions to this. If I first grab that character and use it to obtain the last 3 characters of the ISA line, can I reasonably expect to be able to parse data from an EDI file?
I don't know if this helps, but the EDI 'types' in question tend to be 850, 875. I'm not sure if that is a standard or not, but it may be worth mentioning.
Comments (3)
The transaction type of the EDI doesn't really matter (850 = purchase order, 875 = grocery PO). Having written a few EDI parsers, here are a few things I've found:
You should be able to count on the ISA (and the ISA only) being fixed width (105 characters, if memory serves, not counting the terminator that follows it).
Strip off the first 105 characters. Everything after that and before the first occurrence of "GS" is your line (segment) terminator. This can be anything, including 0x07 (the beep), so watch out if you're writing to stdout for debugging or you may get a bunch of beeps coming out of the speaker. Normally it is 1 or 2 characters; sometimes it can be more (if the person sending you the data adds an extra terminator for some reason). Once you have the line terminator, you can get the element (field) separator. I normally pull the 3rd character of the GS line and use that, though the 4th character of the ISA line should work as well.
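The steps in this answer can be sketched in Python roughly as follows. This is only an illustration, not the poster's code: the function name `detect_delimiters` and the returned dictionary keys are made up, and it assumes the data starts with a standard 105-character ISA.

```python
ISA_LENGTH = 105  # fixed-width ISA, not counting the terminator after it


def detect_delimiters(data: str) -> dict:
    """Guess the delimiters of one interchange (hypothetical helper)."""
    if not data.startswith("ISA") or len(data) <= ISA_LENGTH:
        raise ValueError("data does not start with a full ISA segment")

    # The 4th character of the ISA is the element (field) separator.
    element_sep = data[3]

    # Everything between the end of the ISA and the first "GS" segment
    # is treated as the line (segment) terminator.
    gs_pos = data.find("GS" + element_sep, ISA_LENGTH)
    if gs_pos == -1:
        raise ValueError("no GS segment found after the ISA")
    terminator = data[ISA_LENGTH:gs_pos]
    if not terminator:
        raise ValueError("no terminator between the ISA and the GS segment")

    return {
        "element": element_sep,
        "component": data[ISA_LENGTH - 1],  # last ISA field, one character
        "terminator": terminator,
    }
```

Printing the terminator with `repr()` rather than directly avoids the beep problem mentioned above when it turns out to be a control character.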
Also be aware that you can get a file with multiple ISAs in it. In that case you cannot count on the line or field separators being the same from one ISA to the next.
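One way to cope with several ISAs in a file is to re-detect the delimiters at the start of each interchange. The sketch below is an assumption-laden illustration: `split_interchanges` is a hypothetical name, each interchange is assumed to start with a 105-character ISA and end with an IEA segment, and whitespace between interchanges is skipped.

```python
ISA_LENGTH = 105  # fixed-width ISA, not counting the terminator after it


def split_interchanges(data: str) -> list:
    """Split a file into interchanges, detecting delimiters for each one."""
    interchanges = []
    pos = 0
    while pos < len(data):
        if data[pos].isspace():  # stray whitespace between interchanges
            pos += 1
            continue
        if not data.startswith("ISA", pos) or len(data) < pos + ISA_LENGTH + 1:
            raise ValueError(f"expected a full ISA segment at offset {pos}")

        # Delimiters are read again for every ISA, never reused.
        element_sep = data[pos + 3]
        gs_pos = data.find("GS" + element_sep, pos + ISA_LENGTH)
        if gs_pos == -1:
            raise ValueError(f"no GS segment after the ISA at offset {pos}")
        terminator = data[pos + ISA_LENGTH:gs_pos]
        if not terminator:
            raise ValueError(f"no terminator after the ISA at offset {pos}")

        # The interchange ends with the terminator that closes its IEA segment.
        iea_pos = data.find(terminator + "IEA" + element_sep, gs_pos)
        if iea_pos == -1:
            raise ValueError(f"no IEA segment for the ISA at offset {pos}")
        end = data.find(terminator, iea_pos + len(terminator))
        end = len(data) if end == -1 else end + len(terminator)

        interchanges.append({
            "element": element_sep,
            "terminator": terminator,
            "body": data[pos:end],
        })
        pos = end
    return interchanges
```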
Another thing: it is also possible (again, not sure if it's in the spec) for an EDI file to have a variable-length ISA. This is very rare, but I had to accommodate it. If that happens, you have to parse the line into its fields. The last field in the ISA is only one character long, so you can determine the real length of the ISA from it. If it were me, I wouldn't worry about this unless you see a file like it. It is a rare occurrence.
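For the variable-length case, a rough Python sketch of the idea is to count separators instead of characters: the ISA has 16 fields, and the last one is a single character. The name `isa_length` is hypothetical, and the sketch assumes the 4th character of the data is the element separator.

```python
def isa_length(data: str) -> int:
    """Return the real length of the ISA segment, excluding its terminator."""
    if not data.startswith("ISA") or len(data) < 4:
        raise ValueError("data does not start with an ISA segment")
    element_sep = data[3]

    # Walk to the 16th separator, which sits just before the last field.
    pos = 2
    for _ in range(16):
        pos = data.find(element_sep, pos + 1)
        if pos == -1:
            raise ValueError("ISA segment has fewer than 16 fields")

    # The last field is exactly one character long.
    if pos + 1 >= len(data):
        raise ValueError("ISA segment is truncated")
    return pos + 2
```

For a standard fixed-width ISA this returns 105, so the same function can be used for both cases.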
What I've said above may not be to the letter of the "spec". That is, I'm not sure it's legal to have different line separators in different ISAs within the same file, but it is technically possible, and I accommodate it because I have to process files that come through that way. The EDI processor I use handles upwards of 5000 files a day from over 3000 possible sources of data (so I see a lot of weird stuff).
best regards,
don
EDI content is composed of segments and elements.
To parse it, you will need to break it up into segments first, and then elements like so (in PHP):
Hope this helps get you started.
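The segments-then-elements split described in this answer might look roughly like this in Python. This is a sketch rather than the answer's own snippet: `parse_segments` is a hypothetical name, and the default delimiters `~` and `*` are only common choices, so in practice they should be taken from the ISA.

```python
def parse_segments(data: str, terminator: str = "~", element_sep: str = "*") -> list:
    """Split EDI content into segments, then each segment into elements."""
    segments = []
    for raw in data.split(terminator):
        # Drop line breaks that some senders add after the terminator.
        raw = raw.strip("\r\n")
        if raw:
            segments.append(raw.split(element_sep))
    return segments


# Each segment becomes a list whose first item is the segment ID.
for segment in parse_segments("ST*850*0001~\nBEG*00*SA*PO123**20240101~\nSE*3*0001~\n"):
    print(segment[0], segment[1:])
```

Empty elements are kept as empty strings so that positions within a segment stay aligned.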
For header information, the following Java will let you get the basic info pretty easily.
C# has a split method as well, and the code looks very similar.
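As a rough Python illustration of the same split-based idea (not the answer's Java), the header fields can be read by splitting the ISA on its element separator. The function name `read_header` and the dictionary keys are made up for this sketch, and it assumes a standard 105-character ISA.

```python
ISA_LENGTH = 105  # fixed-width ISA, not counting the terminator after it


def read_header(data: str) -> dict:
    """Pull basic interchange info out of the ISA segment."""
    if not data.startswith("ISA") or len(data) < ISA_LENGTH:
        raise ValueError("data does not start with a full ISA segment")
    element_sep = data[3]
    isa = data[:ISA_LENGTH].split(element_sep)
    if len(isa) != 17:  # "ISA" plus 16 fields
        raise ValueError("unexpected number of fields in the ISA segment")
    return {
        "sender_id": isa[6].strip(),    # ISA06, space padded
        "receiver_id": isa[8].strip(),  # ISA08, space padded
        "date": isa[9],                 # ISA09, YYMMDD
        "time": isa[10],                # ISA10, HHMM
        "version": isa[12],             # ISA12
        "control_number": isa[13],      # ISA13
        "usage": isa[15],               # ISA15, e.g. P or T
        "component_sep": isa[16],       # ISA16
    }
```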