XML to CSV conversion, but complicated
I swear I have looked at the existing threads! But I still need help.
I need to take some very messy XML and convert it to a very neat CSV file for upload to a website database.
I don't really need a finished solution, but I need help with understanding the process I should follow to solve my problem in XSLT. I won't ask you all to code for me, just tell me the elements and template structure I need. I would also love it if the community could explain the logic behind the process, so that I can modify it as needed.
I have XML whose records vary in element order and in the number of elements:
<record-list>
<record>
<title>Title One</title>
<author>Author One</author>
<subject>
Subject One A
Subject One B
Subject One C
</subject>
<subject>Subject Two</subject>
<subject>Subject Three</subject>
<subject>Subject Four</subject>
</record>
<record>
<subject>Subject Five</subject>
<title>Title Two</title>
<useless-element>Extra Stuff One</useless-element>
</record>
<record>
<title>Title Three</title>
<subject>Subject Six</subject>
<author/>
</record>
</record-list>
So I have varying numbers of repeated elements, some missing elements, some empty elements, elements out of order, and some elements with extra line breaks.
I need a CSV file which reads as below, or with a different number of subject repeats (see requirements below):
"Title","Subject","Subject","Subject","Author"
"Title One","Subject One A ; Subject One B ; Subject One C","Subject Two","Subject Three","Author One"
"Title Two","Subject Five","","",""
"Title Three","Subject Six","","",""
Requirements for the final output:
-The number of columns for any repeated element either needs to match the record with the most repeats of that element, or the program needs to chop off any repeats past a certain number.
-Each new record needs a line break, and no other line breaks can exist in the file (line breaks serve only as record delimiters).
-The elements need to be in the same order for each record.
-Each element's text needs quotes around it (to handle intrinsic commas).
-Missing or empty elements need an empty pair of quotes between the commas.
-Extra elements must not be passed through to the output.
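One possible way to meet these requirements in a single XSLT 1.0 pass is to have the record template pull each field by name instead of walking the children in document order. The stylesheet below is only a minimal sketch under several assumptions: the input is well-formed, three subject columns are enough (any further subjects are dropped), and no field contains a double quote. The template name `field` is made up for illustration. Line breaks inside an element are collapsed to single spaces by `normalize-space()` here, not turned into " ; ".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Text output: no XML declaration and no escaping of the quotes. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Header row first, then one output line per record. -->
  <xsl:template match="/record-list">
    <xsl:text>"Title","Subject","Subject","Subject","Author"&#10;</xsl:text>
    <xsl:apply-templates select="record"/>
  </xsl:template>

  <!-- Fields are selected by name, so the source order does not matter,
       and elements that are never selected (useless-element, a fourth
       subject) never reach the output. -->
  <xsl:template match="record">
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="title"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="subject[1]"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="subject[2]"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="subject[3]"/>
    </xsl:call-template>
    <xsl:text>,</xsl:text>
    <xsl:call-template name="field">
      <xsl:with-param name="value" select="author"/>
    </xsl:call-template>
    <!-- The only line break written per record. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Hypothetical helper: writes one quoted cell. A missing or empty
       element converts to the empty string, which gives "". -->
  <xsl:template name="field">
    <xsl:param name="value"/>
    <xsl:text>"</xsl:text>
    <xsl:value-of select="normalize-space($value)"/>
    <xsl:text>"</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With a command-line processor such as xsltproc this could be run as `xsltproc records-to-csv.xsl records.xml > records.csv`, where both file names are placeholders.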
What I have done:
I have figured out how to get rid of the extra line breaks within the elements using the translate() function, although I would love a solution that lets me replace the line breaks with more than one character (right now I will have to run find-and-replace to change a placeholder character to a space-semicolon-space in my output). I can get the quotes, commas, and line breaks in the output with xsl:text elements and xsl:strip-space.
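Since `translate()` only maps single characters to single characters, a multi-character separator in XSLT 1.0 usually takes a small recursive named template. The following is a sketch of that idea, not a tested library routine: the template name `join-lines` and its parameters are invented for this example. It cuts the text at each line feed, trims every piece with `normalize-space()`, skips pieces that are empty, and joins the rest with " ; ". The driver template at the top exists only so the stylesheet can be run on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Demo driver: write every subject, quoted, one per output line. -->
  <xsl:template match="/">
    <xsl:for-each select="//subject">
      <xsl:text>"</xsl:text>
      <xsl:call-template name="join-lines">
        <xsl:with-param name="text" select="string(.)"/>
      </xsl:call-template>
      <xsl:text>"&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: joins the non-empty lines of $text with " ; ".
       $first stays true until one line has actually been written, so the
       separator only ever appears between two written lines. -->
  <xsl:template name="join-lines">
    <xsl:param name="text"/>
    <xsl:param name="first" select="true()"/>
    <!-- Text up to the next line feed; the concat() guarantees that a
         line feed exists even on the last line. -->
    <xsl:variable name="line"
        select="normalize-space(substring-before(concat($text, '&#10;'), '&#10;'))"/>
    <xsl:variable name="rest" select="substring-after($text, '&#10;')"/>
    <xsl:if test="$line != ''">
      <xsl:if test="not($first)">
        <xsl:text> ; </xsl:text>
      </xsl:if>
      <xsl:value-of select="$line"/>
    </xsl:if>
    <!-- Recurse on whatever follows the line feed, if anything. -->
    <xsl:if test="$rest != ''">
      <xsl:call-template name="join-lines">
        <xsl:with-param name="text" select="$rest"/>
        <xsl:with-param name="first" select="$first and $line = ''"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Blank lines and indentation around the text are dropped because empty pieces are skipped, so the leading and trailing line breaks in the first subject of the sample should not produce stray separators. An XSLT 2.0 processor offers the `replace()` function, which may make the recursion unnecessary.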
However, I don't know how to straighten out the order of the elements, handle the element repeats, or put through only some elements while still using the element as the cue for the line-break.
Right now, I just need a solution that works, even if all sorts of manual manipulation or multiple style-sheets are required. I can even do a find and replace in a text editor, as long as the output is good. Please help with an XSLT solution; I don't even begin to know any other suitable programming languages (college MATLAB many years ago is not helping).
I think I need to run two transforms. I looked at the XSLT bible, Mangano's XSLT Cookbook, where he used two transforms for a similar problem. However, his solution is so generalized, I can't understand it. If I can't figure out how it works, I can't modify it for my needs. Sorry, but without a programming background, the explanations on this site and in the text are challenging at best. However, I think I am presenting a problem with some novel features, compared to others asked on this forum.
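For the column count specifically, a second transform may not be strictly necessary. A common XSLT 1.0 idiom finds a maximum by sorting and keeping the first item, and a recursive named template can then write exactly that many subject cells for the header and for every record. The sketch below assumes the same `record-list` structure as the sample; the names `max-subjects` and `subject-cells` are made up for illustration, and fields are assumed to contain no double quotes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Largest number of subject elements in any record: sort the records
       by that count, descending, and keep the count of the first one. -->
  <xsl:variable name="max-subjects">
    <xsl:for-each select="/record-list/record">
      <xsl:sort select="count(subject)" data-type="number" order="descending"/>
      <xsl:if test="position() = 1">
        <xsl:value-of select="count(subject)"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <xsl:template match="/record-list">
    <!-- Header row: called without a record, the helper writes labels. -->
    <xsl:text>"Title"</xsl:text>
    <xsl:call-template name="subject-cells"/>
    <xsl:text>,"Author"&#10;</xsl:text>
    <!-- One line per record, always in the same column order. -->
    <xsl:for-each select="record">
      <xsl:text>"</xsl:text>
      <xsl:value-of select="normalize-space(title)"/>
      <xsl:text>"</xsl:text>
      <xsl:call-template name="subject-cells">
        <xsl:with-param name="record" select="."/>
      </xsl:call-template>
      <xsl:text>,"</xsl:text>
      <xsl:value-of select="normalize-space(author)"/>
      <xsl:text>"&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: writes cells $i up to $max-subjects, each with
       a leading comma. With no record it writes the header label. -->
  <xsl:template name="subject-cells">
    <xsl:param name="record" select="/.."/>
    <xsl:param name="i" select="1"/>
    <xsl:if test="$i &lt;= number($max-subjects)">
      <xsl:text>,"</xsl:text>
      <xsl:choose>
        <xsl:when test="$record">
          <!-- A subject that does not exist yields an empty cell. -->
          <xsl:value-of select="normalize-space($record/subject[$i])"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:text>Subject</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:text>"</xsl:text>
      <xsl:call-template name="subject-cells">
        <xsl:with-param name="record" select="$record"/>
        <xsl:with-param name="i" select="$i + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

On the sample data the largest record has four subjects, so this variant would write four Subject columns rather than the three shown in the desired output. Replacing `number($max-subjects)` with a fixed number gives the truncating behaviour instead.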
Any help, be it non-generalized code, or even just a suggested procedure for multiple runs through my processor would be wonderful. I have been struggling with this for over a week and have made very little progress.
Thanks
CAMc
Comments (1)
I'd suggest having a look at A CSV to XML converter in XSLT 2.0. There's a lot of useful info on that page, including how to run it.