Parsing SGML and storing it in a PHP array

Posted on 2024-08-19 05:50:26

If you can help with this you're a genius.

Basically, I will have some text like this:

<parent wealthy>
   <parent>
      <children female>
        <child>
          jessica
          <hobbies>
            basketball, soccer, video games
          </hobbies>
        </child>
        <child>
          jane
          <hobbies>
            cooking, shopping, boys
          </hobbies>
        </child>         
      </children female>
      <children male>
       <child>
         josh
         <hobbies>
           tennis, swimming
         </hobbies>
       </child>
      </children male>
    </parent>
   </parent wealthy>
   <parent poor>
     <parent>
       <children male>
         <child>
          ---
          <hobbies>...</hobbies>
         </child>
       </children male>
     </parent>
   </parent poor>

So in all, I will have a parent-child hierarchy like this:

- parent wealthy/ parent poor /parent something else
  -- parent
     -- children male/ children female / children something else
        -- child
         -- (name of the child is given without any tags around it)
         -- hobbies

I'm wondering how I can parse all this info and store it in a PHP array/object/variable while maintaining the order in which the elements appear. For example, if <parent wealthy> appears above <parent poor>, I would like to keep them in the same order, and the same goes if <children male> appears before <children female>.

This would be almost perfectly valid XML, and I could use SimpleXML to parse it. The problem, however, is that the name of the child doesn't appear between any tags, and the client wants to keep it this way for user friendliness. For example:

    <child>
      jane
      <hobbies>
        cooking, shopping, boys
      </hobbies>
    </child>      

Here 'jane' appears outside any tags, while the hobbies appear between <hobbies> tags.

How can this be parsed? Please give some advice. If you suggest using regexps, please give the regexps that can be used for your answer to be accepted, as I don't know regexps.

Thanks.

Edit: The main problem is that the client wants to mix normal text with text in tags. For example:

text text test <hobbies>...</hobbies>. text text text <age>30</age>

How can that be parsed?

Comments (3)

卷耳 2024-08-26 05:50:26

When using markup like this:

<child>
  jane
   <hobbies>
    cooking, shopping, boys
   </hobbies>
 </child>     

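jane will be the text content of the child element when parsed with SimpleXML; read it by casting the element to a string, e.g. (string) $child. (nodeValue is a property of the DOM extension, not of SimpleXML.)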

Just remember to trim() the value, as it's likely to contain white space because of the following tag(s).
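
A minimal sketch of how this could look for the full sample from the question, not a definitive implementation. It assumes the bare words after a tag name (as in <children female>) are labels, which are not valid XML, so two regular expressions first turn them into a hypothetical label attribute and strip them from the closing tags. The function name parseFamilies, the attribute name label and the array layout are all made up for illustration. It also assumes the text contains no stray <, > or & characters and that labels contain no quotes.

```php
<?php
// Hypothetical helper: the name, the "label" attribute and the array layout
// are illustrative assumptions, not something given in the thread.
function parseFamilies(string $raw): array
{
    // </children female>  ->  </children>
    $xml = preg_replace('~</(\w+)[^>]*>~', '</$1>', $raw);
    // <children female>   ->  <children label="female">
    $xml = preg_replace('~<(\w+)\s+([^<>/]+?)\s*>~', '<$1 label="$2">', $xml);

    // Several top-level <parent ...> blocks need a single root element.
    $root = simplexml_load_string("<root>$xml</root>");
    if ($root === false) {
        throw new InvalidArgumentException('Input is still not well-formed XML');
    }

    $families = [];
    foreach ($root->parent as $outer) {              // iterates in document order
        $groups = [];
        foreach ($outer->parent as $inner) {
            foreach ($inner->children as $group) {   // elements named "children"
                $kids = [];
                foreach ($group->child as $child) {
                    $kids[] = [
                        // The cast returns the element's own text only,
                        // not the text of nested elements such as <hobbies>.
                        'name'    => trim((string) $child),
                        'hobbies' => array_map('trim', explode(',', (string) $child->hobbies)),
                    ];
                }
                $groups[] = ['label' => (string) $group['label'], 'children' => $kids];
            }
        }
        $families[] = ['label' => (string) $outer['label'], 'groups' => $groups];
    }
    return $families;
}

// Usage with a shortened version of the sample from the question.
$sample = <<<'TXT'
<parent wealthy>
  <parent>
    <children female>
      <child>
        jane
        <hobbies>
          cooking, shopping, boys
        </hobbies>
      </child>
    </children female>
  </parent>
</parent wealthy>
TXT;

$families = parseFamilies($sample);
echo $families[0]['groups'][0]['children'][0]['name'], "\n"; // jane
```

Because the result is built from plain numerically indexed arrays filled in document order, <parent wealthy> stays ahead of <parent poor> and the children groups keep the order in which they were typed.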

和影子一齐双人舞 2024-08-26 05:50:26

I feel people are trying to answer the question from a technical point of view, but the issue here is process.

Why, oh why, is your client insisting on entering data like that? That is completely ridiculous. You will have a nightmare even validating it, let alone parsing it properly.

Tell him/her you'll roll a decent user interface for them and choose your own storage mechanism. That will alleviate all the problems and incorrect formatting that users will run into by entering data like that. It is madness.

Another completely different thing to note is that the children seem to come from one parent. I wasn't aware Homo sapiens was autogamous.

叹倦 2024-08-26 05:50:26

I saw your reply to one of the answers: the client wants it to be user friendly for people to type this.
An XML structure is one of the unfriendliest means of entering information. It is actually pretty much masochistic. Rather, use YAML and parse it with Spyc.
