What is the best way to parse this file?

Posted 2024-07-07 03:36:43

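I was wondering whether anyone knows a good way to parse the file at the bottom of this post.

I have a database set up with a table for each section, e.g. a Referral table, a Caller table, and a Location table. Each table has the same columns as the fields shown in the file below.

I would really like something fairly generic, so that a change in the file layout won't cause me too much trouble. At the moment I am reading the file one line at a time and using a Case statement to check which section I'm in.

Is anyone able to help me with this?

PS. I am using VB, but C# or anything else will be fine. The x's in the document are just personal information I have blanked out.

Thanks,
Nathan

File:--->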

DIAL BEFORE YOU DIG
Call 1100, Fax 1300 652 077
PO Box 7710 MELBOURNE, VIC 8004

Utilities are requested to respond within 2 working days and reference the Sequence number.

[REFFERAL DETAILS]
FROM=                 Dial Before You Dig - Web
TO=                   Technical Services
UTILITY ID=           xxxxxx
COMPANY=              {Company Name}
ENQUIRY DATE=         02/10/2008 13:53
COMMENCEMENT DATE=    06/10/2008
SEQUENCE NO=          xxxxxxxxx
PLANNING=             No

[CALLER DETAILS]
CUSTOMER ID=          403552
CONTACT NAME=         {Name of Contact}
CONTACT HOURS=        0
COMPANY=              Underground Utility Locating
ADDRESS=              {Address}
SUBURB=               {Suburb}
STATE=                {State}
POSTCODE=             4350
TELEPHONE=            xxxxxxxxxx
MOBILE=               xxxxxxxxxx
FAX TYPE=             Private
FAX NUMBER=           xxxxxxxxxx
PUBLIC ADDRESS=       xxxxxxxxxx
PUBLIC TELEPHONE=
EMAIL ADDRESS=        {Email Address}

[LOCATION DETAILS]
ADDRESS=              {Location Address}
SUBURB=               {Location Suburb}
STATE=                xxx
POSTCODE=             xxx
DEPOSITED PLAN NO=    0
SECTION & HUNDRED NO= 0
PROPERTY PHONE NO=
SIDE OF STREET=       B
INTERSECTION=         xxxxxx
DISTANCE=             0-200m B
ACTIVITY CODE=        15
ACTIVITY DESCRIPTION= xxxxxxxxxxxxxxxxxx
MAP TYPE=             StateGrid
MAP REF=              Q851_63
MAP PAGE=
MAP GRID 1=
MAP GRID 2=
MAP GRID 3=
MAP GRID 4=
MAP GRID 5=
GPS X COORD=
GPS Y COORD=
PRIVATE/ROAD/BOTH=    B
TRAFFIC AFFECTED=     No
NOTIFICATION NO=      3082321
MESSAGE=              entire intersection of Allora-Clifton rd , Hillside
rd and merivale st

MOCSMESSAGE=          Digsafe generated referral

Notice: Please DO NOT REPLY TO THIS EMAIL as it has been automatically generated and replies are not monitored. Should you wish to advise Dial Before You Dig of any issues with this enquiry, please Call 1100

(See attached file: 3082321_LLGDA94.GML)

说谎友 2024-07-14 03:36:43

Google has the answer, once you know that this file format is called ".ini".

Edit: that is, it's an .ini file plus some extra leading/trailing junk.
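
As a rough illustration of the ".ini plus junk" idea, the sketch below strips everything that is not a section header or a KEY=VALUE line and hands the rest to Python's standard configparser module. The function name parse_ini_like and the shortened SAMPLE text are made up for this example, and the filtering rule is an assumption: a footer line that happened to contain "=" would be kept, and wrapped values (such as the two-line MESSAGE field) lose their continuation line.

```python
import configparser

# Shortened, hypothetical excerpt of the file from the question.
SAMPLE = """\
DIAL BEFORE YOU DIG
Call 1100, Fax 1300 652 077

[REFFERAL DETAILS]
FROM=                 Dial Before You Dig - Web
PLANNING=             No

[CALLER DETAILS]
CUSTOMER ID=          403552
PUBLIC TELEPHONE=

Notice: Please DO NOT REPLY TO THIS EMAIL
"""


def parse_ini_like(text):
    """Return {section: {key: value}} for the .ini-like part of text."""
    kept = []
    in_sections = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_sections = True
            kept.append(stripped)
        elif in_sections and "=" in stripped:
            kept.append(stripped)
        # Anything else (header text, blank lines, footer) is dropped.

    # Only "=" is a delimiter, so "Notice: ..." style text is never a key.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string("\n".join(kept))
    return {name: dict(parser[name]) for name in parser.sections()}


result = parse_ini_like(SAMPLE)
for section, values in result.items():
    print(section, values)
```

Because the section and key names come straight from the file, a new field or section shows up in the result without any code change.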

宛菡 2024-07-14 03:36:43

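You could read each line of the file sequentially. Each line is essentially a name/value pair. Place each value in a map (hash table) keyed by name, using one map per section. After you've parsed the file, you will have maps containing all the name/value pairs. Iterate over each map and populate your database tables.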
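
The map-per-section approach could look roughly like the following in Python. This is a sketch, not a finished parser: the name parse_sections is invented here, and it assumes that a non-blank line without "=" directly after a KEY=VALUE line is a wrapped continuation of that value (as with the MESSAGE field in the question), and that a blank line ends the value.

```python
def parse_sections(lines):
    """Build one dict per [SECTION], keyed by field name."""
    sections = {}
    current = None    # dict of the section currently being read
    last_key = None   # key that a continuation line would extend
    for raw in lines:
        line = raw.strip()
        if not line:
            last_key = None          # a blank line ends any wrapped value
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            last_key = None
        elif current is None:
            continue                 # header text before the first section
        elif "=" in line:
            key, value = line.split("=", 1)
            last_key = key.strip()
            current[last_key] = value.strip()
        elif last_key is not None:
            # Assumed continuation of the previous value.
            current[last_key] += " " + line
    return sections


sample_lines = [
    "DIAL BEFORE YOU DIG",
    "",
    "[LOCATION DETAILS]",
    "MAP PAGE=",
    "MESSAGE=              entire intersection of Allora-Clifton rd , Hillside",
    "rd and merivale st",
    "",
    "MOCSMESSAGE=          Digsafe generated referral",
    "",
    "Notice: Please DO NOT REPLY TO THIS EMAIL",
]
parsed = parse_sections(sample_lines)
for name, fields in parsed.items():
    print(name, fields)
```

Each resulting dict can then be iterated to fill the table that corresponds to its section.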

流云如水 2024-07-14 03:36:43

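I would use Python for any string parsing of this kind. I'm not sure how much of this information you want to retain, but I would probably use Python's split() function to split each line on = (which drops the equals sign), then strip the whitespace from the second piece.

First, I would cut out the header/footer information I know I don't need, then do something like the following.

Let's take a chunk and save it in test1.txt: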

ADDRESS=              {Location Address}
SUBURB=               {Location Suburb}
STATE=                xxx
POSTCODE=             xxx
DEPOSITED PLAN NO=    0
SECTION & HUNDRED NO= 0
PROPERTY PHONE NO=

Here's a small Python snippet:

>>> with open("test1.txt") as f:
...     pairs = [[part.strip() for part in line.split("=", 1)]
...              for line in f if "=" in line]
...
>>> for pair in pairs:
...     print(pair)
...

['ADDRESS', '{Location Address}']
['SUBURB', '{Location Suburb}']
['STATE', 'xxx']
['POSTCODE', 'xxx']
['DEPOSITED PLAN NO', '0']
['SECTION & HUNDRED NO', '0']
['PROPERTY PHONE NO', '']

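This essentially gives you a [Column, Value] pair that you can use to insert the data into your database (after escaping all strings or, better, by using parameterized queries; beware of SQL injection).

This assumes the email input and your database use the same column names. If they don't, it would be fairly trivial to set up a column mapping using a dictionary. On the flip side, if the email and the columns are in sync, you don't need to know the column names in order to do the parsing.

You could iterate through the pseudo-dictionary and put each key/value pair in the right spot in your parameterized SQL string.

Hope this helps!

Edit: although this is in Python, C#/VB.NET should have the same or similar abilities.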
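
To make the column-mapping and parameterized-insert idea concrete, here is a small sketch using Python's built-in sqlite3 module. Everything specific in it is an assumption for illustration: the COLUMN_MAP entries, the location table and its columns, and the helper name insert_pairs are not taken from the original post. Values are passed as "?" parameters, while column names come only from the mapping and never from the file itself.

```python
import sqlite3

# Hypothetical mapping from file keys to database column names.
COLUMN_MAP = {
    "ADDRESS": "address",
    "SUBURB": "suburb",
    "DEPOSITED PLAN NO": "deposited_plan_no",
}


def insert_pairs(conn, table, pairs):
    """Insert [key, value] pairs as one row; unknown keys are skipped."""
    known = [(COLUMN_MAP[k], v) for k, v in pairs if k in COLUMN_MAP]
    if not known:
        return
    columns = ", ".join(col for col, _ in known)
    placeholders = ", ".join("?" for _ in known)
    # The table name is supplied by our own code, not by the file.
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    conn.execute(sql, [value for _, value in known])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE location (address TEXT, suburb TEXT, deposited_plan_no TEXT)"
)
pairs = [
    ["ADDRESS", "{Location Address}"],
    ["SUBURB", "{Location Suburb}"],
    ["DEPOSITED PLAN NO", "0"],
    ["UNKNOWN KEY", "ignored"],
]
insert_pairs(conn, "location", pairs)
rows = conn.execute(
    "SELECT address, suburb, deposited_plan_no FROM location"
).fetchall()
print(rows)
```

Skipping keys that are missing from the mapping means a new field in the file does not break the insert; it is simply ignored until a column is added for it.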

变身佩奇 2024-07-14 03:36:43

' Requires: Imports System.IO
Using f As StreamReader = File.OpenText("sample.txt")
    Dim g As String = "undefined"   ' name of the current section
    Do
        Dim s As String = f.ReadLine()
        If s Is Nothing Then Exit Do   ' end of file
        s = s.Replace(vbTab, " ").Trim()
        If s.StartsWith("[") AndAlso s.EndsWith("]") Then
            ' Section header: keep the text between the brackets
            g = s.Substring(1, s.Length - 2)
        Else
            ' Split on the first "=" only, so values may contain "="
            Dim ss() As String = s.Split(New Char() {"="c}, 2)
            If ss.Length = 2 Then
                Console.WriteLine("{0}.{1}={2}", g, ss(0).Trim(), ss(1).Trim())
            End If
        End If
    Loop
End Using
