如何将 CSV 文件重新格式化为 Google 日历格式?

发布于 2024-11-09 17:31:50 字数 830 浏览 0 评论 0原文

因此,经过一些研究后,我找到了将 CSV 文件导入所需的格式。

Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private

问题是,我正在使用的 CSV 导出的格式或顺序不正确,收集数据的最佳方法是什么?那个信息?这是我的一些来源。

名称、用户名、行类型、开始日期、开始时间、结束时间、结束日期、段开始日期、类型

“Smith, John J”,jjs,Shift,5/29/2011,9:30,17:30, 5/29/2011,5/29/2011,常规

“史密斯,约翰·J”,jjs,Shift,5/30/2011,13:30,17:30,5/30/2011,5/30/2011,常规的

    Dim Name As String = ""
    Dim UserName As String = ""

    Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

    For r As Integer = 1 To 10
        Name = Data.Substring(0, Data.LastIndexOf(""""))
        Data = Data.Remove(0, Data.LastIndexOf(""""))
        UserName = Data.Substring(Data.LastIndexOf(""""), ",")
    Next

So after doing some research I was able to find the format I need to get the CSV File into

Subject,Start Date,Start Time,End Date,End Time,All Day Event,Description,Location,Private

The issue is, the CSV export I'm working with is not in the correct format or order, what is the best way to go about gathering that information? Here is a bit of my source.

Name,User Name,Row Type,Start Date,Start Time,End Time,End Date,Segment Start Date,Type

"Smith, John J",jjs,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Regular

"Smith, John J",jjs,Shift,5/30/2011,13:30,17:30,5/30/2011,5/30/2011,Regular

    Dim Name As String = ""
    Dim UserName As String = ""

    Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

    For r As Integer = 1 To 10
        Name = Data.Substring(0, Data.LastIndexOf(""""))
        Data = Data.Remove(0, Data.LastIndexOf(""""))
        UserName = Data.Substring(Data.LastIndexOf(""""), ",")
    Next

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

千鲤 2024-11-16 17:31:50

以下是 Tim 的 DecodeCSV 的解决方案

Dim Name As String = ""
Dim UserName As String = ""

Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

For r As Integer = 1 To 10
    Dim DataArr() As String = DecodeCSV(Data) 'Use DecodeCSV function to regex split the string 
    Name = DataArr(0) 'Get First item of array as Name
    UserName = DataArr(1)  'Get Second item of array as UserName 
Next

Great Code

Public Shared Function DecodeCSV(ByVal strLine As String) As String()

    Dim strPattern As String
    Dim objMatch As Match

    ' build a pattern
    strPattern = "^" ' anchor to start of the string
    strPattern += "(?:""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*))"
    strPattern += "(?:,(?:[ \t]*""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*)))*"
    strPattern += "$" ' anchor to the end of the string

    ' get the match
    objMatch = Regex.Match(strLine, strPattern)

    ' if RegEx match was ok
    If objMatch.Success Then
        Dim objGroup As Group = objMatch.Groups("value")
        Dim intCount As Integer = objGroup.Captures.Count
        Dim arrOutput(intCount - 1) As String

        ' transfer data to array
        For i As Integer = 0 To intCount - 1
            Dim objCapture As Capture = objGroup.Captures.Item(i)
            arrOutput(i) = objCapture.Value

            ' replace double-escaped quotes
            arrOutput(i) = arrOutput(i).Replace("""""", """")
        Next

        ' return the array
        Return arrOutput
    Else
        Throw New ApplicationException("Bad CSV line: " & strLine)
    End If

End Function

Following is the solution

Dim Name As String = ""
Dim UserName As String = ""

Dim Data As String = """Smith, John J"",jj802b,Shift,5/29/2011,9:30,17:30,5/29/2011,5/29/2011,Transfer"

For r As Integer = 1 To 10
    Dim DataArr() As String = DecodeCSV(Data) 'Use DecodeCSV function to regex split the string 
    Name = DataArr(0) 'Get First item of array as Name
    UserName = DataArr(1)  'Get Second item of array as UserName 
Next

Great Code for DecodeCSV by Tim

Public Shared Function DecodeCSV(ByVal strLine As String) As String()

    Dim strPattern As String
    Dim objMatch As Match

    ' build a pattern
    strPattern = "^" ' anchor to start of the string
    strPattern += "(?:""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*))"
    strPattern += "(?:,(?:[ \t]*""(?<value>(?:""""|[^""\f\r])*)""|(?<value>[^,\f\r""]*)))*"
    strPattern += "$" ' anchor to the end of the string

    ' get the match
    objMatch = Regex.Match(strLine, strPattern)

    ' if RegEx match was ok
    If objMatch.Success Then
        Dim objGroup As Group = objMatch.Groups("value")
        Dim intCount As Integer = objGroup.Captures.Count
        Dim arrOutput(intCount - 1) As String

        ' transfer data to array
        For i As Integer = 0 To intCount - 1
            Dim objCapture As Capture = objGroup.Captures.Item(i)
            arrOutput(i) = objCapture.Value

            ' replace double-escaped quotes
            arrOutput(i) = arrOutput(i).Replace("""""", """")
        Next

        ' return the array
        Return arrOutput
    Else
        Throw New ApplicationException("Bad CSV line: " & strLine)
    End If

End Function
温暖的光 2024-11-16 17:31:50

根据 CSV 文件的确切内容和格式保证,为了速度和方便,有时使用 split on the , 是解析文件的最简单、最快的方法。 Your Name col 包含一个 ,,它不是分隔符,这增加了一点复杂性,尽管假设名称始终包含 1 ,,处理这种情况仍然很简单。

有一些库可以解析 CSV 文件,这很有用。假设您不需要处理所有符合 CSV 规范的文件,我觉得它们有点矫枉过正。综上所述,您可以使用以下 正则表达式 轻松解析名为的 CSV 文件说服团体:

"(?<名称>[^"]+?)",(?<用户名>[^,]+?),(?<行类型>[^,]+?),(?< ;开始日期>[^,]+?),(?[^,]+?),(?[^,]+?),(?[^,]+?),(?[^,]+? ),(?<类型>\w+)

这将创建命名捕获组,然后您可以使用它们输出到新的 CSV 文件,如下所示:

Dim ResultList As StringCollection = New StringCollection()
Try
    Dim RegexObj As New Regex("""(?<Name>[^""]+?)"",(?<UserName>[^,]+?),(?<RowType>[^,]+?),(?<StartDate>[^,]+?),(?<StartTime>[^,]+?),(?<EndTime>[^,]+?),(?<EndDate>[^,]+?),(?<SegmentStartDate>[^,]+?),(?<Type>\w+)", RegexOptions.IgnoreCase)
    Dim MatchResult As Match = RegexObj.Match(SubjectString)
    While MatchResult.Success
        'Append to new CSV file - MatchResult.Groups("groupname").Value

        'Name = MatchResult.Groups("Name").Value
        'Start Time = MatchResult.Groups("StartTime").Value         
        'End Time = MatchResult.Groups("EndTime").Value
        'Etc...
    End While
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try

请参阅 .NET Framework 正则表达式

Depending on the exact content and guarantee of format of a CSV file, for speed and ease, sometimes using split on the , is the easiest and fastest way to parse the file. Your Name col includes a , that is not a delimiter, which adds a little bit of complication though it is still trivial to handle that case assuming the name always contains 1 ,.

There are libraries to parse CSV files, which can be useful. Assuming you don't need to handle all files that conform to the CSV spec I feel that they are overkill. With all that said you can use the following regular expression to easily parse the CSV file with named groups for convince:

"(?<Name>[^"]+?)",(?<UserName>[^,]+?),(?<RowType>[^,]+?),(?<StartDate>[^,]+?),(?<StartTime>[^,]+?),(?<EndTime>[^,]+?),(?<EndDate>[^,]+?),(?<SegmentStartDate>[^,]+?),(?<Type>\w+)

That will create named capture groups that you can then use to output to your new CSV file like so:

Dim ResultList As StringCollection = New StringCollection()
Try
    Dim RegexObj As New Regex("""(?<Name>[^""]+?)"",(?<UserName>[^,]+?),(?<RowType>[^,]+?),(?<StartDate>[^,]+?),(?<StartTime>[^,]+?),(?<EndTime>[^,]+?),(?<EndDate>[^,]+?),(?<SegmentStartDate>[^,]+?),(?<Type>\w+)", RegexOptions.IgnoreCase)
    Dim MatchResult As Match = RegexObj.Match(SubjectString)
    While MatchResult.Success
        'Append to new CSV file - MatchResult.Groups("groupname").Value

        'Name = MatchResult.Groups("Name").Value
        'Start Time = MatchResult.Groups("StartTime").Value         
        'End Time = MatchResult.Groups("EndTime").Value
        'Etc...
    End While
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try

See .NET Framework Regular Expressions on MSDN for more information.

尬尬 2024-11-16 17:31:50

我想注意一些事情:

  • 一是我正在使用
    TextFieldParser,您可以找到它
    FileIO 命名空间下工作
    使用输入 CSV。这使得
    读取分隔文件更容易
    而不是尝试处理常规问题
    表达式和你自己的解析,
    等等。
  • 另一个是存储
    数据集我正在使用 List(Of
    Dictionary(Of String, String))
    ,或
    相关词典列表
    字符串到其他字符串。本质上
    这与
    DataTable 的访问模式和
    如果你对此感到更舒服
    构建,欢迎您使用它
    反而。词典列表
    行为完全相同并且需要
    设置少了很多,所以这里使用它
    取而代之的是。

我承认其中一些是硬编码的,但如果您需要概括该过程,您可以将某些方面移至应用程序设置和/或更好地分解该功能。这里的重点是给您一个总体概念。该代码内联注释如下:

    ' Create a text parser object
    Dim theParser As New FileIO.TextFieldParser("C:\Path\To\theInput.csv")

    ' Specify that fields are delimited by commas
    theParser.Delimiters = {","}

    ' Specify that strings containing the delimiter are wrapped by quotes
    theParser.HasFieldsEnclosedInQuotes = True

    ' Dimension containers for the field names and the list of data rows
    ' Initialize the field names with the first row r
    Dim theInputFields As String() = theParser.ReadFields(),
        theInputRows As New List(Of Dictionary(Of String, String))()

    ' While there is data to parse
    Do While Not theParser.EndOfData

        ' Dimension a counter and a row container
        Dim i As Integer = 0,
            theRow As New Dictionary(Of String, String)()

        ' For each field
        For Each value In theParser.ReadFields()

            ' Associate the value of that field for the row
            theRow(theInputFields(i)) = value

            ' Increment the count
            i += 1
        Next

        ' Add the row to the list
        theInputRows.Add(theRow)
    Loop

    ' Close the input file for reading
    theParser.Close()

    ' Dimension the list of output field names and a container for the list of formatted output rows
    Dim theOutputFields As New List(Of String) From {"Subject", "Start Date", "Start Time", "End Date", "End Time", "All Day Event", "Description", "Location", "Private"},
        theOutputRows As New List(Of Dictionary(Of String, String))()

    ' For each data row we've extracted from the CSV
    For Each theRow In theInputRows

        ' Dimension a new formatted row for the output
        Dim thisRow As New Dictionary(Of String, String)()

        ' For each field name of the output rows
        For Each theField In theOutputFields

            ' Dimension a container for the value of this field
            Dim theValue As String = String.Empty

            ' Specify ways to get the value of the field based on its name
            ' These are just examples; choose your own method for formatting the output
            Select Case theField

                Case "Subject"
                    ' Output a subject "[Row Type]: [Name]"
                    theValue = theRow("Row Type") & ": " & theRow("Name")

                Case "Description"
                    ' Output a description from the input field [Type]
                    theValue = theRow("Type")

                Case "Start Date", "Start Time", "End Date", "End Time"
                    ' Output the value of the field with a correlated name
                    theValue = theRow(theField)

                Case "All Day Event", "Private"
                    ' Output False by default (you might want to change the case for Private
                    theValue = "False"

                Case "Location"
                    ' Can probably be safely left empty unless you'd like a default value
            End Select

            ' Relate the value we've created to the column in this row
            thisRow(theField) = theValue
        Next

        ' Add the formatted row to the output data
        theOutputRows.Add(thisRow)
    Next

    ' Start building the first line by retriving the name of the first output field
    Dim theHeader As String = theOutputFields.First

    ' For each of the remaining output fields
    For Each theField In (From s In theOutputFields Skip 1)

        ' Append a comma and then the field name
        theHeader = theHeader & "," & theField
    Next

    ' Create a string builder to store the text for the output file, initialized with the header line and a line break
    Dim theOutput As New System.Text.StringBuilder(theHeader & vbNewLine)

    ' For each row in the formatted output rows
    For Each theRow In theOutputRows

        ' Dimension a container for this line of the file, beginning with the value of the column associated with the first output field
        Dim theLine As String = theRow(theOutputFields.First)

        ' Wrap the first value if necessary
        If theLine.Contains(",") Then theLine = """" & theLine & """"

        ' For each remaining output field
        For Each theField In (From s In theOutputFields Skip 1)

            ' Dereference and store the associated column value
            Dim theValue As String = theRow(theField)

            ' Add a comma and the value to the line, wrapped in quotations as needed
            theLine = theLine & "," & If(theValue.Contains(","), """" & theValue & """", theValue)
        Next

        ' Append the line to the output string
        theOutput.AppendLine(theLine)
    Next

    ' Write the formatted output to file
    IO.File.WriteAllText("C:\output.csv", theOutput.ToString)

就其价值而言,使用示例数据似乎可以使用此代码在 OpenOffice.org Calc 中很好地打开输出文件。您希望为字段输出的格式由您决定,因此请修改 Select 中相应的 Case 语句来执行此操作,祝您编码愉快!

A few things I'd like to note:

  • One is that I'm using a
    TextFieldParser, which you can find
    under the FileIO namespace, to work
    with the input CSV. This makes
    reading delimited files a lot easier
    than trying to deal with regular
    expressions and your own parsing,
    etc.
  • The other is that to store the
    data sets I am using a List(Of
    Dictionary(Of String, String))
    , or a
    list of dictionaries that relate
    strings to other strings. Essentially
    this is not much different from the
    access pattern of a DataTable and
    if you are more comfortable with that
    construct, you're welcome to use it
    instead. The list of dictionaries
    behaved exactly the same and required
    a lot less setup so it is used here
    in its stead.

I admit some of this is hard-coded but if you need to generalize the procedure, you can move certain aspects out to application settings and/or better decompose the function. The point here was to give you a general idea. The code is commented inline below:

    ' Create a text parser object
    Dim theParser As New FileIO.TextFieldParser("C:\Path\To\theInput.csv")

    ' Specify that fields are delimited by commas
    theParser.Delimiters = {","}

    ' Specify that strings containing the delimiter are wrapped by quotes
    theParser.HasFieldsEnclosedInQuotes = True

    ' Dimension containers for the field names and the list of data rows
    ' Initialize the field names with the first row r
    Dim theInputFields As String() = theParser.ReadFields(),
        theInputRows As New List(Of Dictionary(Of String, String))()

    ' While there is data to parse
    Do While Not theParser.EndOfData

        ' Dimension a counter and a row container
        Dim i As Integer = 0,
            theRow As New Dictionary(Of String, String)()

        ' For each field
        For Each value In theParser.ReadFields()

            ' Associate the value of that field for the row
            theRow(theInputFields(i)) = value

            ' Increment the count
            i += 1
        Next

        ' Add the row to the list
        theInputRows.Add(theRow)
    Loop

    ' Close the input file for reading
    theParser.Close()

    ' Dimension the list of output field names and a container for the list of formatted output rows
    Dim theOutputFields As New List(Of String) From {"Subject", "Start Date", "Start Time", "End Date", "End Time", "All Day Event", "Description", "Location", "Private"},
        theOutputRows As New List(Of Dictionary(Of String, String))()

    ' For each data row we've extracted from the CSV
    For Each theRow In theInputRows

        ' Dimension a new formatted row for the output
        Dim thisRow As New Dictionary(Of String, String)()

        ' For each field name of the output rows
        For Each theField In theOutputFields

            ' Dimension a container for the value of this field
            Dim theValue As String = String.Empty

            ' Specify ways to get the value of the field based on its name
            ' These are just examples; choose your own method for formatting the output
            Select Case theField

                Case "Subject"
                    ' Output a subject "[Row Type]: [Name]"
                    theValue = theRow("Row Type") & ": " & theRow("Name")

                Case "Description"
                    ' Output a description from the input field [Type]
                    theValue = theRow("Type")

                Case "Start Date", "Start Time", "End Date", "End Time"
                    ' Output the value of the field with a correlated name
                    theValue = theRow(theField)

                Case "All Day Event", "Private"
                    ' Output False by default (you might want to change the case for Private
                    theValue = "False"

                Case "Location"
                    ' Can probably be safely left empty unless you'd like a default value
            End Select

            ' Relate the value we've created to the column in this row
            thisRow(theField) = theValue
        Next

        ' Add the formatted row to the output data
        theOutputRows.Add(thisRow)
    Next

    ' Start building the first line by retriving the name of the first output field
    Dim theHeader As String = theOutputFields.First

    ' For each of the remaining output fields
    For Each theField In (From s In theOutputFields Skip 1)

        ' Append a comma and then the field name
        theHeader = theHeader & "," & theField
    Next

    ' Create a string builder to store the text for the output file, initialized with the header line and a line break
    Dim theOutput As New System.Text.StringBuilder(theHeader & vbNewLine)

    ' For each row in the formatted output rows
    For Each theRow In theOutputRows

        ' Dimension a container for this line of the file, beginning with the value of the column associated with the first output field
        Dim theLine As String = theRow(theOutputFields.First)

        ' Wrap the first value if necessary
        If theLine.Contains(",") Then theLine = """" & theLine & """"

        ' For each remaining output field
        For Each theField In (From s In theOutputFields Skip 1)

            ' Dereference and store the associated column value
            Dim theValue As String = theRow(theField)

            ' Add a comma and the value to the line, wrapped in quotations as needed
            theLine = theLine & "," & If(theValue.Contains(","), """" & theValue & """", theValue)
        Next

        ' Append the line to the output string
        theOutput.AppendLine(theLine)
    Next

    ' Write the formatted output to file
    IO.File.WriteAllText("C:\output.csv", theOutput.ToString)

For what it's worth, using your sample data seemed to result in the output file opening just fine in OpenOffice.org Calc using this code. The format of what you wish to output for the fields is up to you, so modify the appropriate Case statement in the Select to do so, and happy coding!

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文