有 XML 文件需要它来填充多个 SQL 表

发布于 2024-09-12 11:46:19 字数 1384 浏览 5 评论 0原文

我有一个 XML 文件,需要填充多个 SQL 表,我想知道最好的方法是什么。我在考虑数据集或 xslt,但老实说我不确定。这是我生成的 XML(其中一部分),

 <?xml version="1.0" encoding="utf-8" standalone="yes" ?> 
- <!-- Created: 8/3/2010 12:09:15 PM
  --> 
- <Trip>
- <TripDetails>
  <DepartureDate /> 
  <ReturnDate /> 
  <TripTypeA>3</TripTypeA> 
  <TripTypeB>1</TripTypeB> 
  <PurposeOfTrip>vacation</PurposeOfTrip> 
  <Region>5</Region> 
- <Countries>
  <Country>105</Country> 
  <Country>135</Country> 
  </Countries>
- <Cities>
  <City>Cancun</City> 
  <City>Tokyo</City> 
  <City>Mayo</City> 
  </Cities>
  <OverallRating>4</OverallRating> 
  <Suppliers>53</Suppliers> 
  <SuppliersComments>Good flight</SuppliersComments> 
- <Transport>
  <TransportType>1</TransportType> 
  <TransportType>3</TransportType> 
  </Transport>
  <TransportComment>Transportation was fast</TransportComment> 

我有几个需要填充的不同表。(例如,保持简短)

TripDetails (TripID, TripTypeA, TripTypeB, SupplierID, overallRating)
TripCountries (TripCountryID, TripID, CountryCode)

我还有更多表(城市、交通),但如果我能弄清楚如何更新 TripDetails(主表)和TripCountries(这是一个汇集了TripDetails和国家的表)我想我会很好,谢谢!

I have a XML file that I need to populate multiple SQL tables, and I was wondering what the best way to do that is. I was thinking dataset, or xslt but I honestly not sure. Here is my generated XML (part of it)

 <?xml version="1.0" encoding="utf-8" standalone="yes" ?> 
- <!-- Created: 8/3/2010 12:09:15 PM
  --> 
- <Trip>
- <TripDetails>
  <DepartureDate /> 
  <ReturnDate /> 
  <TripTypeA>3</TripTypeA> 
  <TripTypeB>1</TripTypeB> 
  <PurposeOfTrip>vacation</PurposeOfTrip> 
  <Region>5</Region> 
- <Countries>
  <Country>105</Country> 
  <Country>135</Country> 
  </Countries>
- <Cities>
  <City>Cancun</City> 
  <City>Tokyo</City> 
  <City>Mayo</City> 
  </Cities>
  <OverallRating>4</OverallRating> 
  <Suppliers>53</Suppliers> 
  <SuppliersComments>Good flight</SuppliersComments> 
- <Transport>
  <TransportType>1</TransportType> 
  <TransportType>3</TransportType> 
  </Transport>
  <TransportComment>Transportation was fast</TransportComment> 

I have a couple different tables I need populating.(keeping it short for example)

TripDetails (TripID, TripTypeA, TripTypeB, SupplierID, overallRating)
TripCountries (TripCountryID, TripID, CountryCode)

I have a bunch more tables(cities, transport) but if I can figure out how to update TripDetails(the main table) and TripCountries (which is a table that brings together TripDetails, and Countries) I think I will be good, thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

满意归宿 2024-09-19 11:46:19

假设您使用的是 SQL Server,您应该将 XML 解析为 DataTable 并使用 SqlBulkCopy 对象将它们超快速地发送到数据库中。有很多资源可以帮助您了解 SqlBulkCopy。以下是另一个 StackOverflow 问题的最新讨论,可帮助您入门:Sql Server 2008 大型事务调整(700k+ 行/事务)

如果 XML 文件确实很大,则应小心使用哪种解析器。 XDocument 和 XmlDocument 将整个内容加载到内存中。如果文件足够小,比如小于 10MB,那么使用这些解析器应该没问题。


编辑:

下面是如何将 XML 放入 DataTable 的快速模型。它是用 VB 编写的,因为 VB 使 XML 变得更容易一些。

Option Strict On : Option Explicit On : Option Infer On : Option Compare Binary

Imports System.Data
Imports System.Linq
Imports System.Xml.Linq

Module Module1

    Sub Main()
      Dim xml =
         <Trip>
            <TripDetails id="1">
               <DepartureDate/> 
               <ReturnDate/> 
               <TripTypeA>3</TripTypeA> 
               <TripTypeB>1</TripTypeB> 
               <PurposeOfTrip>vacation</PurposeOfTrip> 
               <Region>5</Region> 
               <Countries>
                  <Country>105</Country> 
                  <Country>135</Country> 
               </Countries>
               <Cities>
                  <City>Cancun</City> 
                  <City>Tokyo</City> 
                  <City>Mayo</City> 
               </Cities>
               <OverallRating>4</OverallRating> 
               <Suppliers>53</Suppliers> 
               <SuppliersComments>Good flight</SuppliersComments> 
               <Transport>
                  <TransportType>1</TransportType> 
                  <TransportType>3</TransportType> 
               </Transport>
               <TransportComment>Transportation was fast</TransportComment>
            </TripDetails>
         </Trip>

         Dim dtTripDetails As New DataTable()
         With dtTripDetails.Columns
            .Add("TripID", GetType(Integer))
            .Add("TripTypeA", GetType(Integer))
            .Add("DepartureDate", GetType(DateTime))
            .Add("TransportComment", GetType(String))
         End With

         Dim dtTripDetailXrefCountries As New DataTable()
         With dtTripDetailXrefCountries.Columns
            .Add("TripID", GetType(Integer))
            .Add("CountryID", GetType(Integer))
         End With

         Dim xdetails = From td In xml.Descendants("TripDetails") Select td
         For Each xdetailRecord As XElement In xdetails
            Dim tripID As Integer = CInt(xdetailRecord.Attribute("id").Value)
            Dim tripTypeA As Integer = CInt(xdetailRecord.Element("TripTypeA").Value)
            Dim strDepDate As String = xdetailRecord.Element("DepartureDate").Value
            Dim depDate As Object = If(String.IsNullOrEmpty(strDepDate), CType(DBNull.Value, Object), CType(DateTime.Parse(strDepDate), Object))
            Dim transportComment As String = xdetailRecord.Element("TransportComment").Value
            dtTripDetails.Rows.Add(tripID, tripTypeA, depDate, transportComment)

            Dim xcountries = From c In xdetailRecord.Element("Countries").Elements("Country") Select c
            For Each xcountryRecord As XElement In xcountries
               Dim countryID As Integer = CInt(xcountryRecord.Value)
               dtTripDetailXrefCountries.Rows.Add(tripID, countryID)
            Next
         Next

         Console.WriteLine("Done")
         Console.ReadKey(True)

    End Sub

End Module

顺便说一句 - 在进行此类 ETL 时,最好先将数据放入暂存表中,而不是直接放入生产表中。这样,您就可以验证数据类型并确保引用完整性并处理密钥管理,并使所有内容都完美定位,而不会锁定或污染生产表。

Assuming you're using SQL Server, you should parse the XML into DataTables and use the SqlBulkCopy object to shoot them into the database super-fast. There are lots of resources to help you learn about SqlBulkCopy. Here's a recent discussion from another StackOverflow question to get you started: Sql Server 2008 Tuning with large transactions (700k+ rows/transaction)

If the XML file is really large, you should be careful what sort of parser you use. XDocument and XmlDocument load the whole thing into memory. If the files are small enough, say under 10MB, you should be fine using those parsers.


EDIT:

Here's a quick mock-up of how you could get the XML into DataTables. It's in VB since VB makes XML a tad easier.

Option Strict On : Option Explicit On : Option Infer On : Option Compare Binary

Imports System.Data
Imports System.Linq
Imports System.Xml.Linq

Module Module1

    Sub Main()
      Dim xml =
         <Trip>
            <TripDetails id="1">
               <DepartureDate/> 
               <ReturnDate/> 
               <TripTypeA>3</TripTypeA> 
               <TripTypeB>1</TripTypeB> 
               <PurposeOfTrip>vacation</PurposeOfTrip> 
               <Region>5</Region> 
               <Countries>
                  <Country>105</Country> 
                  <Country>135</Country> 
               </Countries>
               <Cities>
                  <City>Cancun</City> 
                  <City>Tokyo</City> 
                  <City>Mayo</City> 
               </Cities>
               <OverallRating>4</OverallRating> 
               <Suppliers>53</Suppliers> 
               <SuppliersComments>Good flight</SuppliersComments> 
               <Transport>
                  <TransportType>1</TransportType> 
                  <TransportType>3</TransportType> 
               </Transport>
               <TransportComment>Transportation was fast</TransportComment>
            </TripDetails>
         </Trip>

         Dim dtTripDetails As New DataTable()
         With dtTripDetails.Columns
            .Add("TripID", GetType(Integer))
            .Add("TripTypeA", GetType(Integer))
            .Add("DepartureDate", GetType(DateTime))
            .Add("TransportComment", GetType(String))
         End With

         Dim dtTripDetailXrefCountries As New DataTable()
         With dtTripDetailXrefCountries.Columns
            .Add("TripID", GetType(Integer))
            .Add("CountryID", GetType(Integer))
         End With

         Dim xdetails = From td In xml.Descendants("TripDetails") Select td
         For Each xdetailRecord As XElement In xdetails
            Dim tripID As Integer = CInt(xdetailRecord.Attribute("id").Value)
            Dim tripTypeA As Integer = CInt(xdetailRecord.Element("TripTypeA").Value)
            Dim strDepDate As String = xdetailRecord.Element("DepartureDate").Value
            Dim depDate As Object = If(String.IsNullOrEmpty(strDepDate), CType(DBNull.Value, Object), CType(DateTime.Parse(strDepDate), Object))
            Dim transportComment As String = xdetailRecord.Element("TransportComment").Value
            dtTripDetails.Rows.Add(tripID, tripTypeA, depDate, transportComment)

            Dim xcountries = From c In xdetailRecord.Element("Countries").Elements("Country") Select c
            For Each xcountryRecord As XElement In xcountries
               Dim countryID As Integer = CInt(xcountryRecord.Value)
               dtTripDetailXrefCountries.Rows.Add(tripID, countryID)
            Next
         Next

         Console.WriteLine("Done")
         Console.ReadKey(True)

    End Sub

End Module

BTW - when doing this kind of ETL, it's best to pump your data into staging tables first rather than directly into your production tables. That way, you can validate data types and ensure referential integrity and handle key management and get everything perfectly situated without locking up or polluting your production tables.

西瑶 2024-09-19 11:46:19

您可以非常轻松地使用 SQL Server 2005 的 XQuery 功能,例如,拥有一个接受这些 XML 字符串作为输入的存储过程:

CREATE PROCEDURE dbo.StoreTripDetails(@XmlContent XML)
AS BEGIN
   DECLARE @NewTripID INT

   INSERT INTO dbo.TripDetails(TripTypeA, TripTypeB, SupplierID, overallRating)
       SELECT
        Trip.Details.value('(TripTypeA)[1]', 'int') 'TripTypeA',
        Trip.Details.value('(TripTypeB)[1]', 'int') 'TripTypeB',
        Trip.Details.value('(Suppliers)[1]', 'int') 'SupplierID',
        Trip.Details.value('(OverallRating)[1]', 'int') 'OverallRating'
       FROM
        @XmlContent.nodes('/Trip/TripDetails') AS Trip(Details)

    SELECT @NewTripID = SCOPE_IDENTITY()

    INSERT INTO dbo.TripCountries(TripID, CountryCode)
        SELECT
        @NewTripID, Trip.Countries.value('(.)[1]', 'int') 
        FROM
        @XmlContent.nodes('/Trip/TripDetails/Countries/Country') AS Trip(Countries)
 END

然后您可以非常轻松地从 C# 代码中调用该存储过程,并传入代表一次行程的 XML 块。

You could use SQL Server 2005's XQuery capabilities pretty easily, e.g. have a stored proc that accepts on of those XML strings as input:

CREATE PROCEDURE dbo.StoreTripDetails(@XmlContent XML)
AS BEGIN
   DECLARE @NewTripID INT

   INSERT INTO dbo.TripDetails(TripTypeA, TripTypeB, SupplierID, overallRating)
       SELECT
        Trip.Details.value('(TripTypeA)[1]', 'int') 'TripTypeA',
        Trip.Details.value('(TripTypeB)[1]', 'int') 'TripTypeB',
        Trip.Details.value('(Suppliers)[1]', 'int') 'SupplierID',
        Trip.Details.value('(OverallRating)[1]', 'int') 'OverallRating'
       FROM
        @XmlContent.nodes('/Trip/TripDetails') AS Trip(Details)

    SELECT @NewTripID = SCOPE_IDENTITY()

    INSERT INTO dbo.TripCountries(TripID, CountryCode)
        SELECT
        @NewTripID, Trip.Countries.value('(.)[1]', 'int') 
        FROM
        @XmlContent.nodes('/Trip/TripDetails/Countries/Country') AS Trip(Countries)
 END

and then you could call this stored proc from your C# code quite easily and pass in a chunk of XML representing one trip.

心的憧憬 2024-09-19 11:46:19

您需要指定您正在使用哪个数据库引擎。由于您已指定 Linq 标记,我假设您正在使用 Microsoft SQL Server。

您可以轻松地将文档作为字符串传递给存储过程。在此存储过程中,您定义一些临时表架构并使用 sp_xml_preparedocument、OPENXML(和相关函数)用 XML 文档中的数据填充这些临时表。现在,您只需使用标准 SQL,使用您需要的任何投影、联接或其他转换,将临时表插入到永久表中。大多数时候内存也不再是问题,因为这一切都在 MSSQL Server 内完成。

更多信息:
http://msdn.microsoft.com/en-us /library/aa260385(SQL.80).aspx
http://msdn.microsoft.com/en-us /library/aa276847(SQL.80).aspx

示例(未经测试):

declare @docHandle int
declare @xmlContent varchar(500)
set @doc ='<root><items><item value="1"/><item value="2"/></item></root>'

exec sp_xml_preparedocument @docHandle OUTPUT, @xmlContent

insert into myTable ([ItemValue])
select value from OPENXML(@docHandle, '/root/items/item')

exec sp_xml_removedocument @docHandle 

请注意,这些是 SQL 2000 约定。在 SQL 2005 中,存在等效的 T-SQL 约定,它们比这些存储过程前身更具可读性。然而,我相信这些约定在 SQL 2005 和 SQL 2008 中仍然可用。

You need to specify which database engine you're using. Since you've specified the Linq tag, I assume you're using Microsoft SQL server.

You can easily pass the document as a string to a stored procedure. In this stored procedure, you define some temp table schemas and use sp_xml_preparedocument, OPENXML (and related functions) to populate these temp tables with the data from the XML document. Now you can just use standard SQL to insert into your permanent tables from the temp tables using any projections, joins or other transformations you need. Memory is also no longer an issue most of the time, since it is all done inside MSSQL Server.

More info:
http://msdn.microsoft.com/en-us/library/aa260385(SQL.80).aspx
http://msdn.microsoft.com/en-us/library/aa276847(SQL.80).aspx

Example (untested):

declare @docHandle int
declare @xmlContent varchar(500)
set @doc ='<root><items><item value="1"/><item value="2"/></item></root>'

exec sp_xml_preparedocument @docHandle OUTPUT, @xmlContent

insert into myTable ([ItemValue])
select value from OPENXML(@docHandle, '/root/items/item')

exec sp_xml_removedocument @docHandle 

Note that these are SQL 2000 conventions. In SQL 2005, there are equivalent T-SQL conventions which are more readable than these stored procedure predecessors. However, these conventions are still available in SQL 2005 and SQL 2008, I believe.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文