C# sql-server-2005 state-machine data-import

Importing and parsing a text file containing PCL: ASP.NET C# technology recommendations?

Posted on 2024-10-26 10:35:13 · 1254 characters · 8 views · 0 comments

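I need to scrape an old mainframe text file containing Printer Control Language (PCL) for a data import. Altering the mainframe functions isn't an option. The printout contains product sales information, and its output is hierarchical.

My hope is to set up a SQL Server Integration Services (SSIS) import. Ultimately this will be a data-import ASP.NET MVC 3 website with a SQL 2005 database, so we could avoid SSIS. I currently build C# ASP.NET MVC 3 websites, so using related technologies should be manageable.

Has anyone succeeded in parsing a text report back into a useful data import with text patterns (like regular expressions) in C# or SSIS? Are there any examples out there that use the State design pattern?

Most of the answers I find cover only a small part of the problem: how to load a text file and take the nth column in C#. This is more involved. I need to identify each line type with a pattern that depends on which import state I am in. Off-the-shelf software would be even better.

Text file example: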

this part may be a header for the page which needs skipped
this part may be a header for the page which needs skipped
this part may be a header for the page which needs skipped

first line containing prices
  second line containing product description for the first line
    third line containing a related product (listing all flavors)
      fourth line containing a description for the third line
    [third and fourth may repeat]
  [product set summary line]
[ repeat for next product]

this part may be a footer for the page that needs skipped
this part may be a footer for the page that needs skipped

at any point, the products will span between pages, 
having header and footer lines between product data.
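
To make the "identify each line type based on the current import state" idea concrete, here is a rough sketch of what I have in mind. It is an enum-driven state machine, not a full State-pattern class hierarchy, and everything specific in it is an assumption made for illustration: the `PAGE`/`REPORT` header markers, the `TOTAL` summary marker, the 2/4/6-space indentation, and the names `ReportParser`, `Product` and `RelatedProduct`. It also assumes the PCL escape sequences have already been stripped from the lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class RelatedProduct
{
    public string Line;
    public string Description;
}

public sealed class Product
{
    public string PriceLine;
    public string Description;
    public string Summary;
    public readonly List<RelatedProduct> Related = new List<RelatedProduct>();
}

public static class ReportParser
{
    private enum State { ExpectPrice, ExpectDescription, ExpectRelatedOrSummary, ExpectRelatedDescription }

    // All patterns below are hypothetical; the real report layout will differ.
    private static readonly Regex PageNoise = new Regex(@"^\s*$|^(PAGE|REPORT)\b|^\f");
    private static readonly Regex PricePattern = new Regex(@"^\S.*\d+\.\d{2}");
    private static readonly Regex DescriptionPattern = new Regex(@"^ {2}\S");
    private static readonly Regex RelatedPattern = new Regex(@"^ {4}\S");
    private static readonly Regex RelatedDescriptionPattern = new Regex(@"^ {6}\S");
    private static readonly Regex SummaryPattern = new Regex(@"^ {2}TOTAL\b");

    public static List<Product> Parse(IEnumerable<string> lines)
    {
        var products = new List<Product>();
        Product current = null;
        State state = State.ExpectPrice;
        int lineNo = 0;

        foreach (string line in lines)
        {
            lineNo++;

            // Blank lines and page headers/footers are skipped in every state
            // without changing it, so a product may span a page break.
            if (PageNoise.IsMatch(line)) continue;

            switch (state)
            {
                case State.ExpectPrice:
                    if (!PricePattern.IsMatch(line)) throw Unexpected(lineNo, state, line);
                    current = new Product { PriceLine = line.Trim() };
                    state = State.ExpectDescription;
                    break;

                case State.ExpectDescription:
                    if (!DescriptionPattern.IsMatch(line)) throw Unexpected(lineNo, state, line);
                    current.Description = line.Trim();
                    state = State.ExpectRelatedOrSummary;
                    break;

                case State.ExpectRelatedOrSummary:
                    if (SummaryPattern.IsMatch(line))
                    {
                        // The summary line closes the current product.
                        current.Summary = line.Trim();
                        products.Add(current);
                        state = State.ExpectPrice;
                    }
                    else if (RelatedPattern.IsMatch(line))
                    {
                        current.Related.Add(new RelatedProduct { Line = line.Trim() });
                        state = State.ExpectRelatedDescription;
                    }
                    else
                    {
                        throw Unexpected(lineNo, state, line);
                    }
                    break;

                case State.ExpectRelatedDescription:
                    if (!RelatedDescriptionPattern.IsMatch(line)) throw Unexpected(lineNo, state, line);
                    current.Related[current.Related.Count - 1].Description = line.Trim();
                    state = State.ExpectRelatedOrSummary;
                    break;
            }
        }

        if (state != State.ExpectPrice)
            throw new FormatException("Input ended in the middle of a product.");
        return products;
    }

    private static FormatException Unexpected(int lineNo, State state, string line)
    {
        return new FormatException(
            string.Format("Line {0}: unexpected text in state {1}: '{2}'", lineNo, state, line));
    }
}
```

With something like this, the file could be fed in lazily via `ReportParser.Parse(File.ReadLines(path))`, where `path` is a placeholder for the report location. Note that the summary and description patterns overlap on purpose in this sketch: the same two-space indent means different things depending on the state, which is exactly the part I am unsure how to structure well.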

