我有一个Excel文件.xlsx,我将从客户端获得,在某些行数据之后,它将有一个空白的行,然后每次都有一个通用的注释,该短语的起始作品始终相同,而不是整个短语。
我只想直到第5行(包括标头为列)执行SSIS,并且不想处理第6和7行,并且每次数据都会更改时,因此在这种情况下我什至无法进行范围。
我的灵活性要么在PREM SSIS或ADF上进行。
尝试使用过滤器,但由于单元格中的文本在4-5行中拆分时,它只忽略了第一行。
感谢任何帮助。
I have an excel file .xlsx which i will get from client and after certain rows data it will have a blank row and then a generic comment everytime and the starting work of that phrase will be same all the time and not the whole phrase.
I want to execute ssis only till row 5 including header as columns and not want to process line 6 and 7 and every time data changes so i can't even take a range in this case.
I have the flexibility do it either in on prem SSIS or ADF.
Tried using filters but as the text in the cell is splitted in 4-5 lines it only ignores the 1st line.
I would appreciate any help.


发布评论
评论(1)
使用 Azure Data Factory 数据流,您可以使用过滤器转换来根据条件忽略某些行。
示例:
excel输入:
adf数据流:
excel数据集
。如果您的第一行不是标题,请不要启用第一行作为数据集中的标题
属性。默认情况下,在阅读时将跳过空行。source
转换并将其连接到Excel数据集。如果数据中没有标题行,则将默认列名(例如 col0 )提供给您的数据。添加
过滤器
源后转换,以滤除不需要的行。expression :
不(例如({_ col0 _},'this%')))
使用 ssis ,在您的Excel源中,您可以使用SQL命令并编写查询来从Excel中提取数据。如果您有任何强制性列,请使用条件的过滤器,以便从文件中拉不为空行(例如:从[Sheet1 $]中选择 * [column1]不是null)。或使用
条件拆分
任务从Excel源过滤所需的行。Using Azure data factory data flow, you can use filter transformation to ignore certain rows based on conditions.
Example:
Excel input:
ADF Data flow:
excel dataset
. If your first row is not a header do not enable theFirst row as header
property in the dataset. Here by default, the empty rows will be skipped while reading.source
transformation and connect it to the excel dataset. If you do not have a header row in the data, the default column names (like col0) were given to your data.Add
filter
transformation after source, to filter out the unwanted rows.Expression:
not(like({_col0_},'This%'))
Filter output:
Using SSIS, in your Excel source, you can use the SQL command and write a query to pull the data from excel. If you have any mandatory columns, use filters with where condition to pull not null rows from the file (ex: SELECT * FROM [Sheet1$] WHERE [column1] IS NOT NULL). Or use the
conditional split
task to filter the required rows from the excel source.