SSIS script component: removing '\0' characters from CHAR(n) fields
I'm currently working on a database that has '\0' characters in some of its fields.
For instance, the field
Category CHAR(4)
sometimes has the value '\0\0\0\0' (4 NUL characters) and sometimes '    ' (4 blank characters).
I want to use a script component to identify all the fields with this problem.
I've written the following script, but it doesn't work, since C# converts '\0\0\0\0' to an empty string.
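    // Requires: using System.Reflection; using System.Text;
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        Type rowType = Row.GetType();
        foreach (IDTSInputColumn100 column in ComponentMetaData.InputCollection[0].InputColumnCollection)
        {
            // The generated buffer property name is the column name without underscores.
            PropertyInfo columnValue = rowType.GetProperty(column.Name.Replace("_", ""));
            if (columnValue == null)
            {
                continue; // no matching property on the buffer class
            }

            Object obj = columnValue.GetValue(Row, null);
            if (obj is string)
            {
                string s = (string)obj;
                StringBuilder sb = new StringBuilder();
                foreach (char c in s)
                {
                    // Replace every control character (including '\0') with a space.
                    if (c < ' ')
                    {
                        sb.Append(' ');
                    }
                    else
                    {
                        sb.Append(c);
                    }
                }
                columnValue.SetValue(Row, sb.ToString(), null);
            }
        }
    }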
Is it possible to read the field as a byte array instead of a string, so that the '\0' characters can be identified?
Comments (2)
Do you need to do this in a script component? Could you not use a Derived Column task and use Substring to pull out each of the column values? You could use Trim as well to remove any spaces.
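If you do stay inside a script component, the trimming idea in this comment could look roughly like the C# sketch below. It is only an illustration: `NulCleaner` and `Normalize` are hypothetical names, and it assumes the NUL characters actually reach the script as part of the string, which the question suggests may not be the case.

```csharp
using System;

public static class NulCleaner
{
    // Hypothetical helper: replaces NUL characters with spaces,
    // then trims the trailing blank padding.
    public static string Normalize(string value)
    {
        if (value == null)
        {
            return null;
        }
        return value.Replace('\0', ' ').TrimEnd(' ');
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("AB\0\0").Length);    // 2
        Console.WriteLine(Normalize("\0\0\0\0").Length);  // 0
    }
}
```

After normalization, a field that held only NULs and one that held only blanks both end up as an empty string, so this is useful for cleaning the data but not for telling the two cases apart.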
Consider converting the char(4) to a binary representation (using Data Conversion component), then manipulating the individual elements from there. If possible, cast it in your source query, so that it's already binary before it enters the pipeline.
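Once the column arrives as binary, telling NUL padding from blank padding is a matter of inspecting the bytes. The sketch below is a minimal illustration under some assumptions: the binary column surfaces in the script as a `byte[]`, the source uses a single-byte code page where a blank is 0x20, and `PaddingInspector` and `Classify` are hypothetical names rather than part of any SSIS API.

```csharp
using System;
using System.Linq;

public static class PaddingInspector
{
    // Hypothetical helper: describes what kind of padding a raw CHAR(n) value holds.
    public static string Classify(byte[] raw)
    {
        if (raw == null || raw.Length == 0)
        {
            return "empty";
        }
        if (raw.All(b => b == 0x00))
        {
            return "all NUL";
        }
        if (raw.All(b => b == 0x20))
        {
            return "all blank";
        }
        if (Array.IndexOf(raw, (byte)0x00) >= 0)
        {
            return "contains NUL";
        }
        return "clean";
    }

    public static void Main()
    {
        Console.WriteLine(Classify(new byte[] { 0x00, 0x00, 0x00, 0x00 })); // all NUL
        Console.WriteLine(Classify(new byte[] { 0x20, 0x20, 0x20, 0x20 })); // all blank
        Console.WriteLine(Classify(new byte[] { 0x41, 0x42, 0x00, 0x00 })); // contains NUL
    }
}
```

Working on bytes avoids any string conversion, which is where the question reports the NUL characters being lost.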