Unsure why a conditional column transformation isn't identifying the values to update (Text.Contains)
I have a table of people records with various demographic information (Race, Ethnicity, Gender, etc.).
For null values in [Ethnicity] (i.e., Hispanic Y/N), I want to search the corresponding [Race] value for the string "non-Hispanic", since for some records the two have been stored as a combined value under [Race] (e.g., "White (non-Hispanic)"). I'd like to clean/normalize both fields ([Race] is cleaned in a separate downstream step).
However, I'm unsure why my code is not identifying matches for the first two conditions, since I know there are many instances of "White (non-Hispanic)" at the very least:
cleanData =
Table.ReplaceValue(rawData, each [Ethnicity], each
if [Ethnicity] = null and (
Text.Contains([Race],"non-Hispanic", Comparer.OrdinalIgnoreCase) or
Text.Contains([Race],"not Hispanic", Comparer.OrdinalIgnoreCase))
then "Non-hispanic" else
if [Ethnicity] = null and
Text.Contains([Race], "hispanic", Comparer.OrdinalIgnoreCase)
then "Hispanic" else
[Ethnicity], Replacer.ReplaceText, {"Ethnicity"}
),
Both fields are type Text, and I'm not hitting an error, just a lack of the expected behavior: the null values in [Ethnicity] are unchanged.
Sample input:
Race | Ethnicity |
---|---|
White | Yes |
Asian | No |
White (non-Hispanic) | Decline to respond |
White (non-Hispanic) | null |
White (Hispanic) | null |
Asian | null |
Sample output:
Race | Ethnicity |
---|---|
White | Yes |
Asian | No |
White (non-Hispanic) | Decline to respond |
White (non-Hispanic) | No |
White (Hispanic) | Yes |
Asian | null |
Comments (2)
It's the Replacer.ReplaceText, which should be Replacer.ReplaceValue.
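To make the suggested change concrete, here is a minimal sketch of the question's step with only the replacer swapped. The inline `rawData` table is a hypothetical stand-in built from the question's sample input, and the sketch assumes [Race] itself is never null. The likely reason for the difference is that Replacer.ReplaceText performs a text substitution and leaves null cells alone, while Replacer.ReplaceValue swaps the whole cell when it equals the old value, and null = null is true in M.

```powerquery
let
    // Hypothetical sample data mirroring the question's sample input
    rawData = Table.FromRecords(
        {
            [Race = "White", Ethnicity = "Yes"],
            [Race = "Asian", Ethnicity = "No"],
            [Race = "White (non-Hispanic)", Ethnicity = "Decline to respond"],
            [Race = "White (non-Hispanic)", Ethnicity = null],
            [Race = "White (Hispanic)", Ethnicity = null],
            [Race = "Asian", Ethnicity = null]
        },
        type table [Race = nullable text, Ethnicity = nullable text]
    ),
    cleanData = Table.ReplaceValue(
        rawData,
        // Old value: the current cell, so every row is a replacement candidate
        each [Ethnicity],
        // New value: fill nulls from [Race], otherwise keep the cell as it is
        each
            if [Ethnicity] = null and (
                Text.Contains([Race], "non-Hispanic", Comparer.OrdinalIgnoreCase)
                or Text.Contains([Race], "not Hispanic", Comparer.OrdinalIgnoreCase)
            ) then "Non-hispanic"
            else if [Ethnicity] = null
                and Text.Contains([Race], "hispanic", Comparer.OrdinalIgnoreCase) then "Hispanic"
            else [Ethnicity],
        // Whole-cell replacement; unlike ReplaceText, this also acts on null cells
        Replacer.ReplaceValue,
        {"Ethnicity"}
    )
in
    cleanData
```

With this sample, the two null rows whose [Race] mentions Hispanic should be filled in, and the "Asian" row with a null [Ethnicity] should stay null. Note that the question's code writes "Non-hispanic"/"Hispanic" rather than the "No"/"Yes" shown in its sample output; adjust the literals to whichever convention you want.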
or
@horseyride you can abstract it to use a list of any size
then you can write
Here's another way using Table.AddColumn, then rename. That might sound worse than replacing a column in place. After testing, I was getting the same final folded SQL query using either method.
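The commenter's own code did not survive on this page, so the following is only a sketch of what the Table.AddColumn-then-rename approach with a keyword list of any size could look like. `ContainsAny`, the two marker lists, the `EthnicityClean` column name, and the inline `rawData` table are all hypothetical names introduced for illustration.

```powerquery
let
    // Hypothetical sample data mirroring the question's sample input
    rawData = Table.FromRecords(
        {
            [Race = "White (non-Hispanic)", Ethnicity = "Decline to respond"],
            [Race = "White (non-Hispanic)", Ethnicity = null],
            [Race = "White (Hispanic)", Ethnicity = null],
            [Race = "Asian", Ethnicity = null]
        },
        type table [Race = nullable text, Ethnicity = nullable text]
    ),
    // Hypothetical helper: true when source contains any keyword (case-insensitive)
    ContainsAny = (source as nullable text, keywords as list) as logical =>
        source <> null and List.AnyTrue(
            List.Transform(
                keywords,
                (kw) => Text.Contains(source, kw, Comparer.OrdinalIgnoreCase)
            )
        ),
    // Keyword lists can grow to any size without touching the logic below
    nonHispanicMarkers = {"non-Hispanic", "not Hispanic"},
    hispanicMarkers = {"hispanic"},
    added = Table.AddColumn(
        rawData,
        "EthnicityClean",
        each
            if [Ethnicity] <> null then [Ethnicity]
            // Check the negated markers first, since they also contain "hispanic"
            else if ContainsAny([Race], nonHispanicMarkers) then "Non-hispanic"
            else if ContainsAny([Race], hispanicMarkers) then "Hispanic"
            else null,
        type nullable text
    ),
    // Drop the original column and give the new one its name
    removed = Table.RemoveColumns(added, {"Ethnicity"}),
    cleanData = Table.RenameColumns(removed, {{"EthnicityClean", "Ethnicity"}})
in
    cleanData
```

The order of the checks matters: "non-Hispanic" contains "hispanic", so the negated markers have to be tested before the plain one. Whether a custom function like this still folds to SQL depends on the data source, so it is worth checking the native query after adding it.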