比较来自不同数据集SAS的数据
我有两个表table_a和table_b。
table_a
queue_nm | cust_nm |
---|---|
abcd | veri |
pqrs | rstq |
table_b
queue_nm | act_cust_nm |
---|---|
pqrs pqrs | tmob |
我想比较两个数据集,如果它们不匹配,则在saS中替换saas中的table_b in table_a中的数据,
data compare_test_2;
set table_A;
set table_B;
if table_A.Queue_nm = table_B.Queue_nm
and tableA.Cust_nm != table_B.Act_Cust_name
then do;
tableA.Cust_nm = table_B.Act_Cust_name
where table_A.Queue_nm = table_B.Queue_nm;
run;
我希望使用以下输出后,我希望在比较之后输出
。
queue_nm | cust_nm |
---|---|
abcd | veri |
pqrs | tmob |
i得到以下错误
ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase.
ERROR 557-185: Variable table_A is not an object.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以使用SQL使用密钥变量通过相关子查询来更新来自第二个表的值的表。
示例:
更新到位表的另一种方法是
修改
语句。You can use SQL to update a table with values from a second table via a correlated subquery using a key variable.
Example:
Another way to update a table in place is the
MODIFY
statement.您尚未书写有效的SAS数据步骤代码。您的代码更像SQL。校正的代码在下面,但它可以变得比这更简单。
在数据步骤中,
in =
选项等于a.queue_nm
在SQL中,因为我们是通过queue_nm
合并的。将其视为隐含的版本。我们可以通过重命名
ACT_CUST_NM
将其进一步简化,CUST_NM
intable> table_b
并覆盖CUST_NM
Intable_a
带有table_b
的值。由于table_b
在Merge语句中排名第二,因此它将覆盖QUEUE_NM
intable> table_a
中的匹配值。当您有一对一或多一对众多的合并时,其工作原理的规则会变得更加复杂。我强烈建议您阅读有关其工作原理的更多信息:
You have not written valid SAS data step code. Your code is more SQL-like. The corrected code is below, but it can get even simpler than this.
In the data step, the
in=
option is equivalent toa.queue_nm
in SQL since we are merging byqueue_nm
. Think of it like an implied version of that.We can simplify this even further by renaming
act_cust_nm
tocust_nm
intable_b
and overwriting any instance ofcust_nm
intable_a
with the value fromtable_b
. Sincetable_b
is second in the merge statement, it will overwrite matching values ofqueue_nm
intable_a
.The rules of how this works gets a little more complex when you have a one-to-many or many-to-many merge. I highly recommend reading more about how that works:
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1i8w2bwu1fn5kn1gpxj18xttbb0.htm
只是合并。确保两个变量在两个数据集中使用相同的名称。如果需要,则可以使用数据集选项将变量重命名为数据步骤。
如果您不想保留仅来自TableB的观察值,则使用In =数据集选项创建一个变量,该变量将指示tablea是否有助于观察。
如果两个数据集中每个Queue_NM都有多个观察值,则使用合并将无效。
Just merge. Make sure that both variables use the same name in both datasets. If you need to you can use dataset option to rename variables before they make it into the data step.
If you do not want to keep observations that only came from TABLEB then use the IN= dataset option to create a variable that will indicate if TABLEA is contributing to the observation.
If you have multiple observations per QUEUE_NM in either dataset then using MERGE will not work.