比较来自不同数据集SAS的数据

发布于 2025-02-07 19:14:06 字数 1151 浏览 2 评论 0 原文

我有两个表table_a和table_b。

table_a

queue_nm cust_nm
abcd veri
pqrs rstq

table_b

queue_nm act_cust_nm
pqrs pqrs tmob

我想比较两个数据集,如果它们不匹配,则在saS中替换saas中的table_b in table_a中的数据,

data compare_test_2;
set table_A;
set table_B;

if table_A.Queue_nm = table_B.Queue_nm
and tableA.Cust_nm != table_B.Act_Cust_name
then do;
tableA.Cust_nm = table_B.Act_Cust_name
where table_A.Queue_nm = table_B.Queue_nm;
run;

我希望使用以下输出后,我希望在比较之后输出

queue_nm cust_nm
abcd veri
pqrs tmob

i得到以下错误

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
ERROR 557-185: Variable table_A is not an object.

I have two tables table_A and table_B.

table_A

Queue_nm Cust_nm
ABCD VERI
PQRS RSTQ

table_B

Queue_nm Act_Cust_nm
PQRS TMOB

I want to compare the two datasets and if they don't match, then replace the data in table_B in table_A using SAS

data compare_test_2;
set table_A;
set table_B;

if table_A.Queue_nm = table_B.Queue_nm
and tableA.Cust_nm != table_B.Act_Cust_name
then do;
tableA.Cust_nm = table_B.Act_Cust_name
where table_A.Queue_nm = table_B.Queue_nm;
run;

I want the following output after the comparison and the data step

table_A

Queue_nm Cust_nm
ABCD VERI
PQRS TMOB

I get the following error

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
ERROR 557-185: Variable table_A is not an object.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

桃扇骨 2025-02-14 19:14:06

您可以使用SQL使用密钥变量通过相关子查询来更新来自第二个表的值的表。

示例:

proc sql;
  update table_a as outer
  set cust_nm = (select act_cust_nm from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  where exists (select * from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  ;

更新到位表的另一种方法是修改语句。

proc datasets nolist lib=work;
  modify table_a;
  create index Queue_nm;
quit;

data table_a;
  set table_b;
  modify table_a key=Queue_nm;
  cust_num = act_cust_num;
run;

You can use SQL to update a table with values from a second table via a correlated subquery using a key variable.

Example:

proc sql;
  update table_a as outer
  set cust_nm = (select act_cust_nm from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  where exists (select * from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  ;

Another way to update a table in place is the MODIFY statement.

proc datasets nolist lib=work;
  modify table_a;
  create index Queue_nm;
quit;

data table_a;
  set table_b;
  modify table_a key=Queue_nm;
  cust_num = act_cust_num;
run;

黑寡妇 2025-02-14 19:14:06

您尚未书写有效的SAS数据步骤代码。您的代码更像SQL。校正的代码在下面,但它可以变得比这更简单。

data want;
    merge table_a(in=a)
          table_b(in=b)
    ;
    by queue_nm;

    if(a = b AND cust_nm NE act_cust_nm) then cust_nm = act_cust_nm;

    drop act_cust_nm;
run;

在数据步骤中, in = 选项等于 a.queue_nm 在SQL中,因为我们是通过 queue_nm 合并的。将其视为隐含的版本。

我们可以通过重命名 ACT_CUST_NM 将其进一步简化, CUST_NM in table> table_b 并覆盖 CUST_NM In table_a 带有 table_b 的值。由于 table_b 在Merge语句中排名第二,因此它将覆盖 QUEUE_NM in table> table_a 中的匹配值。

data want;
    merge table_a
          table_b(rename=(act_cust_nm = cust_nm) )
    ;
    by queue_nm;
run;

当您有一对一或多一对众多的合并时,其工作原理的规则会变得更加复杂。我强烈建议您阅读有关其工作原理的更多信息:

You have not written valid SAS data step code. Your code is more SQL-like. The corrected code is below, but it can get even simpler than this.

data want;
    merge table_a(in=a)
          table_b(in=b)
    ;
    by queue_nm;

    if(a = b AND cust_nm NE act_cust_nm) then cust_nm = act_cust_nm;

    drop act_cust_nm;
run;

In the data step, the in= option is equivalent to a.queue_nm in SQL since we are merging by queue_nm. Think of it like an implied version of that.

We can simplify this even further by renaming act_cust_nm to cust_nm in table_b and overwriting any instance of cust_nm in table_a with the value from table_b. Since table_b is second in the merge statement, it will overwrite matching values of queue_nm in table_a.

data want;
    merge table_a
          table_b(rename=(act_cust_nm = cust_nm) )
    ;
    by queue_nm;
run;

The rules of how this works gets a little more complex when you have a one-to-many or many-to-many merge. I highly recommend reading more about how that works:

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1i8w2bwu1fn5kn1gpxj18xttbb0.htm

倾其所爱 2025-02-14 19:14:06

只是合并。确保两个变量在两个数据集中使用相同的名称。如果需要,则可以使用数据集选项将变量重命名为数据步骤。

 data want;
   merge tablea tableb(rename=(Act_Cust_nm=Cust_nm));
   by queue_nm;
 run;

如果您不想保留仅来自TableB的观察值,则使用In =数据集选项创建一个变量,该变量将指示tablea是否有助于观察。

 data want;
   merge tablea(in=ina) tableb;
   by queue_nm;
   if ina;
 run;

如果两个数据集中每个Queue_NM都有多个观察值,则使用合并将无效。

Just merge. Make sure that both variables use the same name in both datasets. If you need to you can use dataset option to rename variables before they make it into the data step.

 data want;
   merge tablea tableb(rename=(Act_Cust_nm=Cust_nm));
   by queue_nm;
 run;

If you do not want to keep observations that only came from TABLEB then use the IN= dataset option to create a variable that will indicate if TABLEA is contributing to the observation.

 data want;
   merge tablea(in=ina) tableb;
   by queue_nm;
   if ina;
 run;

If you have multiple observations per QUEUE_NM in either dataset then using MERGE will not work.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文