How to solve the problem of selecting multiple rows

Posted on 2024-11-26 18:03:21

I have data in the following format. This is just an example, for n=2:

X      Y      info

2      1      good
2      4      bad

3      2      good

4      1      bad
4      4      good

6      2      good
6      3      good

The data above is sorted (7 rows in total). I need to form groups of 2, 3, or 4 rows, each size separately, and generate a graph. In the data above, I formed groups of 2 rows. The row with X=3 is left alone because no other row shares its X value, so it cannot form a group. A group can be formed only among rows that share the same X value, not across different X values.

Next, I check whether both rows of a group have "good" in the info column. If both do, the group is good; otherwise it is bad. In the example above, only the third (last) group is good; the rest are bad. Once all rows are processed, I calculate the number of good groups divided by the total number of groups.

In the example above, the output is: number of good groups / total number of groups => 1/3.

This is the case n=2, where n is the group size.
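
As a rough cross-check of the 1/3 figure, here is a minimal SAS sketch that loads the n=2 example and counts the groups. The names data_2, groups_2, rows_in_group, and good_rows are assumptions made for illustration, and the sketch assumes a group is an X value that has exactly 2 rows.

```sas
/* Assumed dataset name: data_2 holds the n=2 example rows. */
data data_2;
   input x y info $;
   datalines;
2 1 good
2 4 bad
3 2 good
4 1 bad
4 4 good
6 2 good
6 3 good
;
run;

proc sql;
   /* One row per X value that has exactly 2 rows (the group size). */
   /* A comparison evaluates to 1 or 0, so SUM counts "good" rows.  */
   create table groups_2 as
   select x,
          count(*)                    as rows_in_group,
          sum(upcase(info) = 'GOOD')  as good_rows
   from data_2
   group by x
   having count(*) = 2;

   /* A group is good only when every row in it is "good". */
   select sum(good_rows = rows_in_group) as good_groups,
          count(*)                       as total_groups
   from groups_2;
quit;
```

With the rows above, X=3 is dropped by the HAVING clause, and the final query should report 1 good group out of 3, which matches the count worked out by hand.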

For n=3 we form groups of 3 rows, and for n=4 groups of 4 rows, and find the good and bad groups in the same way. If all the rows in a group are "good", the group is good; otherwise it is bad.

Example: n=3

2      1      good
2      4      bad
2      6      good

3      2      good

4      1      good
4      4      good
4      6      good

6      2      good
6      3      good

In this case, I left out the 4th row and the last 2 rows because I can't form a group of 3 rows from them. The first group is "bad" and the last group is "good".
Output: 1/2

For n=4:

2      1      good
2      4      good
2      6      good
2      7      good

3      2      good

4      1      good
4      4      good
4      6      good

6      2      good
6      3      good
6      4      good
6      5      good

In this case, I form groups of 4 rows and find the result. The 5th, 6th, 7th, and 8th rows are left out because they cannot form a group of 4. I made 2 groups of 4 rows, and both are "good".
Output: 2/2

So, after getting the 3 output values for n=2, n=3, and n=4, I will plot a graph of these values.


Comments (1)

怪异←思 2024-12-03 18:03:21

Below is code that I think does what you are looking for. It assumes that the data you described is stored in three separate datasets named data_2, data_3, and data_4. Each dataset is processed by the %FIND_GOOD_GROUPS macro, which determines which X groups have all "GOOD" values in INFO; this summary information is then appended as a new row to the BASE dataset. I didn't add the code, but you could calculate the ratio of GOOD_COUNT to _FREQ_ in a separate data step, then use a procedure to plot the ratio against the N value. Hope this gets close to what you're trying to accomplish.


%******************************************************************************;
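%macro main;

   %* Assumption: clear any BASE left over from an earlier run so that      *;
   %* PROC APPEND does not stack duplicate rows.                            *;
   proc datasets library=work nolist nowarn;
      delete base;
   quit;

   %find_good_groups(dsn=data_2, n=2);
   %find_good_groups(dsn=data_3, n=3);
   %find_good_groups(dsn=data_4, n=4);

   proc print data=base uniform noobs;
   run;

%mend main;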
%******************************************************************************;
%******************************************************************************;
%macro find_good_groups(dsn=,n=);

   %***************************************************************************;
   %* Sort data by X and Y so that you can use FIRST.X variable in Data step. *;
   %***************************************************************************;
   proc sort data=&dsn;
      by x y;
   run;

   %***************************************************************************;
   %* TEMP dataset uses the FIRST.X variable to reset COUNT and GOOD_COUNT to *;
   %* initial values for each row where X changes. Each row in the X groups   *;
   %* adds 1 to COUNT and sets GOOD_COUNT to 0 (zero) if INFO is ever "BAD".  *;
   %* A record is output if COUNT is equal to the macro parameter &N.         *;
   %***************************************************************************;
   data temp;
      keep good_count n;
      retain count 0 good_count 1 n &n;
      set &dsn;
      by x y;
      if first.x then do;
         count = 0;
         good_count = 1;
      end;
      count = count + 1;
      if good_count eq 1 then do;
         if trim(left(upcase(info))) eq "BAD" then do;
            good_count = 0;
         end;
      end;
      if count eq &n then output;
   run;

   %***************************************************************************;
   %* Summarize the TEMP data to find the number of times that all of the     *;
   %* rows had "GOOD" in the INFO column for each value of X.                 *;
   %***************************************************************************;
   proc summary data=temp;
      id n;
      var good_count;
      output out=n_&n (drop=_type_) sum=;
   run;

   %***************************************************************************;
   %* Append to BASE dataset to retain the sums and frequencies from all of   *;
   %* the datasets. BASE can be used to plot the N / number of Good records.  *;
   %***************************************************************************;
   proc append data=n_&n base=base force; run;

%mend find_good_groups;
%******************************************************************************;
%main
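
The ratio and plot steps that the answer leaves out could look roughly like the following. This is only a sketch: it assumes BASE has the columns N, _FREQ_, and GOOD_COUNT produced by the macro above, and the names ratio (both the dataset and the variable) are made up for illustration.

```sas
/* Assumed step: one row per N in BASE, with _FREQ_ = total groups */
/* and GOOD_COUNT = number of groups where every row was "GOOD".   */
data ratio;
   set base;
   ratio = good_count / _freq_;
run;

/* Plot the share of good groups against the group size N. */
proc sgplot data=ratio;
   series x=n y=ratio / markers;
   xaxis values=(2 3 4);
   yaxis min=0 max=1;
run;
```

For the three examples in the question (1/3, 1/2, and 2/2), the plotted ratios would be about 0.33, 0.5, and 1.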