Great Expectations: how to list all unexpected values

Posted 2025-02-03 19:30:56 · 877 characters · 4 views · 0 comments

I ran the Great Expectations check expect_column_values_to_be_unique on one column and got the result below. There are 62 duplicates in total, but the output list returns only 20 elements. How can I retrieve all duplicate records in that column?
df.expect_column_values_to_be_unique('A')

{
  "exception_info": null,
  "expectation_config": {
    "expectation_type": "expect_column_values_to_be_unique",
    "kwargs": {
      "column": "A",
      "result_format": "BASIC"
    },
    "meta": {}
  },
  "meta": {},
  "success": false,
  "result": {
    "element_count": 100,
    "missing_count": 0,
    "missing_percent": 0.0,
    "unexpected_count": 62,
    "unexpected_percent": 62.0,
    "unexpected_percent_nonmissing": 62.0,
    "partial_unexpected_list": [
      37,
      62,
      72,
      53,
      22,
      61,
      95,
      21,
      64,
      59,
      77,
      53,
      0,
      22,
      24,
      46,
      0,
      16,
      78,
      60
    ]
  }
}

Comments (2)

梦幻的心爱 2025-02-10 19:30:56

You're currently passing result_format as BASIC. To get the level of detail you're looking for, you'll want to instead pass result_format for this Expectation as COMPLETE to get the full list of unexpected values. For example:

df.expect_column_values_to_be_unique(column="A", result_format="COMPLETE")

See this documentation for more on result_format.
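
As a rough illustration of the difference between the two formats, here is a minimal sketch in plain Python. It assumes the validation result has already been converted to a plain dict (for example by parsing its JSON form), and that a COMPLETE result carries an "unexpected_list" key next to the "partial_unexpected_list" seen in the question. The helper names unexpected_values and duplicated_values are hypothetical and not part of Great Expectations; the second one is only an independent cross-check on raw column values.

```python
from collections import Counter
from typing import List, Tuple


def unexpected_values(validation: dict) -> Tuple[List, bool]:
    """Return (values, is_complete) from a validation result dict.

    Assumption: a COMPLETE result has "unexpected_list"; a BASIC result
    only has the truncated "partial_unexpected_list".
    """
    body = validation.get("result", {})
    if "unexpected_list" in body:
        return list(body["unexpected_list"]), True
    partial = list(body.get("partial_unexpected_list", []))
    total = body.get("unexpected_count", len(partial))
    # The list is complete only if it holds as many items as were counted.
    return partial, len(partial) >= total


def duplicated_values(values: List) -> List:
    """Every entry whose value occurs more than once, all occurrences kept."""
    counts = Counter(values)
    return [v for v in values if counts[v] > 1]


# The BASIC result from the question, reduced to the fields used here.
basic_result = {
    "result": {
        "unexpected_count": 62,
        "partial_unexpected_list": [
            37, 62, 72, 53, 22, 61, 95, 21, 64, 59,
            77, 53, 0, 22, 24, 46, 0, 16, 78, 60,
        ],
    }
}

values, is_complete = unexpected_values(basic_result)
print(len(values), is_complete)

# Hypothetical raw column values, used only to show the cross-check.
print(duplicated_values([1, 2, 2, 3, 3, 3]))
```

With the BASIC result the helper reports 20 values and flags the list as incomplete, which matches the mismatch described in the question (62 counted, 20 listed).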

路弥 2025-02-10 19:30:56

I think you are calling show() without parameters. By default it displays only the first 20 rows. To see more, pass the number of rows you want; the call below shows 200 rows and does not truncate the column contents:

df.select( col("*") ).show(200,false)