Published on 2024-06-17 01:02:59

2882. Drop Duplicate Rows

Description

DataFrame customers
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| customer_id | int    |
| name        | object |
| email       | object |
+-------------+--------+

There are some duplicate rows in the DataFrame based on the email column.

Write a solution to remove these duplicate rows and keep only the first occurrence.

The result format is in the following example.

 

Example 1:
Input:
+-------------+---------+---------------------+
| customer_id | name    | email               |
+-------------+---------+---------------------+
| 1           | Ella    | emily@example.com   |
| 2           | David   | michael@example.com |
| 3           | Zachary | sarah@example.com   |
| 4           | Alice   | john@example.com    |
| 5           | Finn    | john@example.com    |
| 6           | Violet  | alice@example.com   |
+-------------+---------+---------------------+
Output:
+-------------+---------+---------------------+
| customer_id | name    | email               |
+-------------+---------+---------------------+
| 1           | Ella    | emily@example.com   |
| 2           | David   | michael@example.com |
| 3           | Zachary | sarah@example.com   |
| 4           | Alice   | john@example.com    |
| 6           | Violet  | alice@example.com   |
+-------------+---------+---------------------+
Explanation:
Alice (customer_id = 4) and Finn (customer_id = 5) both use john@example.com, so only the first occurrence of this email (Alice's row) is retained and Finn's row is dropped.
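
To make the explanation concrete, the following sketch rebuilds the example input and uses pandas' `DataFrame.duplicated` to flag which rows count as repeats. The variable names `customers` and `is_repeat` are chosen here for illustration and are not part of the problem statement.

```python
import pandas as pd

# Example input from the problem statement.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "name": ["Ella", "David", "Zachary", "Alice", "Finn", "Violet"],
    "email": [
        "emily@example.com",
        "michael@example.com",
        "sarah@example.com",
        "john@example.com",
        "john@example.com",
        "alice@example.com",
    ],
})

# duplicated() marks every row whose email has already appeared above it.
# With keep="first", the first occurrence itself is not marked.
is_repeat = customers.duplicated(subset=["email"], keep="first")

# Only Finn's row (customer_id = 5) is flagged as a repeat.
print(customers.loc[is_repeat, ["customer_id", "name"]])
```

Rows where `is_repeat` is `True` are exactly the ones the expected output omits.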

Solutions

Solution 1

import pandas as pd


def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    # Keep the first row for each distinct email; keep="first" is the default.
    return customers.drop_duplicates(subset=["email"], keep="first")
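
As a quick check, the solution can be run against the example input. This is a minimal sketch: the function is repeated so the snippet is self-contained, and the `customers` and `result` variables are illustrative names.

```python
import pandas as pd


def dropDuplicateEmails(customers: pd.DataFrame) -> pd.DataFrame:
    # Keep the first row for each distinct email.
    return customers.drop_duplicates(subset=["email"], keep="first")


# Example input from the problem statement.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5, 6],
    "name": ["Ella", "David", "Zachary", "Alice", "Finn", "Violet"],
    "email": [
        "emily@example.com",
        "michael@example.com",
        "sarah@example.com",
        "john@example.com",
        "john@example.com",
        "alice@example.com",
    ],
})

result = dropDuplicateEmails(customers)
print(result)

# drop_duplicates returns a new DataFrame; the input is left unchanged.
print(len(customers), len(result))
```

Note that `drop_duplicates` preserves the original index labels, so the result keeps labels 0, 1, 2, 3, 5. If a contiguous index is wanted, passing `ignore_index=True` to `drop_duplicates` is one option.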
