返回介绍

solution / 3000-3099 / 3087.Find Trending Hashtags / README_EN

发布于 2024-06-17 01:02:57 字数 4223 浏览 0 评论 0 收藏 0

3087. Find Trending Hashtags

中文文档

Description

Table: Tweets

+-------------+---------+
| Column Name | Type  |
+-------------+---------+
| user_id   | int   |
| tweet_id  | int   |
| tweet_date  | date  |
| tweet     | varchar |
+-------------+---------+
tweet_id is the primary key (column with unique values) for this table.
Each row of this table contains user_id, tweet_id, tweet_date and tweet.

Write a solution to find the top 3 trending hashtags in February 2024.

Return _the result table orderd by count of hashtag, hastag in _descending_ order._

The result format is in the following example.

 

Example 1:

Input:

Tweets table:

+---------+----------+----------------------------------------------+------------+
| user_id | tweet_id | tweet                    | tweet_date |
+---------+----------+----------------------------------------------+------------+
| 135   | 13     | Enjoying a great start to the day! #HappyDay | 2024-02-01 |
| 136   | 14     | Another #HappyDay with good vibes!       | 2024-02-03 |
| 137   | 15     | Productivity peaks! #WorkLife        | 2024-02-04 |
| 138   | 16     | Exploring new tech frontiers. #TechLife    | 2024-02-04 |
| 139   | 17     | Gratitude for today's moments. #HappyDay   | 2024-02-05 |
| 140   | 18     | Innovation drives us. #TechLife        | 2024-02-07 |
| 141   | 19     | Connecting with nature's serenity. #Nature   | 2024-02-09 |
+---------+----------+----------------------------------------------+------------+
 

Output:

+-----------+--------------+
| hashtag   | hashtag_count|
+-----------+--------------+
| #HappyDay | 3      |
| #TechLife | 2      |
| #WorkLife | 1      |
+-----------+--------------+

Explanation:

  • #HappyDay: Appeared in tweet IDs 13, 14, and 17, with a total count of 3 mentions.
  • #TechLife: Appeared in tweet IDs 16 and 18, with a total count of 2 mentions.
  • #WorkLife: Appeared in tweet ID 15, with a total count of 1 mention.

Note: Output table is sorted in descending order by hashtag_count and hashtag respectively.

Solutions

Solution 1: Extract Substring + Grouping

We can query all tweets from February 2024, use the SUBSTRING_INDEX function to extract Hashtags, then use the GROUP BY and COUNT functions to count the occurrences of each Hashtag. Finally, we sort by the number of occurrences in descending order and by Hashtag in descending order, and take the top three popular Hashtags.

# Write your MySQL query statement below
SELECT
  CONCAT('#', SUBSTRING_INDEX(SUBSTRING_INDEX(tweet, '#', -1), ' ', 1)) AS hashtag,
  COUNT(1) AS hashtag_count
FROM Tweets
WHERE DATE_FORMAT(tweet_date, '%Y%m') = '202402'
GROUP BY 1
ORDER BY 2 DESC, 1 DESC
LIMIT 3;
import pandas as pd


def find_trending_hashtags(tweets: pd.DataFrame) -> pd.DataFrame:
  tweets = tweets[tweets["tweet_date"].dt.strftime("%Y%m") == "202402"]
  tweets["hashtag"] = "#" + tweets["tweet"].str.extract(r"#(\w+)")
  hashtag_counts = tweets["hashtag"].value_counts().reset_index()
  hashtag_counts.columns = ["hashtag", "hashtag_count"]
  hashtag_counts = hashtag_counts.sort_values(
    by=["hashtag_count", "hashtag"], ascending=[False, False]
  )
  top_3_hashtags = hashtag_counts.head(3)
  return top_3_hashtags

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。
列表为空,暂无数据
    我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
    原文