Python correlation index

Posted 2025-01-16 17:20:18 | Word count: 2631 | Views: 1 | Comments: 0

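Given a data frame "df", I need to obtain the correlation index between mean price and total volume for Region = "California".

Given dataframe: [image not included in this copy]

Correlation index between California mean price and total volume: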

cali_mean = df.groupby('Region').get_group('California')['AveragePrice'].mean()
max_volume = (df.groupby('Region')['TotalVolume'].sum()).max() #Output: 1028981653.17

# Correlation index between California mean price and total volume
df[cali_mean].corr(df['max_volume'])

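When I tried to determine the correlation index between California's mean price and total volume, I got the following error message. Is there a way to fix this?

Error message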

KeyError                                  Traceback (most recent call last)
~/opt/miniconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3620             try:
-> 3621                 return self._engine.get_loc(casted_key)
   3622             except KeyError as err:

~/opt/miniconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

~/opt/miniconda3/lib/python3.9/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 1.3939644970414187

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/var/folders/wv/42dn23fd1cb0czpvqdnb6zw00000gn/T/ipykernel_18660/3247367876.py in <module>
      1 # Correlation index between California mean price and total volume
----> 2 df[cali_mean].corr(df['max_volume'])

~/opt/miniconda3/lib/python3.9/site-packages/pandas/core/frame.py in __getitem__(self, key)
   3503             if self.columns.nlevels > 1:
   3504                 return self._getitem_multilevel(key)
-> 3505             indexer = self.columns.get_loc(key)
   3506             if is_integer(indexer):
   3507                 indexer = [indexer]

~/opt/miniconda3/lib/python3.9/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3621                 return self._engine.get_loc(casted_key)
   3622             except KeyError as err:
-> 3623                 raise KeyError(key) from err
   3624             except TypeError:
   3625                 # If we have a listlike key, _check_indexing_error will raise

KeyError: 1.3939644970414187



Comments (1)

oО清风挽发oО 2025-01-23 17:20:18

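Note that correlation is measured between two vectors, not between two single numbers. In the question, cali_mean and max_volume are scalars, so df[cali_mean] looks up a column named 1.3939644970414187 and raises the KeyError. Filter the rows to California first, then correlate the two columns:

import pandas as pd

df = pd.read_csv('avocado.csv')
# Keep only the California rows
temp = df[df['Region'] == 'California']
# Pearson correlation (the default) between the two columns
temp['AveragePrice'].corr(temp['TotalVolume'])

Output: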

-0.7913852550045145
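
The same idea can be tried without avocado.csv. The sketch below is only an illustration under stated assumptions: the small data frame is made up (it reuses the column names Region, AveragePrice and TotalVolume from the question, but the values are invented), and the names scalar_lookup_failed, cali_corr and corr_by_region are hypothetical. It shows why indexing with a scalar fails, how to correlate the two columns for one region, and one possible way to get the correlation for every region.

```python
import pandas as pd

# Hypothetical stand-in for avocado.csv; the values are invented for illustration.
df = pd.DataFrame({
    "Region": ["California"] * 4 + ["Texas"] * 4,
    "AveragePrice": [1.0, 1.2, 1.4, 1.6, 0.9, 1.1, 1.0, 1.3],
    "TotalVolume": [400.0, 350.0, 300.0, 250.0, 120.0, 100.0, 130.0, 90.0],
})

# A mean is a single number, not a column, so using it as a column key fails.
cali_mean = df.loc[df["Region"] == "California", "AveragePrice"].mean()
try:
    df[cali_mean]
    scalar_lookup_failed = False
except KeyError:
    scalar_lookup_failed = True

# Correlate the two columns over the California rows only.
cali = df.loc[df["Region"] == "California"]
cali_corr = cali["AveragePrice"].corr(cali["TotalVolume"])

# The same computation for every region, collected into one Series.
corr_by_region = pd.Series({
    region: group["AveragePrice"].corr(group["TotalVolume"])
    for region, group in df.groupby("Region")
})

print(scalar_lookup_failed)
print(cali_corr)
print(corr_by_region)
```

In this toy data the California prices and volumes lie exactly on a falling straight line, so cali_corr comes out at approximately -1.0; the real dataset gives the value shown above. Building corr_by_region with a plain loop over groupby is a design choice that keeps each step visible; other approaches are possible.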

