My code crashes with "ValueError: Index contains duplicate entries, cannot reshape" when using the Yahoo data reader
The code used to work just fine, but now it raises this error after these lines:
end = dt.datetime.now()
start = dt.date(end.year - 3, end.month, end.day)
prices = reader.get_data_yahoo(tickers,start,end)['Adj Close']
I tried upgrading packages and everything, but it didn't help. The code no longer works even for data that I had previously downloaded and analyzed with it.
ValueError Traceback (most recent call last)
Input In [6], in <cell line: 3>()
1 end = dt.datetime.now()
2 start = dt.date(end.year - 3, end.month, end.day)
----> 3 prices = reader.get_data_yahoo(tickers,start,end)['Adj Close']
File C:\Python310\lib\site-packages\pandas_datareader\data.py:80, in get_data_yahoo(*args, **kwargs)
79 def get_data_yahoo(*args, **kwargs):
---> 80 return YahooDailyReader(*args, **kwargs).read()
File C:\Python310\lib\site-packages\pandas_datareader\base.py:256, in _DailyBaseReader.read(self)
254 # Or multiple symbols, (e.g., ['GOOG', 'AAPL', 'MSFT'])
255 elif isinstance(self.symbols, DataFrame):
--> 256 df = self._dl_mult_symbols(self.symbols.index)
257 else:
258 df = self._dl_mult_symbols(self.symbols)
File C:\Python310\lib\site-packages\pandas_datareader\base.py:285, in _DailyBaseReader._dl_mult_symbols(self, symbols)
283 stocks[sym] = df_na
284 if PANDAS_0230:
--> 285 result = concat(stocks, sort=True).unstack(level=0)
286 else:
287 result = concat(stocks).unstack(level=0)
File C:\Python310\lib\site-packages\pandas\core\frame.py:8413, in DataFrame.unstack(self, level, fill_value)
8351 """
8352 Pivot a level of the (necessarily hierarchical) index labels.
8353
(...)
8409 dtype: float64
8410 """
8411 from pandas.core.reshape.reshape import unstack
-> 8413 result = unstack(self, level, fill_value)
8415 return result.__finalize__(self, method="unstack")
File C:\Python310\lib\site-packages\pandas\core\reshape\reshape.py:478, in unstack(obj, level, fill_value)
476 if isinstance(obj, DataFrame):
477 if isinstance(obj.index, MultiIndex):
--> 478 return _unstack_frame(obj, level, fill_value=fill_value)
479 else:
480 return obj.T.stack(dropna=False)
File C:\Python310\lib\site-packages\pandas\core\reshape\reshape.py:501, in _unstack_frame(obj, level, fill_value)
499 def _unstack_frame(obj, level, fill_value=None):
500 if not obj._can_fast_transpose:
--> 501 unstacker = _Unstacker(obj.index, level=level)
502 mgr = obj._mgr.unstack(unstacker, fill_value=fill_value)
503 return obj._constructor(mgr)
File C:\Python310\lib\site-packages\pandas\core\reshape\reshape.py:140, in _Unstacker.__init__(self, index, level, constructor)
133 if num_cells > np.iinfo(np.int32).max:
134 warnings.warn(
135 f"The following operation may generate {num_cells} cells "
136 f"in the resulting pandas object.",
137 PerformanceWarning,
138 )
--> 140 self._make_selectors()
File C:\Python310\lib\site-packages\pandas\core\reshape\reshape.py:192, in _Unstacker._make_selectors(self)
189 mask.put(selector, True)
191 if mask.sum() < len(self.index):
--> 192 raise ValueError("Index contains duplicate entries, cannot reshape")
194 self.group_index = comp_index
195 self.mask = mask
ValueError: Index contains duplicate entries, cannot reshape
Comments (2)
I know it can be frustrating, but for the moment you have to read each ticker individually. The API has probably been broken since the latest versions of pandas:
Output:
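Reading each ticker individually could look roughly like the sketch below. The helper `read_each_ticker` and the offline stand-in `fake_fetch` are hypothetical names used only for illustration; with the question's setup, the callable would presumably be something like `lambda t: reader.get_data_yahoo(t, start, end)`, which is an assumption rather than a tested call.

```python
import pandas as pd


def read_each_ticker(tickers, fetch_one):
    """Call fetch_one(ticker) once per symbol and return {ticker: DataFrame}."""
    frames = {}
    for ticker in tickers:
        # One request per ticker, so no multi-symbol reshaping is involved here.
        frames[ticker] = fetch_one(ticker)
    return frames


def fake_fetch(ticker):
    """Hypothetical offline stand-in for a single-ticker download."""
    dates = pd.date_range("2022-07-04", periods=3, freq="D")
    return pd.DataFrame({"Adj Close": [1.0, 2.0, 3.0]}, index=dates)


frames = read_each_ticker(["AAA", "BBB"], fake_fetch)
```

The result is a plain dictionary of per-ticker frames, which leaves the choice of how to combine them open.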
The error occurs when the query is made on a Saturday or Sunday, because Yahoo Finance returns Friday's data twice.
You can check this by looking at the historical data on Yahoo Finance itself.
For a single stock, it can be solved with:
data = data[~data.index.duplicated(keep='last')]
When downloading data for a list of stocks, however, the proposed solution is to iterate over the list and then concatenate the resulting series to build the DataFrame.
You can then use the code above to remove the duplicate index entries.
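A minimal sketch of the concatenate-then-deduplicate idea, using synthetic data instead of a real download. The tickers `AAA` and `BBB`, the prices, and the helper `make_frame` are made up for illustration; the repeated 2022-07-08 row imitates the doubled Friday described above.

```python
import pandas as pd


def make_frame(values):
    """Build a synthetic per-ticker frame whose last date appears twice."""
    idx = pd.DatetimeIndex(["2022-07-07", "2022-07-08", "2022-07-08"])
    return pd.DataFrame({"Adj Close": values}, index=idx)


# Stand-ins for frames that were downloaded one ticker at a time.
frames = {
    "AAA": make_frame([10.0, 11.0, 11.5]),
    "BBB": make_frame([20.0, 21.0, 21.5]),
}

# Concatenating a dict stacks the frames under a (ticker, date) MultiIndex.
stacked = pd.concat(frames)

# Drop repeated (ticker, date) pairs, keeping the last row of each.
stacked = stacked[~stacked.index.duplicated(keep="last")]

# With a unique index, unstacking the ticker level no longer raises.
prices = stacked["Adj Close"].unstack(level=0)
```

Here `prices` has one row per date and one column per ticker, which is the shape the original `['Adj Close']` selection was meant to produce.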