Getting "IndexError: index out of range in self" while training a BERT variant
While training XLMRobertaForSequenceClassification:

xlm_r_model(input_ids=X_train_batch_input_ids,
            attention_mask=X_train_batch_attention_mask,
            return_dict=False)

I faced the following error:
Traceback (most recent call last):
File "<string>", line 3, in <module>
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/models/roberta/modeling_roberta.py", line 1218, in forward
return_dict=return_dict,
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/models/roberta/modeling_roberta.py", line 849, in forward
past_key_values_length=past_key_values_length,
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/transformers/models/roberta/modeling_roberta.py", line 132, in forward
inputs_embeds = self.word_embeddings(input_ids)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/sparse.py", line 160, in forward
self.norm_type, self.scale_grad_by_freq, self.sparse)
File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2044, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
Below are the details:
Creating model
config = XLMRobertaConfig()
config.output_hidden_states = False
xlm_r_model = XLMRobertaForSequenceClassification(config=config)
xlm_r_model.to(device)  # device is device(type='cpu')
Tokenizer
xlmr_tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-large')
MAX_TWEET_LEN = 402

>>> df_1000.info()  # describing a data frame I have pre-populated
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 29639 to 44633
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   text    1000 non-null   object
 1   class   1000 non-null   int64
dtypes: int64(1), object(1)
memory usage: 55.7+ KB

X_train = xlmr_tokenizer(list(df_1000[:800].text), padding=True, max_length=MAX_TWEET_LEN+5, truncation=True)  # +5: head room for special tokens / separators

>>> list(map(len, X_train['input_ids']))  # why is it 105? shouldn't it be MAX_TWEET_LEN+5 = 407?
[105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, 105, ...]

>>> type(train_index)  # describing (for clarity) the training fold indices I pre-populated
<class 'numpy.ndarray'>
>>> train_index.size
640

X_train_fold_input_ids = np.array(X_train['input_ids'])[train_index]
X_train_fold_attention_mask = np.array(X_train['attention_mask'])[train_index]

>>> i  # batch id
0
>>> batch_size
16

X_train_batch_input_ids = X_train_fold_input_ids[i:i+batch_size]
X_train_batch_input_ids = torch.tensor(X_train_batch_input_ids, dtype=torch.long).to(device)
X_train_batch_attention_mask = X_train_fold_attention_mask[i:i+batch_size]
X_train_batch_attention_mask = torch.tensor(X_train_batch_attention_mask, dtype=torch.long).to(device)

>>> X_train_batch_input_ids.size()
torch.Size([16, 105])  # why 105? Shouldn't this be MAX_TWEET_LEN+5 = 407?
>>> X_train_batch_attention_mask.size()
torch.Size([16, 105])  # why 105? Shouldn't this be MAX_TWEET_LEN+5 = 407?
After this I make the call xlm_r_model(...) as stated at the beginning of this question, and end up with the error shown above.
Even with all these details, I am still not able to work out why I am getting this error. Where am I going wrong?
Comments (2)
As per this post on GitHub, there can possibly be many reasons for this error. Below is the list of reasons summarised from that post (as of April 24, 2022; note that the 2nd and 3rd reasons are not tested):
In my case it was the first reason, a mismatching vocab size. Here is how I fixed it:
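The code from the original answer is not reproduced here, so the following is only a minimal sketch of one way to remove the vocab-size mismatch, assuming the same 'xlm-roberta-large' checkpoint as in the question; the variable names mirror the question's code and the final assert is just an illustrative sanity check:

from transformers import (XLMRobertaConfig, XLMRobertaTokenizer,
                          XLMRobertaForSequenceClassification)

xlmr_tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-large')

# Build the config from the same checkpoint instead of using the bare
# XLMRobertaConfig() defaults, so config.vocab_size matches the roughly
# 250k-entry XLM-R vocabulary that the tokenizer produces ids for.
config = XLMRobertaConfig.from_pretrained('xlm-roberta-large')
config.output_hidden_states = False

xlm_r_model = XLMRobertaForSequenceClassification.from_pretrained(
    'xlm-roberta-large', config=config)

# Every token id emitted by the tokenizer must be a valid row of the
# model's embedding matrix; otherwise torch.embedding raises
# "IndexError: index out of range in self".
assert xlmr_tokenizer.vocab_size <= config.vocab_size

If a randomly initialised model is really what is wanted, constructing it with XLMRobertaForSequenceClassification(config=config) after loading the config from the checkpoint also removes the size mismatch.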
I had the same issue and I solved it by replacing the local model path with the Hugging Face model name (from "/path/to/local/model" to "bert-base-chinese").
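For illustration only, a sketch of what that change might look like; the Auto* classes are an assumption, since this answer does not say which model class was used, and the paths are just the examples given above:

from transformers import AutoTokenizer, AutoModel

# Before: local files whose tokenizer, config, and weights may not match
# each other, so token ids can exceed the embedding size.
# model = AutoModel.from_pretrained("/path/to/local/model")

# After: resolve everything by Hub name so the tokenizer vocabulary and
# the model's embedding size stay consistent.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")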