I am training the Hugging Face transformer `albert-base-v2` to predict three labels per token (`pos_tags`, `chunk_tags`, and `ner_tags`), so I have put three fully connected layers on top of the encoder stack. I have written a collate function to pad and batch the data. Before the data reaches the collate function, I convert it to torch tensors. However, once I build the list of `input_ids` and pass it on for padding, I get this error: `TypeError: only integer tensors of a single element can be converted to an index`. This is the code:
```python
import torch
import torch.nn as nn

# ... 'chunk_tags', 'ner_tags', 'input_ids'])

def get_collate_fn(pad_index):
    def collate_fn(batch):
        # Convert each example's input_ids to a tensor, then pad to the longest one
        batch_ids = [torch.tensor(i["input_ids"], dtype=torch.long) for i in batch]
        batch_ids = nn.utils.rnn.pad_sequence(batch_ids, batch_first=True,
                                              padding_value=pad_index)
        # Collect the three label sequences and stack them into batch tensors
        batch_label_1 = [i["pos_tags"] for i in batch]
        batch_label_1 = torch.stack(batch_label_1)
        batch_label_2 = [i["chunk_tags"] for i in batch]
        batch_label_2 = torch.stack(batch_label_2)
        batch_label_3 = [i["ner_tags"] for i in batch]
        batch_label_3 = torch.stack(batch_label_3)
        # TypeError: only integer tensors of a single element can be converted to an index
        return {"input_ids": batch_ids, "pos_tags": batch_label_1,
                "chunk_tags": batch_label_2, "ner_tags": batch_label_3}
    return collate_fn

def get_dataloader(dataset, batch_size, pad_index, shuffle=False):
    collate_fn = get_collate_fn(pad_index)
    data_loader = torch.utils.data.DataLoader(
        dataset=dataset,
        batch_size=batch_size,
        collate_fn=collate_fn,
        shuffle=shuffle,
    )
    return data_loader
```
I tried converting `batch_ids` to a torch tensor with `batch_ids = torch.tensor(batch_ids)`, but that did not work. Earlier, when I was not converting each `input_ids` entry to a tensor and `batch_ids` was only a plain list, it also failed, this time with `TypeError: expected Tensor as element 0 in argument 0, but got list`.
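
To make the intended behaviour concrete, here is a minimal sketch of a collate function that pads all four fields to the longest sequence in the batch. It is not a confirmed fix for the errors above. It assumes each example is a dict of variable-length integer sequences (lists or 1-D tensors) with one label per input id. The names `get_padding_collate_fn`, `LABEL_KEYS`, `label_pad_index`, and `toy_batch` are hypothetical, and the label padding value of -100 is only an assumption, chosen because it matches the default `ignore_index` of `nn.CrossEntropyLoss`.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Hypothetical constant: the three label fields named in the question.
LABEL_KEYS = ("pos_tags", "chunk_tags", "ner_tags")

def get_padding_collate_fn(pad_index, label_pad_index=-100):
    """Build a collate_fn that pads inputs and labels to a common length."""
    def collate_fn(batch):
        def pad(key, value):
            # as_tensor accepts both Python lists and existing tensors
            seqs = [torch.as_tensor(example[key], dtype=torch.long)
                    for example in batch]
            return pad_sequence(seqs, batch_first=True, padding_value=value)

        collated = {"input_ids": pad("input_ids", pad_index)}
        for key in LABEL_KEYS:
            # Assumption: labels are padded with a value the loss can ignore
            collated[key] = pad(key, label_pad_index)
        return collated
    return collate_fn

# Toy batch with two examples of different lengths (made-up ids and labels).
toy_batch = [
    {"input_ids": [2, 10, 11, 3], "pos_tags": [0, 1, 2, 0],
     "chunk_tags": [0, 1, 1, 0], "ner_tags": [0, 3, 4, 0]},
    {"input_ids": [2, 12, 3], "pos_tags": [0, 5, 0],
     "chunk_tags": [0, 2, 0], "ner_tags": [0, 0, 0]},
]
collated = get_padding_collate_fn(pad_index=0)(toy_batch)
```

With this toy batch, every field in `collated` is a `(2, 4)` tensor of dtype `torch.long`. Unlike `torch.stack`, `pad_sequence` does not require the label sequences to already have equal length.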