quantaRoche committed on
Commit
3bba2f3
verified
1 Parent(s): ad97b64

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -45,6 +45,7 @@ and then finetuned the entire model on [nasa smd qa training split](https://hugg
 
 
 ## Finetune (1) Hyperparameters (NQ Dataset) zero-shot:
+```
 train_batch_size = 16
 val_batch_size = 8
 n_epochs = 2
@@ -56,8 +57,10 @@ last_layer_learning_rate=5e-6
 qa_head_learning_rate=3e-5
 release_model= "roberta-finetuned-nq"
 gradient_checkpointing=True
+```
 
 ## Finetune (2) Hyperparameters (Nasa train):
+```
 train_batch_size = 16
 val_batch_size = 8
 n_epochs = 5
@@ -68,7 +71,8 @@ optimizer=adamW
 layer_learning_rate=1e-6
 qa_head_learning_rate=1e-5
 release_model= "roberta-finetuned-nq-nasa"
-gradient_checkpointing=True
+gradient_checkpointing=True
+```
 
 ## Motive of 2 finetunes
 Finetune 1 was already strong at answering; however, the model tried to answer more often than it abstained, so finetune 2 was needed to teach the model when to abstain.
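The hyperparameters above assign different learning rates to the backbone layers (`layer_learning_rate=1e-6`) and the QA head (`qa_head_learning_rate=1e-5`). A minimal sketch of how such a split could be wired up with AdamW parameter groups — the `TinyQAModel` stand-in and the group setup are illustrative assumptions, not the repository's actual training code:

```python
# Sketch: per-component learning rates via AdamW parameter groups.
# The model here is a toy stand-in mirroring the backbone / QA-head split
# described in the README; the real model is a RoBERTa QA architecture.
import torch
from torch.optim import AdamW


class TinyQAModel(torch.nn.Module):
    """Stand-in with a backbone and a QA head (hypothetical names)."""

    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Linear(8, 8)  # plays the role of the RoBERTa layers
        self.qa_head = torch.nn.Linear(8, 2)   # plays the role of the QA head


model = TinyQAModel()

# One parameter group per component, each with its own learning rate,
# matching layer_learning_rate and qa_head_learning_rate from finetune (2).
optimizer = AdamW(
    [
        {"params": model.backbone.parameters(), "lr": 1e-6},
        {"params": model.qa_head.parameters(), "lr": 1e-5},
    ]
)
```

Each entry in `optimizer.param_groups` then carries its own `lr`, so the backbone is updated far more gently than the freshly trained head.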