quantaRoche committed on
Commit
3bba2f3
verified
1 Parent(s): ad97b64

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -45,6 +45,7 @@ and then finetuned the entire model on [nasa smd qa training split](https://hugg
 
 
 ## Finetune (1) Hyperparameters (NQ Dataset) zero-shot:
+```
 train_batch_size = 16
 val_batch_size = 8
 n_epochs = 2
@@ -56,8 +57,10 @@ last_layer_learning_rate=5e-6
 qa_head_learning_rate=3e-5
 release_model= "roberta-finetuned-nq"
 gradient_checkpointing=True
+```
 
 ## Finetune (2) Hyperparameters (Nasa train):
+```
 train_batch_size = 16
 val_batch_size = 8
 n_epochs = 5
@@ -68,7 +71,8 @@ optimizer=adamW
 layer_learning_rate=1e-6
 qa_head_learning_rate=1e-5
 release_model= "roberta-finetuned-nq-nasa"
-gradient_checkpointing=True
+gradient_checkpointing=True
+```
 
 ## Motive of 2 finetunes
 Finetune 1 was already strong at answering; however, the model tried to answer more often than it abstained, so finetune 2 was needed to teach the model when to abstain.
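The hyperparameters above assign different learning rates to the backbone layers (`layer_learning_rate=1e-6`) and the QA head (`qa_head_learning_rate=1e-5`). A minimal sketch of how such a split could be wired up with AdamW parameter groups — the `TinyQAModel` stand-in and the group setup are illustrative assumptions, not the repository's actual training code:

```python
# Sketch: per-component learning rates via AdamW parameter groups.
# The model here is a toy stand-in mirroring the backbone / QA-head split
# described in the README; the real model is a RoBERTa QA architecture.
import torch
from torch.optim import AdamW


class TinyQAModel(torch.nn.Module):
    """Stand-in with a backbone and a QA head (hypothetical names)."""

    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Linear(8, 8)  # plays the role of the RoBERTa layers
        self.qa_head = torch.nn.Linear(8, 2)   # plays the role of the QA head


model = TinyQAModel()

# One parameter group per component, each with its own learning rate,
# matching layer_learning_rate and qa_head_learning_rate from finetune (2).
optimizer = AdamW(
    [
        {"params": model.backbone.parameters(), "lr": 1e-6},
        {"params": model.qa_head.parameters(), "lr": 1e-5},
    ]
)
```

Each entry in `optimizer.param_groups` then carries its own `lr`, so the backbone is updated far more gently than the freshly trained head.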