XiaoEnn/herberta_seq_512_V2

@@ -1,22 +1,24 @@
 ---
 license: apache-2.0
 ---
-# Herberta: A Pretrained Model for TCM Herbal Medicine and Downstream Tasks
-**Tags**:
-- Pretrain_Model
-- transformers
-- TCM
-- herberta
-- text embedding
-**License**: Apache-2.0
-**Inference**: true
-**Language**: zh, en
-**Base Model**: hfl/chinese-roberta-wwm-ext
-**Library Name**: transformers
----
 ## Introduction
@@ -55,18 +57,6 @@ We named the model "Herberta" by combining "Herb" and "Roberta" to signify its p
 ![Loss](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/BJ7enbRg13IYAZuxwraPP.png)
 ![Perplexity](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/lOohRMIctPJZKM5yEEcQ2.png)
-<!-- <table>
-  <tr>
-    <td align="center"><strong>Accuracy</strong></td>
-    <td align="center"><strong>Loss</strong></td>
-    <td align="center"><strong>Perplexity</strong></td>
-  </tr>
-  <tr>
-    <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/RDgI-0Ro2kMiwV853Wkgx.png" alt="Accuracy" width="800"></td>
-    <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/BJ7enbRg13IYAZuxwraPP.png" alt="Loss" width="800"></td>
-    <td><img src="https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/lOohRMIctPJZKM5yEEcQ2.png" alt="Perplexity" width="800"></td>
-  </tr>
-</table> -->
 ### Pretraining Configuration
@@ -77,21 +67,6 @@ We named the model "Herberta" by combining "Herb" and "Roberta" to signify its p
 - Learning Rate: `1e-5` with an epoch-based decay (`epoch * 0.1`)
 - Tokenization: Sentence-based tokenization with padding for sequences <512 tokens.
-#### Modern Textbooks
-- Pretraining Strategy: Dynamic MASK + Warmup + Linear Decay
-- Sequence Length: 512
-- Batch Size: 16
-- Learning Rate: Warmup (10% steps) + Linear Decay (1e-5 initial rate)
-- Tokenization: Continuous tokenization (512 tokens) without sentence segmentation.
-#### V4 Mixed Dataset (Ancient + Modern)
-- Dataset: Combined 48 modern textbooks + 700 ancient books
-- Pretraining Strategy: Dynamic MASK, warmup, and linear decay (1e-5 learning rate).
-- Epochs: 20
-- Sequence Length: 512
-- Batch Size: 16
-- Tokenization: Continuous tokenization.
 ---
 ## Downstream Task: TCM Pattern Classification

 ---
+tags:
+- PretrainModel
+- TCM
+- transformer
+- herberta
+- text-embedding
 license: apache-2.0
+language:
+- zh
+- en
+metrics:
+- accuracy
+base_model:
+- hfl/chinese-roberta-wwm-ext-large
+new_version: XiaoEnn/herberta_seq_512_V2
+inference: true
+library_name: transformers
 ---
+# Herberta: A Pretrained Model for TCM Herbal Medicine and Downstream Tasks
 ## Introduction
 ![Loss](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/BJ7enbRg13IYAZuxwraPP.png)
 ![Perplexity](https://cdn-uploads.huggingface.co/production/uploads/6564baaa393bae9c194fc32e/lOohRMIctPJZKM5yEEcQ2.png)
 ### Pretraining Configuration
 - Learning Rate: `1e-5` with an epoch-based decay (`epoch * 0.1`)
 - Tokenization: Sentence-based tokenization with padding for sequences <512 tokens.
 ---
 ## Downstream Task: TCM Pattern Classification