Update README.md
| InfoVQA_TEST | 66.9 | **71.7** | 65.0 |
| ***AVERAGE*** | *70.5* | **71.7** | 68.3 |

### Text-only Benchmark
| Benchmark | InternVL3-2B | Ovis2-2B | VARCO-VISION-2.0-1.7B |
| :-------------: | :----------: | :------: | :-------------------: |
| MMLU | **59.9** | 12.9 | *55.3* |
| MT-Bench | *62.8* | 61.4 | **72.3** |
| KMMLU | **38.0** | *31.1* | 10.4 |
| KoMT-Bench | 29.1 | *34.4* | **59.1** |
| LogicKor | 25.6 | *31.2* | **53.7** |
| ***AVERAGE*** | *43.1* | 34.2 | **50.2** |

> **Note:** Some models show unusually low performance on the MMLU benchmark. This is primarily due to their failure to correctly follow the expected output format when only few-shot exemplars are provided in the prompts. Please take this into consideration when interpreting the results.

### Korean Cultural Benchmark
| Benchmark | InternVL3-2B | Ovis2-2B | VARCO-VISION-2.0-1.7B |
| :--------------: | :----------: | :------: | :-------------------: |
| K-Viscuit | *60.0* | **64.1** | 57.7 |
| PangeaBench (ko) | **66.2** | 63.1 | *63.8* |
| ***AVERAGE*** | *63.1* | **63.6** | 60.8 |

### OCR Benchmark
| Benchmark | PaddleOCR | EasyOCR | VARCO-VISION-2.0-1.7B |
| :-----------: | :-------: | :-----: | :-------------------: |
| CORD | *91.4* | 77.8 | **96.2** |
| ICDAR2013 | *92.0* | 85.0 | **95.9** |
| ICDAR2015 | **73.7** | 57.9 | **73.7** |
| ***AVERAGE*** | *85.7* | 73.6 | **88.6** |

## Usage
To use this model, we recommend installing `transformers` version **4.53.1 or higher**. While it may work with earlier versions, using **4.53.1 or above is strongly recommended**, especially to ensure optimal performance for the **multi-image feature**.
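The minimum-version recommendation above can be enforced with a small guard before loading the model. This is only a sketch: the helper names (`parse_version`, `meets_minimum`) are illustrative, and the parser handles plain `X.Y.Z` releases only, not pre-release tags such as `4.53.0rc1`.

```python
def parse_version(v: str) -> tuple[int, ...]:
    """Turn a release string like '4.53.1' into a comparable tuple.

    Minimal by design: plain X.Y.Z releases only, no pre-release tags.
    """
    return tuple(int(part) for part in v.split("."))


def meets_minimum(installed: str, required: str = "4.53.1") -> bool:
    """True if the installed version is at or above the recommended minimum."""
    return parse_version(installed) >= parse_version(required)


# Example guard before loading the model (uncomment in a real environment):
# import transformers
# if not meets_minimum(transformers.__version__):
#     raise RuntimeError("Please upgrade: pip install -U 'transformers>=4.53.1'")

print(meets_minimum("4.52.4"))  # False: below the 4.53.1 recommendation
print(meets_minimum("4.53.1"))  # True
```

Tuple comparison handles versions of different lengths correctly here (e.g. `4.54` compares above `4.53.1`), which is why no padding logic is needed.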