jaycha committed on
Commit ed09f37 · verified · 1 Parent(s): 4499391

Update README.md

Files changed (1):
  1. README.md +20 -18

README.md CHANGED
@@ -96,30 +96,32 @@ Please note that for certain benchmarks involving LLM-based evaluation (e.g., LL
 | InfoVQA_TEST | 66.9 | **71.7** | 65.0 |
 | ***AVERAGE*** | *70.5* | **71.7** | 68.3 |
 
-### Cultural Benchmark
+### Text-only Benchmark
+| Benchmark | InternVL3-2B | Ovis2-2B | VARCO-VISION-2.0-1.7B |
+| :-------------: | :----------: | :------: | :-------------------: |
+| MMLU | **59.9** | 12.9 | *55.3* |
+| MT-Bench | *62.8* | 61.4 | **72.3** |
+| KMMLU | **38.0** | *31.1* | 10.4 |
+| KoMT-Bench | 29.1 | *34.4* | **59.1** |
+| LogicKor | 25.6 | *31.2* | **53.7** |
+| ***AVERAGE*** | *43.1* | 34.2 | **50.2** |
+
+> **Note:** Some models show unusually low performance on the MMLU benchmark. This is primarily due to their failure to correctly follow the expected output format when only few-shot exemplars are provided in the prompts. Please take this into consideration when interpreting the results.
+
+### Korean Cultural Benchmark
 | Benchmark | InternVL3-2B | Ovis2-2B | VARCO-VISION-2.0-1.7B |
 | :--------------: | :----------: | :------: | :-------------------: |
 | K-Viscuit | *60.0* | **64.1** | 57.7 |
 | PangeaBench (ko) | **66.2** | 63.1 | *63.8* |
-| PangeaBench | *58.4* | **59.2** | 56.3 |
-
-### Text-only Benchmark
-| Benchmark | InternVL3-2B | Ovis2-2B | VARCO-VISION-2.0-1.7B |
-| :--------: | :----------: | :------: | :-------------------: |
-| MMLU | **59.9** | 12.9 | *55.3* |
-| MT-Bench | *6.28* | 6.14 | **7.23** |
-| KMMLU | **38.0** | *31.1* | 10.4 |
-| KoMT-Bench | 2.91 | *3.44* | **5.91** |
-| LogicKor | 2.56 | *3.12* | **5.37** |
-
-> **Note:** Some models show unusually low performance on the MMLU benchmark. This is primarily due to their failure to correctly follow the expected output format when only few-shot exemplars are provided in the prompts. Please take this into consideration when interpreting the results.
+| ***AVERAGE*** | *63.1* | **63.6** | 60.8 |
 
 ### OCR Benchmark
-| Benchmark | PaddleOCR | EasyOCR | VARCO-VISION-2.0-1.7B |
-| :-------: | :-------: | :-----: | :-------------------: |
-| CORD | *91.4* | 77.8 | **96.2** |
-| ICDAR2013 | *92.0* | 85.0 | **95.9** |
-| ICDAR2015 | **73.7** | 57.9 | **73.7** |
+| Benchmark | PaddleOCR | EasyOCR | VARCO-VISION-2.0-1.7B |
+| :-----------: | :-------: | :-----: | :-------------------: |
+| CORD | *91.4* | 77.8 | **96.2** |
+| ICDAR2013 | *92.0* | 85.0 | **95.9** |
+| ICDAR2015 | **73.7** | 57.9 | **73.7** |
+| ***AVERAGE*** | *85.7* | 73.6 | **88.6** |
 
 ## Usage
 To use this model, we recommend installing `transformers` version **4.53.1 or higher**. While it may work with earlier versions, using **4.53.1 or above is strongly recommended**, especially to ensure optimal performance for the **multi-image feature**.
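The version requirement above can be checked programmatically before loading the model. Below is a minimal sketch, not part of the model card: it compares plain `X.Y.Z` version strings as integer tuples (pre-release suffixes like `4.53.1.dev0` would need a real parser such as `packaging.version`), and the helper names `parse_version` and `meets_requirement` are illustrative.

```python
# Minimal sketch: check that the installed transformers version meets the
# README's recommendation (>= 4.53.1). Handles plain "X.Y.Z" strings only.

def parse_version(v: str) -> tuple[int, ...]:
    """Split a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in v.split("."))

def meets_requirement(installed: str, minimum: str = "4.53.1") -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

In practice you would pass `transformers.__version__` as the `installed` argument and warn (or upgrade) when the check fails.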