katerynaCh committed on
Commit 277fbec · verified · 1 Parent(s): 35af664

Update README.md

Files changed (1)
  1. README.md +7 -10
README.md CHANGED
@@ -11,15 +11,12 @@ tags:
  - OCR
  ---
 
- # nemotron-parse Overview
+ # NVIDIA Nemotron Parse 1.1 Overview
 
- nemotron-parse is a general purpose text-extraction model, specifically designed to handle documents. Given an image, nemotron-parse is able to extract formatted-text, with bounding-boxes and the corresponding semantic class. This has downstream benefits for several tasks such as increasing the availability of training-data for Large Language Models (LLMs), improving the accuracy of retriever systems, and enhancing document understanding pipelines.
+ NVIDIA Nemotron Parse 1.1 is a general purpose text-extraction model, specifically designed to handle documents, and extracting text, table, and understanding document semantics. Given an image, NVIDIA Nemotron Parse 1.1 is able to extract formatted text, bounding-boxes and the corresponding semantic classes. This has downstream benefits for several tasks such as increasing the availability of training-data for Large Language Models (LLMs), improving the accuracy of retriever systems, and enhancing document understanding pipelines.
 
  This model is ready for commercial use.
 
-
-
-
  ## License
  GOVERNING TERMS: The NIM container is governed by the [NVIDIA Software License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and [Product-Specific Terms for NVIDIA AI Products](https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/). Use of this model is governed by the [NVIDIA Community Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-community-models-license/). Use of the tokenizer included in this model is governed by the [CC-BY-4.0 license](https://creativecommons.org/licenses/by/4.0/).
 
@@ -28,8 +25,8 @@ GOVERNING TERMS: The NIM container is governed by the [NVIDIA Software License A
  Global
 
  ## Use Case:
- nemotron-parse will be capable of comprehensive text understanding and document structure understanding. It will be used in retriever and curator solutions. Its text extraction datasets and capabilities will help with LLM and VLM training, as well as improve run-time inference accuracy of VLMs.
- The nemotron-parse model will perform text extraction from PDF and PPT documents. The nemotron-parse can classify the objects (title, section, caption, index, footnote, lists, tables, bibliography, image) in a given document, and provide bounding boxes with coordinates.
+ NVIDIA Nemotron Parse 1.1 will be capable of comprehensive text understanding and document structure understanding. It will be used in retriever and curator solutions. Its text extraction datasets and capabilities will help with LLM and VLM training, as well as improve run-time inference accuracy of VLMs.
+ The NVIDIA Nemotron Parse 1.1 model will perform text extraction from PDF and PPT documents. The NVIDIA Nemotron Parse 1.1 can classify the objects (title, section, caption, index, footnote, lists, tables, bibliography, image) in a given document, and provide bounding boxes with coordinates.
 
 
  ## Release Date:
@@ -73,7 +70,7 @@ Carbon Emissions: 3.21 tCO2e <br>
  * Output Format: String
  * Output Parameters: 1D
  - Other Properties Related to Output:
- - nemotron-parse output format is a string which encodes text content (formatted or not) as well as bounding boxes and class attributes.<br>
+ - NVIDIA Nemotron Parse 1.1 output format is a string which encodes text content (formatted or not) as well as bounding boxes and class attributes.<br>
  In the default prompt setting, text content is represented as markdown, and math expressions as LaTeX, enclosed in \[..\] or \(..\). If a mathematical expression does not require LaTeX formatting to be represented (e.g., consisting only of characters and subscripts/superscripts), it is represented as markdown. Tables are represented as LaTeX.
  Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.<br>
 
@@ -228,7 +225,7 @@ Nemotron-Parse-v1.1 is also available as an [optimized NIM container](https://bu
 
  ### Training Dataset
 
- nemotron-parse is first pre-trained on our internal datasets: human, synthetic and automated.
+ NVIDIA Nemotron Parse 1.1 is first pre-trained on our internal datasets: human, synthetic and automated.
  Data Modality:
  *Text
  *Image<br>
@@ -237,7 +234,7 @@ Labeling Method by Dataset: Hybrid: Human, Synthetic, Automated
 
  ### Testing and Evaluation Dataset:
 
- nemotron-parse is evaluated on multiple datasets for robustness, including public and internal dataset.
+ NVIDIA Nemotron Parse 1.1 is evaluated on multiple datasets for robustness, including public and internal dataset.
  Data Collection Method by Dataset: Hybrid: Human, Synthetic, Automated
  Labeling Method by Dataset: Hybrid: Human, Synthetic, Automated
 