Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published Jun 5 • 74
CRAWLDoc: A Dataset for Robust Ranking of Bibliographic Documents Paper • 2506.03822 • Published Jun 4 • 2
GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding Paper • 2311.09707 • Published Nov 16, 2023
HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published Nov 5, 2024 • 70
DocGraphLM: Documental Graph Language Model for Information Extraction Paper • 2401.02823 • Published Jan 5, 2024 • 36
Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets Paper • 2211.16878 • Published Nov 30, 2022