Paul Teiletche

paultltc

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

The Collaboration Gap

upvoted an article 5 days ago

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

liked a Space 10 days ago

HuggingFaceTB/smol-training-playbook

Organizations

upvoted a paper 5 days ago

The Collaboration Gap

Paper • 2511.02687 • Published 6 days ago • 20

upvoted an article 5 days ago

Article

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

and 4 others •

5 days ago

• 35

upvoted an article 19 days ago

Article

Sentence Transformers is joining Hugging Face!

19 days ago

• 74

upvoted an article 28 days ago

Article

Introducing RTEB: A New Standard for Retrieval Evaluation

Oct 1

• 123

upvoted a paper about 1 month ago

ModernVBERT: Towards Smaller Visual Document Retrievers

Paper • 2510.01149 • Published Oct 1 • 30

upvoted 4 articles 4 months ago

Article

Finally, a Replacement for BERT: Introducing ModernBERT

Dec 19, 2024

• 705

Article

Ettin Suite: SoTA Paired Encoders and Decoders

Jul 16

• 74

Article

Introducing ColQwen-Omni: Retrieve in every modality

and 4 others •

Jul 17

• 75

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8

• 717

upvoted a paper 4 months ago

Should We Still Pretrain Encoders with Masked Language Modeling?

Paper • 2507.00994 • Published Jul 1 • 78

upvoted a collection 4 months ago

ERNIE 4.5

Collection

collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174

upvoted an article 8 months ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

• 58

upvoted a paper 8 months ago

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 117

upvoted 5 articles 8 months ago

Article

DeepSearch Using Visual RAG in Agentic Frameworks 🔎

and 1 other •

Mar 21

• 37

Article

ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval

and 2 others •

Mar 18

• 12

Article

SmolVLM Grows Smaller – Introducing the 256M & 500M Models!

Jan 23

• 186

Article

SmolVLM - small yet mighty Vision Language Model

Nov 26, 2024

• 381

Article

Introducing smolagents: simple agents that write actions in code.

Dec 31, 2024

• 1.14k

upvoted a paper about 1 year ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1, 2024 • 40

upvoted a collection about 1 year ago

Parallel Sentences Datasets

Collection

These datasets all have "english" and "non_english" columns for numerous datasets. They can be used to make embedding models multilingual. • 14 items • Updated Feb 25 • 19
