Inference Provider

VERIFIED
256,855 monthly requests

AI & ML interests

making models accessible

Recent Activity

SmerkyG updated a model 22 days ago: featherless-ai/QRWKV-72B
KaraKaraWitch new activity about 1 month ago: featherless-ai/try-this-model:Hihi9
SmerkyG updated a model about 2 months ago: featherless-ai/QRWKV-QwQ-32B


Hihi9

#15 opened about 1 month ago by pznhi
KaraKaraWitch posted an update 4 months ago
What if LLMs used thinking emojis to develop their state?

:blob_think: Normal Thinking
:thinkies: Casual Thinking
:Thonk: Serious Thinking
:think_bold: Critical Thinking
:thinkspin: Research Thinking
:thinkgod: Deep Research Thinking

The last two are GIFs, but the upload doesn't render them :)

(Credits: SwayStar123 on EAI suggested making it a range selector; the original base idea was mine.)
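
A minimal sketch of how that range-selector suggestion could look in practice; the six-level scale, the 0-to-1 effort value, and the select_thinking_emoji helper are all illustrative assumptions, not an existing feature.

# Hypothetical mapping from a continuous "thinking effort" value to the emoji levels above.
THINKING_LEVELS = [
    ":blob_think:",   # normal thinking
    ":thinkies:",     # casual thinking
    ":Thonk:",        # serious thinking
    ":think_bold:",   # critical thinking
    ":thinkspin:",    # research thinking
    ":thinkgod:",     # deep research thinking
]

def select_thinking_emoji(effort: float) -> str:
    """Map an effort value in [0.0, 1.0] onto one of the six emoji levels."""
    effort = min(max(effort, 0.0), 1.0)
    return THINKING_LEVELS[round(effort * (len(THINKING_LEVELS) - 1))]

print(select_thinking_emoji(0.1))  # :blob_think:
print(select_thinking_emoji(1.0))  # :thinkgod:
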
1 reply

QRWKV in, Qwerky out

#4 opened 4 months ago by wxgeorge
DarinVerheijke updated a Space 5 months ago
wxgeorge updated a Space 5 months ago

Add link to paper

#3 opened 5 months ago by nielsr

Add pipeline tag

#1 opened 5 months ago by nielsr
KaraKaraWitch posted an update 5 months ago
"What's wrong with using huggingface transformers?"

Here's a quick example. Am I supposed to go in with full knowledge of the inner workings of an LLM?
import pathlib
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("<ModernBERT>")
# Triton is **required**, but nowhere in the documentation does it say that Triton is needed.
# Installing Triton on Windows isn't straightforward. Thankfully someone has already built wheels for it:
#  - https://github.com/woct0rdho/triton-windows/releases

model = AutoModelForSequenceClassification.from_pretrained(
    "<ModernBERT>",  # reference_compile=False
)
# By default the model is loaded on the CPU, which is slow. Move it to a CUDA device.
# This will actually error out if you use "gpu" instead of "cuda".
model = model.to("cuda")


with torch.no_grad():
    # Not setting `return_tensors="pt"` causes
    #   File "C:\Program Files\Python310\lib\site-packages\transformers\modeling_utils.py", line 5311, in warn_if_padding_and_no_attention_mask
    #     if self.config.pad_token_id in input_ids[:, [-1, 0]]:
    #   TypeError: list indices must be integers or slices, not tuple
    # or...
    #  File "C:\Program Files\Python310\lib\site-packages\transformers\models\modernbert\modeling_modernbert.py", line 836, in forward
    #    batch_size, seq_len = input_ids.shape[:2]
    #  AttributeError: 'list' object has no attribute 'shape'
    block = tokenizer(
        pathlib.Path("test-fic.txt").read_text("utf-8"), return_tensors="pt"
    )
    block = block.to("cuda")
    # **block is needed to fix "AttributeError: 'NoneType' object has no attribute 'unsqueeze'" on attention_mask.unsqueeze(-1)
    logits = model(**block).logits

# .numpy() can't convert a CUDA tensor, so move the logits back to the CPU first.
logits = logits.to("cpu")
# print(logits)
# Softmax over the last dim gives per-class probabilities for the first (only) sequence.
predicted_class_ids = torch.softmax(logits, -1)[0].numpy()
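
For contrast, a minimal sketch of the same classification through the high-level pipeline API, which does the tokenization, tensor conversion, device placement, and softmax itself. "<ModernBERT>" stays a placeholder as above, and the truncation and top_k call arguments are assumptions to check against your installed transformers version.

import pathlib
from transformers import pipeline

# device=0 puts the model on the first CUDA device; omit it to stay on the CPU.
classifier = pipeline("text-classification", model="<ModernBERT>", device=0)

text = pathlib.Path("test-fic.txt").read_text("utf-8")
# truncation=True clips input past the model's context window;
# top_k=None returns a score for every class instead of only the best one.
print(classifier(text, truncation=True, top_k=None))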

3 replies
KaraKaraWitch posted an update 6 months ago
> New Model
> Looks at Model Card
> "Open-Weights"
1 reply