Remyx AI

Team

company

Verified

https://remyx.ai

remyxai

AI & ML interests

Machine learning, deep learning, generative AI, LLMs

Recent Activity

salma-remyx

updated a model 24 days ago

remyxai/SpaceQwen3-VL-2B-Thinking

Image-Text-to-Text • 2B • Updated 24 days ago • 58 • 2

salma-remyx

updated a collection 24 days ago

SpaceThinker

Test Time Compute for Quantitative Spatial Reasoning using synthetic reasoning traces from 3D scene graphs • 7 items • Updated 24 days ago • 2

salma-remyx

posted an update 24 days ago

Post

3292

We've built over 10K containerized reproductions of papers from arXiv!

Instead of spending all day trying to build an environment to test that new idea, just pull the Docker container from the Remyx registry.

And with Remyx, you can start experimenting faster by generating a test PR in your codebase based on the ideas found in your paper of choice.

Hub: https://hub.docker.com/u/remyxai
Remyx docs: https://docs.remyx.ai/resources/ideate
Coming soon, explore reproduced papers with AG2 + Remyx: https://github.com/ag2ai/ag2/pull/2141

1 reply

salma-remyx

posted an update about 1 month ago

Post

1027

The future is arriving too fast not to use programmatic discovery and replication.
Search arXiv → Execute in 30 seconds with pre-built Docker environments

Check out our latest integration with AG2 to accelerate your discovery loop.
As easy as:

from remyxai.client.search import SearchClient
from autogen.coding import RemyxCodeExecutor

# Search by topic
papers = SearchClient().search(
    "data synthesis strategies",
    has_docker=True,  # Only papers with pre-built environments
    limit=10
)

# Attach an executor to the first result's pre-built environment
executor = RemyxCodeExecutor(arxiv_id=papers[0].arxiv_id)

executor.explore(
    goal="Run a test with my model remyxai/SpaceThinker-Qwen2.5VL-3B",
    interactive=False  # Automated exploration
)

Tutorial: https://github.com/ag2ai/ag2/blob/4c6954e3959fe672980191f264e30d451bc23554/notebook/agentchat_remyx_executor.ipynb
PR: https://github.com/ag2ai/ag2/pull/2141

salma-remyx

posted an update about 2 months ago

Post

3717

Thanks again to @ag2 for hosting us at their Community Talks!
@terry-remyx walked us through a technical deep dive into GitRank, our automated pipeline that converts research papers with code into containerized, executable environments and generates specialized tests tailored to users' specific codebases.

In case you missed it...
Full recording: https://www.youtube.com/watch?v=N_FNfZ71s2I
Deck: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing

salma-remyx

posted an update about 2 months ago

Post

2925

We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker Images
🎯 implement core methods as PRs for your target repo

Don't miss it! Tomorrow, Sept 25 at 9:00 am PST: https://calendar.app.google/3soCpuHupRr96UaF8

salma-remyx

posted an update about 2 months ago

Post

1505

We've added intelligent full-text search across both our pre-built Docker images for arXiv papers with ready-to-run code and papers straight from arXiv.

Natural language queries.
Semantic understanding.
One search to find both the paper AND the runnable code.

Try it today: https://engine.remyx.ai/resources/
Join us at Experiment 2025: https://experiment.remyx.ai

salma-remyx

posted an update about 2 months ago

Post

5352

Rolling Benchmarks - Evaluating AI Agents on Unseen GitHub Repos

Static benchmarks are prone to leaderboard hacking and training data contamination, so how about a dynamic/rolling benchmark?

By limiting submissions to freshly published code, we could score agents on consistency over time using rolling averages, rather than rewarding agents that overfit to a static benchmark.
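A minimal sketch of how such a rolling score could be computed, not a description of any existing benchmark. The task fields (`repo`, `published`), the helper names, and the 14-day window are all hypothetical choices made for illustration.

```python
from collections import deque
from datetime import date, timedelta


def fresh_only(tasks, agent_cutoff):
    """Keep only tasks whose repo was published after the agent's cutoff date."""
    return [t for t in tasks if t["published"] > agent_cutoff]


def rolling_average(results, window_days):
    """Trailing-window mean of (day, score) pairs, which must be sorted by day.

    Each output entry averages the scores from the last `window_days` days,
    so a single lucky (or contaminated) run fades out over time.
    """
    window = deque()
    total = 0.0
    averages = []
    for day, score in results:
        window.append((day, score))
        total += score
        cutoff = day - timedelta(days=window_days)
        # Drop results that have aged out of the trailing window.
        while window[0][0] <= cutoff:
            _, old_score = window.popleft()
            total -= old_score
        averages.append((day, total / len(window)))
    return averages


# Hypothetical weekly pass rates for one agent.
weekly = [
    (date(2025, 9, 1), 1.0),
    (date(2025, 9, 8), 0.0),
    (date(2025, 9, 15), 1.0),
    (date(2025, 9, 22), 0.5),
]
scores = [avg for _, avg in rolling_average(weekly, window_days=14)]
# scores == [1.0, 0.5, 0.5, 0.75]
```

The window length trades responsiveness for stability: a shorter window reacts faster to real regressions but is noisier.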

Can rolling benchmarks bring us closer to evaluating agents in a way more closely aligned with their real-world applications? Perhaps a new direction for agent evaluation?

Would love to hear what you think about this!
More on reddit: https://www.reddit.com/r/LocalLLaMA/comments/1nmvw7a/rolling_benchmarks_evaluating_ai_agents_on_unseen/

salma-remyx

posted an update about 2 months ago

Post

3988

Trustworthy AI evals have been an industry challenge for the last few years, so what's missing?
Causal reasoning.

Model-based eval frameworks can't tell you if your changes actually improved user outcomes. You need to take a systems-level approach.

At Remyx, we’re building the intelligence layer for AI experimentation. Check out this example of how we lay the scaffolding for controlled experiments that turn your hypotheses into insights about what drives performance for your application.
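As a generic illustration of what a controlled experiment adds over an offline eval score, here is a minimal sketch of a two-sided permutation test on a randomized control/treatment split. It is a textbook technique swapped in for illustration, not Remyx's method or API, and the outcome numbers are made up.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=2000, seed=0):
    """Return (observed difference in means, approximate two-sided p-value).

    The p-value estimates how often a random relabeling of the pooled
    outcomes produces a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical per-cohort user outcome rates before and after a change.
control = [0.60, 0.62, 0.58, 0.61, 0.59]
treatment = [0.70, 0.72, 0.69, 0.71, 0.68]
effect, p_value = permutation_test(control, treatment)
```

Because users are assigned to the two groups at random, a small p-value supports the claim that the change itself, rather than some confounder, moved the outcome.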

Check out the latest at Remyx in our docs: https://docs.remyx.ai
Try your first experiment today! https://engine.remyx.ai

salma-remyx

posted an update about 2 months ago

Post

3232

Mark your calendars for Thursday Sept 25th at 9am PST 📆
We're joining the @ag2 team on Discord for their Community Talks to present a deep dive into how we've used the framework to build GitRank.

The GitRank pipeline is used to:
📰 power personalized paper recommendations
🐳 build environments as Docker Images
🎯 implement core methods as PRs for your target repo

Attached is a draft outlining what we plan to cover in the talk.
Would love to gather your feedback to make this insightful for all: https://docs.google.com/presentation/d/1S0q-wGCu2dliVWb9ykGKFz61jZKZI4ipxWBv73HOFBo/edit?usp=sharing

salma-remyx

posted an update about 2 months ago

Post

3242

Reproducing research code shouldn't take longer than reading the paper.
For papers that include code, setting up the right environment often means hours of dependency hell and configuration debugging.

At Remyx AI, we built an agent that automatically creates and tests Docker images for research papers, then shares them publicly so anyone can reproduce results with a single command.

We just submitted PR #908 to integrate this directly into arXiv Labs.

If you believe in making reproducible research accessible to everyone, give it a bump!: https://github.com/arXiv/arxiv-browse/pull/908

3 replies

salma-remyx

posted an update 2 months ago

Post

2538

Search is such a fundamental part of content discovery, yet ends up overlooked or poorly implemented in so many apps we use every day.

We built hundreds of Docker images for arXiv papers with a codebase. With Docker Hub's search, it's tough to find what you're looking for unless you happen to have the arXiv ID handy.

So we added full text search over these resources so that you're that much closer to testing a new promising idea. More resources to be indexed soon!

Full Demo: https://www.youtube.com/watch?v=GjYReWbQZw8
Try it here!: https://engine.remyx.ai/resources
Join us at Experiment 2025!: https://experiment.remyx.ai

salma-remyx

posted an update 2 months ago

Post

4051

Most apps don't have great full-text search over their assets.

We've developed an agent to automate the environment building and testing of experimental codebases sourced from arXiv. We push these containerized reproductions daily to Docker Hub: https://hub.docker.com/u/remyxai

However, searching for them can be challenging unless you know the specific arXiv ID associated with each paper.
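One image in this feed is published as `remyxai/2507.20613v1`, which suggests that repositories are named after versioned arXiv IDs. A minimal sketch under that assumption follows; the helper names are hypothetical, and pulling without a tag relies on a default `latest` tag existing, which is not confirmed here.

```python
import re

# New-style arXiv identifiers such as 2507.20613 or 2507.20613v1.
ARXIV_ID = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")


def image_reference(arxiv_id, namespace="remyxai", tag=None):
    """Build a Docker image reference, assuming one repository per arXiv ID."""
    if not ARXIV_ID.match(arxiv_id):
        raise ValueError(f"not a new-style arXiv ID: {arxiv_id!r}")
    ref = f"{namespace}/{arxiv_id}"
    return f"{ref}:{tag}" if tag else ref


def pull_command(arxiv_id):
    """Argument list for `docker pull`, suitable for subprocess.run."""
    return ["docker", "pull", image_reference(arxiv_id)]


command = pull_command("2507.20613v1")
# command == ["docker", "pull", "remyxai/2507.20613v1"]
```

Validating the ID up front avoids sending a malformed repository name to the registry.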

We are currently working on implementing a search feature in Remyx, which will make these assets easily discoverable and ready for testing 🔍 Stay tuned!

Discover your next best idea to experiment with here: https://engine.remyx.ai

salma-remyx

posted an update 2 months ago

Post

3991

Science is the vibe-killer

Some critique on the state of the technology
Presenting an alternative vision for scaling the scientific method in AI engineering

https://remyxai.substack.com/p/vibes-dont-scale

2 replies

salma-remyx

posted an update 2 months ago

Post

6336

The docs for GitRank are live! Follow along to see how you can get:

📖 Daily personalized papers from arXiv matching your project context
👩‍💻 One-click PRs with complete implementation, tests, and docs
🚀 Parallel experimentation - test multiple ideas with ease

Your next great idea is probably in a paper you haven't had time to implement.

Try it today! http://docs.remyx.ai/resources/ideate

salma-remyx

posted an update 2 months ago

Post

3594

GitRank

We built an agent to surface and implement high-potential ideas for your repo, asynchronously generating containers, tests, and PRs so you can evaluate what works and double down on it.

Check out the demo: https://youtu.be/frgPsTclc1k

Come replicate and specialize a test for your repo! GitRank is live on Remyx.
Docs: https://docs.remyx.ai
App: https://engine.remyx.ai
Example PR here: https://github.com/smellslikeml/experimental-vqasynth/pull/727

salma-remyx

posted an update 3 months ago

Post

2785

Are you coming to SF this fall?

Next week, we'll be at the AI Agent Builders Summit.
And in late October, GitHub Universe, ODSC West, and Experiment 2025.

We're sharing what we've learned while building agents that help you turn new research ideas from arXiv into PRs for your repo.

This summer, we analyzed thousands of papers, ranking each for relevance to our work before building hundreds of Docker images and opening hundreds of PRs for our repos.

Read more about PapersWithPRs: https://www.reddit.com/r/LocalLLaMA/comments/1mq7715/paperswithprs_dont_just_read_the_paper_replicate/

𝗔𝗜 𝗔𝗴𝗲𝗻𝘁 𝗕𝘂𝗶𝗹𝗱𝗲𝗿𝘀 𝗦𝘂𝗺𝗺𝗶𝘁: https://luma.com/agents-world-tour-sf
𝗚𝗶𝘁𝗛𝘂𝗯 𝗨𝗻𝗶𝘃𝗲𝗿𝘀𝗲: https://githubuniverse.com/
DISCOUNT CODE: TAKEMETOUNIVERSE
𝗘𝘅𝗽𝗲𝗿𝗶𝗺𝗲𝗻𝘁 𝟮𝟬𝟮𝟱: https://luma.com/145xyuyw

salma-remyx

in remyxai/SpaceQwen2.5-VL-3B-Instruct 3 months ago

Add link to paper, project page and Github repo

#3 opened 5 months ago by

salma-remyx

posted an update 3 months ago

Post

2604

𝗣𝗮𝗽𝗲𝗿𝟮𝗣𝗥𝘀
Lately, we've been experimenting with recommending arXiv papers based on the context of what we're building in AI.
At the same time, we're using an agent to help automate the building and testing of Docker Images.

Check out the example here:
https://hub.docker.com/repository/docker/remyxai/2507.20613v1/general

Next, we're tasking our #ExperimentOps agent to open PRs in a target repo, to evaluate the core concepts from a new research paper in the context of your application and your KPIs.

Operationalize your Experimentation!
Find Your Frontier!
#BeAnExperimenter

1 reply

salma-remyx

in remyxai/SpaceThinker-Qwen2.5VL-3B 4 months ago

conversion to int4 gptq / awq ?

#4 opened 4 months ago by