Model Summary and Training Method

This model (lamm-mit/Graph-Preflexor-8b_12292025) was trained in two sequential stages to produce graph-native scientific reasoning with structured intermediate representations.

How Graph-Preflexor Reasons

The model produces a structured reasoning trace with explicit “sentinel” blocks. Each block has a distinct role: exploration → structure → validation → synthesis.

User Prompt
   |
   v
<think>  (internal reasoning container; not meant as final answer)
   |
   +--> <brainstorm>
   |       Purpose: generate hypotheses, mechanisms, candidate factors,
   |                and possible causal stories (broad search; divergent).
   |
   +--> <graph>
   |       Purpose: sketch the conceptual graph verbally (entities + relations).
   |                Think of it as the draft blueprint.
   |
   +--> <graph_json>
   |       Purpose: emit a machine-readable knowledge graph:
   |                nodes = concepts; edges = relations (source, relation, target).
   |                This is the canonical structured representation.
   |
   +--> <patterns>
   |       Purpose: compress the graph into reusable motifs:
   |                invariants, abstractions, multi-scale regularities,
   |                analogies, and “design rules”.
   |
   +--> <synthesis>
   |       Purpose: assemble the final narrative by reading from the graph:
   |                coherent, ordered explanation and (optionally) next steps.
   |
</think>
   |
   v
Final Answer (post-</think>, user-facing)
   - concise, coherent prose derived from the graph + synthesis
   - should remain consistent with the <graph_json> content

Sentinel blocks used by the model:

<think> ... </think>
  - Container for all internal work.
  - May include intermediate calculations, choices, and planning.

<brainstorm> ... </brainstorm>
  - Rapid hypothesis generation.
  - Lists candidate mechanisms, variables, constraints, tradeoffs.
  - “Wide exploration” mode.

<graph> ... </graph>
  - Human-readable graph sketch.
  - Names the concepts and how they connect (often as bullet edges).

<graph_json> ... </graph_json>
  - Machine-readable knowledge graph:
      {
        "nodes": [{"id": "ConceptA"}, ...],
        "edges": [{"source": "A", "relation": "causes", "target": "B"}, ...]
      }
  - Intended to be parseable and reusable for downstream tooling.

<patterns> ... </patterns>
  - Extracts higher-level structure:
    - causal motifs (feedforward/feedback loops)
    - modularity / hierarchy (micro→meso→macro)
    - bottlenecks, bridges, invariants
    - “principles” that generalize beyond the single example

<synthesis> ... </synthesis>
  - Turns structure into explanation:
    - ordered narrative aligned with the graph
    - explicit causal chain(s)
    - checks for coherence / missing links
    - may propose experiments, predictions, or design implications

Why this format is useful:

Interpretability: reasoning is segmented by function (explore → formalize → abstract → explain).
Parsability: <graph_json> enables programmatic extraction, visua_

Detailed sampling code and CLIs described below. For quick start and interactive demo (including graph visualization), use this Colab notebook:

1. Base Model: ORPO Graph Reasoning

The starting point was an ORPO-trained graph-native language model, which was based off of Qwen/Qwen3-8B.

In the first step, ORPO (Offline Reinforcement Preference Optimization) was used to align the model toward:

Explicit structured reasoning using tagged sections (e.g. <think>, <graph>, <graph_json>, <patterns>, <synthesis>)
Emission of valid, machine-interpretable graph-structrured thinking alongside natural language answers
Faithful separation between internal reasoning, graph construction, and final synthesis

This stage established the model’s representation language and graph-centric reasoning style.

2. GRPO Fine-Tuning with External Judging

On top of the ORPO model, we applied GRPO (Generative Reinforcement Preference Optimization) using the dataset:

Dataset: lamm-mit/graph_reasoning_v3
Prompts: graph-based scientific and reasoning questions
Gold answers: reference solutions used for evaluation

For each prompt, the model generated multiple candidate completions (num_generations = 8). These were scored using an external LLM judge (grok-4-1-fast-non-reasoning) via a multi-component reward function.

Reward Structure

Each completion received a weighted reward composed of:

Answer correctness (0.30)
How well the final answer matches the gold reference.
Format compliance (0.15)
Presence and validity of required reasoning sections and parseable graph JSON.
Graph utility (0.25)
Whether the emitted knowledge graph alone contains enough information to reconstruct the answer.
Graph validity (NetworkX) (0.10)
Structural soundness of the graph (nodes, edges, connectivity).
Graph diversity (0.10)
Semantic diversity of concepts expressed in nodes and relations.
Graph structure quality (0.10)
Topological richness (depth, internal nodes, non-trivial connectivity).

Rewards were normalized per batch and optimized using the GRPO loss.

4. Resulting Capabilities

The resulting model is optimized to:

Reason explicitly through structured, inspectable intermediate representations
Emit valid, analyzable knowledge graphs alongside answers
Encode scientific reasoning such that graphs themselves carry explanatory power
Balance correctness, interpretability, and structural richness

This makes the model particularly suitable for AI-for-science, graph-native reasoning, and knowledge discovery workflows where transparency and structure matter as much as accuracy.

Sample Generation

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

token= 'hf_...'

# ------------------------------------------------------------------------------
# Configuration
# ------------------------------------------------------------------------------
MODEL_NAME = "lamm-mit/Graph-Preflexor-8b_12292025"
PROMPT = "Give me a short introduction to materiomics."
MAX_NEW_TOKENS = 32_768
THINK_END_TOKEN_ID = 151668  # </think>

# ------------------------------------------------------------------------------
# Model & Tokenizer Loading
# ------------------------------------------------------------------------------
tokenizer = AutoTokenizer.from_pretrained(
    MODEL_NAME,
    token=token,
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype="auto",
    device_map="auto",
    token=token,
)
model.eval()

# ------------------------------------------------------------------------------
# Prompt Construction
# ------------------------------------------------------------------------------
messages = [
    {"role": "user", "content": PROMPT}
]

prompt_text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # toggles chain-of-thought mode
)

model_inputs = tokenizer(
    prompt_text,
    return_tensors="pt",
).to(model.device)

# ------------------------------------------------------------------------------
# Generation
# ------------------------------------------------------------------------------
gen_config = GenerationConfig(
    max_new_tokens=MAX_NEW_TOKENS,
    do_sample=True,  # sample 
    temperature=0.2, 
)

with torch.no_grad():
    generated = model.generate(
        **model_inputs,
        generation_config=gen_config,
    )

# Slice off the prompt tokens
output_ids = generated[0, model_inputs.input_ids.shape[1]:].tolist()

# ------------------------------------------------------------------------------
# Thinking / Content Parsing
# ------------------------------------------------------------------------------
def split_thinking(output_ids, tokenizer, think_end_id):
    """
    Split generated tokens into (thinking, final_content) based on </think>.
    Falls back gracefully if no thinking block is present.
    """
    try:
        split_idx = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        split_idx = 0

    thinking = tokenizer.decode(
        output_ids[:split_idx],
        skip_special_tokens=True,
    ).strip()

    content = tokenizer.decode(
        output_ids[split_idx:],
        skip_special_tokens=True,
    ).strip()

    return thinking, content


thinking, content = split_thinking(
    output_ids,
    tokenizer,
    THINK_END_TOKEN_ID,
)

# ------------------------------------------------------------------------------
# Output
# ------------------------------------------------------------------------------
print("\n" + "=" * 80)
print("THINKING")
print("=" * 80)
print(thinking or "[**no thinking content detected**]")

print("\n" + "=" * 80)
print("FINAL OUTPUT")
print("=" * 80)
print(content)

Sample output:

================================================================================
THINKING
================================================================================
<think>
<brainstorm>
Materiomics is an emerging interdisciplinary field that bridges materials science with omics approaches like genomics, proteomics, and metabolomics. Key concepts: it uses high-throughput, data-driven methods to analyze materials at atomic/molecular scales, enabling predictive design and discovery. Relevant phenomena include self-assembly, phase transitions, defect engineering, and property-structure relationships. Hypotheses: materiomics can accelerate material innovation by mimicking biological systems' efficiency (e.g., protein folding). Interactions: data from omics informs computational models (e.g., machine learning for property prediction), while materials synthesis validates these models. Broader context: part of materials informatics, relevant to nanotechnology, sustainable materials, and AI-driven R&D. Key variables: material composition, processing conditions, scale (nano/micro), omics data quality, computational power.
</brainstorm>

<graph>
Core entities: Materiomics, Materials Science, Omics (Genomics/Proteomics/Metabolomics), High-Throughput Analysis, Predictive Modeling, Material Properties, Self-Assembly, Phase Transitions, Defect Engineering, Computational Models, Data-Driven Design, Biological Inspiration.
Relationships: Materiomics integrates Materials Science with Omics → enables High-Throughput Analysis → feeds Predictive Modeling → predicts Material Properties. Self-Assembly, Phase Transitions, Defect Engineering → influence Material Properties. Computational Models → refine Predictive Modeling. Biological Inspiration → guides Data-Driven Design in Materiomics. High-Throughput Analysis → validates Computational Models.
</graph>

<graph_json>
{
  "nodes": [
    {"id": "Materiomics"},
    {"id": "MaterialsScience"},
    {"id": "Omics"},
    {"id": "HighThroughput"},
    {"id": "PredictiveModeling"},
    {"id": "MaterialProperties"},
    {"id": "SelfAssembly"},
    {"id": "PhaseTransitions"},
    {"id": "DefectEngineering"},
    {"id": "ComputationalModels"},
    {"id": "DataDrivenDesign"},
    {"id": "BiologicalInspiration"}
  ],
  "edges": [
    {"source": "Materiomics", "relation": "integrates", "target": "MaterialsScience"},
    {"source": "Materiomics", "relation": "integrates", "target": "Omics"},
    {"source": "Omics", "relation": "enables", "target": "HighThroughput"},
    {"source": "HighThroughput", "relation": "feeds", "target": "PredictiveModeling"},
    {"source": "PredictiveModeling", "relation": "predicts", "target": "MaterialProperties"},
    {"source": "SelfAssembly", "relation": "influences", "target": "MaterialProperties"},
    {"source": "PhaseTransitions", "relation": "influences", "target": "MaterialProperties"},
    {"source": "DefectEngineering", "relation": "influences", "target": "MaterialProperties"},
    {"source": "ComputationalModels", "relation": "refines", "target": "PredictiveModeling"},
    {"source": "HighThroughput", "relation": "validates", "target": "ComputationalModels"},
    {"source": "BiologicalInspiration", "relation": "guides", "target": "DataDrivenDesign"},
    {"source": "DataDrivenDesign", "relation": "applies", "target": "Materiomics"}
  ]
}
</graph_json>

<patterns>
Materiomics: MaterialsScience + Omics → HighThroughput → PredictiveModeling → MaterialProperties (predictive pipeline).
BiologicalInspiration → DataDrivenDesign → Materiomics (inspiration loop).
</patterns>

<synthesis>
The graph reveals materiomics as a predictive pipeline integrating materials science with omics data, validated by high-throughput analysis and computational models to design materials with desired properties. Key insight: this data-driven approach, inspired by biological systems, accelerates discovery by quantifying structure-property relationships at atomic scales, enabling efficient self-assembly, phase control, and defect engineering. Coherent picture: materiomics transforms materials innovation from trial-and-error to informed, scalable design, leveraging omics for high-dimensional data and AI for modeling.
</synthesis>
</think>

================================================================================
FINAL OUTPUT
================================================================================
Materiomics is an interdisciplinary field that merges materials science with omics methodologies—such as genomics, proteomics, and metabolomics—to systematically analyze, design, and predict the properties of materials at atomic and molecular scales. At its core, materiomics leverages high-throughput experimental techniques and advanced computational models to generate vast datasets on material composition, structure, processing conditions, and resulting properties. These data are then used to build predictive models that can forecast material behavior under various stimuli, enabling the rational design of novel materials with tailored functionalities. Key phenomena underpinning materiomics include self-assembly processes where molecules spontaneously form ordered structures, phase transitions that dictate stability and transformation under thermal or mechanical stress, and defect engineering that manipulates imperfections to enhance properties like strength or conductivity. By drawing inspiration from biological systems—where complex materials like proteins and cell membranes emerge from simple building blocks—materiomics adopts data-driven, systems-level approaches to accelerate discovery. This field is pivotal in advancing nanotechnology, sustainable materials, and AI-driven R&D, offering a scalable framework to move beyond traditional trial-and-error methods, thereby revolutionizing industries from electronics to energy storage.

Sample Generation CLI

python graph_reasoning.py \
    --model lamm-mit/Graph-Preflexor-8b_12292025 \
    --prompt "Explain dragline silk toughness."

References and Citation

This model was trained based on the ideas presented in the below referenced papers.

@article{Buehler2025PRefLexOR,
  author       = {Buehler, Markus J.},
  title        = {PRefLexOR: preference-based recursive language modeling for exploratory optimization of reasoning and agentic thinking},
  journal      = {npj Artificial Intelligence},
  volume       = {1},
  number       = {4},
  year         = {2025},
  publisher    = {Springer Nature},
  doi          = {10.1038/s44387-025-00003-z},
  url          = {https://doi.org/10.1038/s44387-025-00003-z},
  issn         = {2731-990X},
  received     = {2024-11-01},
  accepted     = {2025-03-22},
  published    = {2025-05-14},
  keywords     = {Complex networks, Computational biology and bioinformatics}
}

@article{Buehler2025GraphPRefLexOR,
  author       = {Buehler, Markus J.},
  title        = {In Situ Graph Reasoning and Knowledge Expansion Using Graph-PRefLexOR},
  journal      = {Advanced Intelligent Discovery},
  year         = {2025},
  publisher    = {Wiley},
  doi          = {10.1002/aidi.202500006},
  url          = {https://doi.org/10.1002/aidi.202500006},
  note         = {Research Article, Open Access},
  published    = {2025-06-09}
}

Downloads last month: 126

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for lamm-mit/Graph-Preflexor-8b_12292025

Quantizations

2 models