GLM-4.7-Derestricted-V3

This is a mildly derestricted version of GLM-4.7, created using norm-preserving biprojected abliteration (based on Jim Lai’s technique).

It is not uncensored, but it is significantly more steerable than stock GLM-4.7 and generates less "over-aligned" prose—making it suitable for creative control vector research.
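As a rough illustration of the underlying idea (not Jim Lai's actual implementation, whose details are not reproduced here), abliteration projects an estimated "refusal direction" out of selected weight matrices; the norm-preserving variant then rescales the rows back to their original norms to limit capability damage. All names below are hypothetical, and the "biprojected" step (projecting on both the output and input sides) is my reading of the term:

```python
import numpy as np

def project_out(W, r):
    """Remove the component along unit vector r from the output side:
    W' = (I - r r^T) W, so W' @ x has no component along r."""
    r = r / np.linalg.norm(r)
    return W - np.outer(r, r) @ W

def norm_preserving_abliterate(W, r_out, r_in):
    """Hypothetical sketch of 'biprojected' abliteration: project the
    refusal direction out of both the output and input sides of W,
    then rescale each row to its original norm. The renorm step
    reintroduces a small residual along r_out, which is why this
    reduces rather than eliminates the direction."""
    orig = np.linalg.norm(W, axis=1, keepdims=True)
    r_in = r_in / np.linalg.norm(r_in)
    Wp = project_out(W, r_out)            # output-side projection
    Wp = Wp - Wp @ np.outer(r_in, r_in)   # input-side projection
    new = np.linalg.norm(Wp, axis=1, keepdims=True)
    return Wp * (orig / np.maximum(new, 1e-12))
```

The row-norm rescaling is the trade-off mentioned under Limitations below: it preserves weight magnitudes at the cost of leaving a small residual refusal component behind.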


Why I Made This

I originally tried to train Compassion_vs_Sadism control vectors on GLM-4.7, but the model’s early, rigid refusals kept interfering—effectively turning the axis into "Compliance vs. Refusal" instead of a moral trait. The refusal signal in GLM-4.7 peaks late and is overly dominant, drowning out subtler behavioral directions.

After applying targeted abliteration, the model behaves much more like GLM-4.6 or Kimi-K2-Instruct:

  • Control vector responses now show meaningful variation in tone and intent
  • The Compassion_vs_Sadism axis peaks in mid-layers, as expected
  • Refusals no longer hijack the latent direction
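For context, the control vectors here follow the standard difference-of-means recipe: collect hidden states on contrastive prompt pairs and subtract the class means at each layer. A minimal sketch (synthetic arrays stand in for real activations; names are illustrative):

```python
import numpy as np

def difference_of_means(pos_acts, neg_acts):
    """Per-layer control vector: mean activation on 'compassion' prompts
    minus mean activation on 'sadism' prompts, normalized to unit length.
    pos_acts / neg_acts: (n_prompts, d_model) hidden states at one layer.
    Note: if one class triggers refusals, the refusal signal leaks into
    this difference and the axis silently becomes compliance-vs-refusal."""
    v = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return v / np.linalg.norm(v)
```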

Best of all: control vectors trained on this derestricted model also work on stock GLM-4.7, suggesting the intervention removed noise without breaking core alignment.



Compassion_vs_Sadism control vector activation by layer (stock GLM-4.7)
Notice the abnormal early-layer spike (Layer 32): this is refusal interference.


Compassion_vs_Sadism control vector activation by layer (Derestricted GLM-4.7)
Peak activation now occurs in mid-layers (~Layer 40–50), as expected for a behavioral trait.

These mid-layer peaks closely resemble the activation profiles of GLM-4.6 and Kimi-K2-Instruct.
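The per-layer profiles above can be reproduced by projecting each layer's hidden states onto that layer's control vector and averaging the absolute projection; the argmax gives the peak layer. A sketch with synthetic activations (layer 3 has an injected component, standing in for the real mid-layer peak):

```python
import numpy as np

def layer_activation_profile(hidden_by_layer, vec_by_layer):
    """Mean absolute projection of hidden states onto the control
    vector at each layer. The argmax of the returned profile is the
    'peak' layer for the trait."""
    profile = []
    for h, v in zip(hidden_by_layer, vec_by_layer):
        v = v / np.linalg.norm(v)
        profile.append(np.abs(h @ v).mean())
    return np.array(profile)
```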


Use Case

This model is intended only for research—specifically:

  • Training and probing behavioral control vectors (e.g., Dark Tetrad traits)
  • Studying the interaction between refusal circuits and steering directions
  • Comparing alignment architectures across models (GLM-4.6 vs 4.7 vs Kimi-K2)
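For probing, applying a trained vector is just an addition to the residual stream at the chosen layers (in practice installed via a forward hook); a toy sketch of the arithmetic, with illustrative names:

```python
import numpy as np

def steer(hidden, vec, alpha):
    """Shift hidden states along the (unit-normalized) control vector.
    Positive alpha pushes toward one pole of the trait (e.g. compassion),
    negative alpha toward the other (e.g. sadism).
    hidden: (n_tokens, d_model) residual-stream states at one layer."""
    v = vec / np.linalg.norm(vec)
    return hidden + alpha * v
```

Because the projection of every token shifts by exactly alpha, sweeping alpha gives a controlled dose-response curve for the trait.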

Limitations & Warnings

  • Not a general-purpose chat model—it may underperform on tool use, factual QA, or safety-critical tasks.
  • Not fully uncensored—it still refuses clearly harmful requests; it just doesn’t over-refuse creative or ambiguous ones.
  • ⚠️ Norm preservation reduces—but doesn’t eliminate—capability degradation. Stick to text generation tasks.
  • 🔬 Do not deploy in production. This is a research artifact.

I’m also preparing a Kimi-K2-Thinking-Derestricted version using the same approach, since that model showed a similar (though later-stage) refusal-interference problem when I tried to train control vectors on it.

Model details

  • Format: GGUF
  • Size: 358B params
  • Architecture: glm4moe

Model tree

  • Base model: zai-org/GLM-4.7
  • This quant: gghfez/GLM-4.7-Derestricted-v3-Q3KL-GGUF