⸻

APJ Threat Intelligence System

Architecture & Operational Design

1. Purpose

The system is a multilingual, culturally aware threat-intelligence platform. It interprets Mandarin and Cantonese cybercrime chatter and transforms it into structured intelligence for defensive operations focused on the APJ (Asia-Pacific and Japan) region.

⸻

2. Major Components

2.1 Ingest Layer
	•	Marketplace scrapers (read-only)
	•	Telegram/Discord crawlers
	•	Domain & WHOIS monitors
	•	File upload entry via Gradio

2.2 Language Layer
	•	Dialect detection (Mandarin vs Cantonese)
	•	Idiom interpreter
	•	Slang lexicon with auto-expansion
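
The first and third items of the language layer can be illustrated with a minimal sketch. It assumes a naive character-marker heuristic for dialect detection and a small hand-built lexicon; the marker set, the function names, and the lexicon itself are hypothetical, and a production system would more plausibly rely on a trained language-identification model.

```python
# Characters that are common in written Cantonese but rare in Standard Mandarin.
# Illustrative subset only; not an exhaustive or authoritative marker list.
CANTONESE_MARKERS = set("嘅咗唔喺嚟冇佢哋咁乜")

# Hypothetical seed lexicon; the two terms come from the Message Object example.
SLANG_LEXICON = {"飛數", "黑料"}


def detect_dialect(text: str) -> str:
    """Return "zh-yue" if any Cantonese marker character appears, else "zh-CN"."""
    return "zh-yue" if any(ch in CANTONESE_MARKERS for ch in text) else "zh-CN"


def find_slang(text: str, lexicon: set[str] = SLANG_LEXICON) -> list[str]:
    """Return lexicon terms found in the text, ordered by first appearance."""
    hits = [(text.find(term), term) for term in lexicon if term in text]
    return [term for _, term in sorted(hits)]
```

The heuristic defaults to "zh-CN" whenever no marker is present, so short or mixed-script messages may be mislabelled; that trade-off is acceptable only for a sketch.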

2.3 Intelligence Layer
	•	Threat classification model (Transformers)
	•	Vendor graph builder
	•	Trend engine & anomaly detector
	•	Reputational scoring
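
As one possible reading of the anomaly detector, the sketch below flags a spike in daily mention counts using a z-score against recent history. The function name, the threshold of 3.0, and the choice of a z-score are assumptions made for illustration, not the system's documented method.

```python
from statistics import mean, pstdev


def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's count if it sits more than `threshold` standard
    deviations above the mean of the historical counts."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return today > mu  # flat baseline: any increase stands out
    return (today - mu) / sigma > threshold
```

For example, `is_anomalous([4, 5, 6, 5, 4, 6, 5], 20)` flags the jump, while a count of 6 against the same history does not.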

2.4 Operator Interface
	•	Mobile-first Gradio chat console
	•	Mode switcher:
		•	Threat Intel
		•	Translation
		•	Marketplace Watch
		•	Analyst Tools

⸻

3. Data Structures

Message Object

{
  "raw_text": "...",
  "language": "zh-yue",
  "slang": ["飛數", "黑料"],
  "intent": "selling_stolen_data",
  "risk_score": 4
}
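
The Message Object could be typed roughly as follows. This sketch uses the standard-library `dataclasses` module so that it runs without dependencies; a pydantic model would be the closer fit to the stated requirements. The default values, the non-negative check on `risk_score`, and the helper names are assumptions, since the source does not define the scoring scale.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Message:
    raw_text: str
    language: str
    slang: list[str] = field(default_factory=list)
    intent: str = "unknown"   # assumed default, not specified in the source
    risk_score: int = 0       # assumed default; the scale is not specified

    def __post_init__(self) -> None:
        # Assumption: scores are never negative.
        if self.risk_score < 0:
            raise ValueError("risk_score must be non-negative")


def message_from_json(payload: str) -> Message:
    """Parse a JSON object with the Message Object keys into a Message."""
    return Message(**json.loads(payload))


def message_to_json(message: Message) -> str:
    """Serialize a Message, keeping Chinese characters unescaped."""
    return json.dumps(asdict(message), ensure_ascii=False)
```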

Vendor Node

{
  "handle": "darkcat99",
  "languages": ["zh-CN"],
  "reputation": 0.74,
  "products": ["phishing-kit", "RAT"],
  "last_seen": "2025-01-04"
}
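
A vendor graph could be assembled from such nodes along these lines. The rule used here, linking two vendors whenever they list a product in common, is an assumption chosen for illustration; the source does not say how edges are defined.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Vendor:
    handle: str
    languages: tuple[str, ...]
    reputation: float
    products: tuple[str, ...]
    last_seen: str  # ISO 8601 date, e.g. "2025-01-04"


def build_vendor_graph(vendors: list[Vendor]) -> dict[str, set[str]]:
    """Link two vendors whenever they list at least one product in common."""
    by_product: dict[str, set[str]] = defaultdict(set)
    for vendor in vendors:
        for product in vendor.products:
            by_product[product].add(vendor.handle)

    # Every vendor gets an entry, even if it ends up with no neighbours.
    graph: dict[str, set[str]] = {vendor.handle: set() for vendor in vendors}
    for handles in by_product.values():
        for a, b in combinations(sorted(handles), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph
```

The result is a plain adjacency mapping from handle to neighbouring handles, which is enough to feed a graph library later if one is adopted.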

⸻

4. Pipeline Flow

Source → Ingest → Language Engine → Threat Classifier → 
Vendor Graph → Analysis → UI
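
The flow above can be sketched as a chain of functions that each take a record and return an enriched copy. The stage bodies here are placeholders that only record that they ran; the stage names mirror the diagram, while `make_stage`, `run_pipeline`, and the `trace` field are hypothetical.

```python
from functools import reduce
from typing import Any, Callable

Record = dict[str, Any]
Stage = Callable[[Record], Record]


def make_stage(name: str) -> Stage:
    """Build a placeholder stage that only appends its name to the trace."""
    def stage(record: Record) -> Record:
        return {**record, "trace": [*record.get("trace", []), name]}
    return stage


# Order follows the diagram; the UI consumes the final record.
PIPELINE: list[Stage] = [
    make_stage(name)
    for name in ("ingest", "language_engine", "threat_classifier",
                 "vendor_graph", "analysis")
]


def run_pipeline(record: Record, stages: list[Stage] = PIPELINE) -> Record:
    """Pass the record through each stage in order and return the result."""
    return reduce(lambda acc, stage: stage(acc), stages, record)
```

Because each stage returns a new dictionary rather than mutating its input, individual stages can be tested or swapped in isolation.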

⸻

5. Requirements
	•	Python 3.10+
	•	gradio
	•	transformers
	•	datasets
	•	pydantic