ghostai1 committed
Commit b8bcb6d · verified · 1 Parent(s): 5e2c5d8

Upload 6 files

public/example_page.md ADDED
@@ -0,0 +1,22 @@
+ <!-- docs/example_page.md -->
+ # GhostAI Music Generator - Quick Links
+
+ - **MusicGen Large (Meta):** https://huggingface.co/facebook/musicgen-large
+ - **GhostAI assets & scripts:**
+   - Repo hub: https://huggingface.co/ghostai1/GHOSTSONAFB
+   - Stable 12GB build (example): https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/STABLE12gb3060.py
+   - 30s large script: https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/stable12gblg30sec.py
+
+ ## Notes
+ - GPU: CUDA-capable, 12GB+ VRAM recommended.
+ - The app exposes an API on `:8555`:
+   - `GET /genres` - list available presets from `prompts.ini`
+   - `GET /prompt/{name}` - generate a prompt string (query params: `bpm`, `drum_beat`, `synthesizer`, `rhythmic_steps`, `bass_style`, `guitar_style`)
+   - **Aliases from INI** (examples):
+     - `/set_classic_rock_prompt` → Metallica
+     - `/set_nirvana_grunge_prompt` → Nirvana
+     - `/set_pearl_jam_grunge_prompt` → Pearl Jam
+     - `/set_soundgarden_grunge_prompt` → Soundgarden
+     - `/set_foo_fighters_prompt` → Foo Fighters
+     - `/set_star_wars_prompt` → Cinematic Star Wars-style orchestral
+   - `POST /render` - render an MP3. Body includes `instrumental_prompt` and optional overrides (duration, temperature, etc.); see the example calls below.
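+
+ ### Example calls
+
+ A minimal client sketch (assumptions: the server from `publicapi.py` is reachable at `http://localhost:8555` and the `requests` package is installed):
+
+ ```python
+ import requests
+
+ BASE = "http://localhost:8555"  # assumed host/port (see publicapi.py)
+
+ # List genre presets loaded from prompts.ini
+ print(requests.get(f"{BASE}/genres").json())
+
+ # Build a prompt string for a preset, pinning the tempo
+ prompt = requests.get(f"{BASE}/prompt/metallica", params={"bpm": 120}).json()["prompt"]
+
+ # Render an MP3; optional overrides ride along in the JSON body.
+ # Note: the call blocks until rendering finishes.
+ job = requests.post(f"{BASE}/render", json={
+     "instrumental_prompt": prompt,
+     "total_duration": 60,   # seconds
+     "temperature": 0.9,
+ }).json()
+ print(job["path"], job["status"])
+ ```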
public/prompts.ini ADDED
@@ -0,0 +1,66 @@
+ # prompts.ini
+ # Centralized prompt knobs for buttons + API aliases
+ # Add/adjust sections; the app auto-loads buttons and endpoints.
+
+ [metallica]
+ bpm_min=90
+ bpm_max=140
+ drum_beat=standard rock,techno kick
+ synthesizer=none
+ rhythmic_steps=steady steps,complex steps
+ bass_style=deep bass,melodic bass
+ guitar_style=distorted
+ api_name=/set_classic_rock_prompt
+
+ [nirvana]
+ bpm_min=100
+ bpm_max=130
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=steady steps
+ bass_style=deep bass
+ guitar_style=distorted,clean
+ api_name=/set_nirvana_grunge_prompt
+
+ [pearl_jam]
+ bpm_min=100
+ bpm_max=140
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=steady steps,syncopated steps
+ bass_style=melodic bass
+ guitar_style=clean,distorted
+ api_name=/set_pearl_jam_grunge_prompt
+
+ [soundgarden]
+ bpm_min=90
+ bpm_max=130
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=complex steps
+ bass_style=deep bass
+ guitar_style=distorted
+ api_name=/set_soundgarden_grunge_prompt
+
+ [foo_fighters]
+ bpm_min=110
+ bpm_max=150
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=steady steps
+ bass_style=melodic bass
+ guitar_style=distorted,clean
+ api_name=/set_foo_fighters_prompt
+
+ # New: Cinematic / Star Wars-inspired classical
+ # Optional 'styles' enhances descriptive tags for orchestral color.
+ [star_wars_classical]
+ bpm_min=84
+ bpm_max=126
+ drum_beat=orchestral percussion
+ synthesizer=none
+ rhythmic_steps=steady steps,complex steps
+ bass_style=contrabass ostinato
+ guitar_style=none
+ styles=heroic brass,sweeping strings,soaring horns,timpani rolls,choir pads
+ api_name=/set_star_wars_prompt
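+
+ # Template for adding a new preset (hypothetical section; values are illustrative).
+ # Each key mirrors the fields the app reads via configparser; 'api_name' (optional)
+ # registers a GET alias endpoint, and 'styles' (optional) adds descriptive tags.
+ # [my_new_genre]
+ # bpm_min=100
+ # bpm_max=130
+ # drum_beat=standard rock
+ # synthesizer=none
+ # rhythmic_steps=steady steps
+ # bass_style=deep bass
+ # guitar_style=clean
+ # api_name=/set_my_new_genre_prompt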
public/publicapi.py ADDED
@@ -0,0 +1,988 @@
+ #!/usr/bin/env python3
+ # -*- coding: utf-8 -*-
+ # app.py
+
+ import os
+ import sys
+ import gc
+ import re
+ import json
+ import time
+ import math
+ import mmap
+ import torch
+ import random
+ import logging
+ import warnings
+ import traceback
+ import subprocess
+ import tempfile
+ import numpy as np
+ import torchaudio
+ import gradio as gr
+ import gradio_client.utils
+ import configparser
+ from pydub import AudioSegment
+ from datetime import datetime
+ from pathlib import Path
+ from typing import Optional, Tuple, Dict, Any, List
+ from torch.cuda.amp import autocast
+
+ from fastapi import FastAPI, HTTPException, Query
+ from fastapi.middleware.cors import CORSMiddleware
+ from pydantic import BaseModel
+ import uvicorn
+ import threading
+ from logging.handlers import RotatingFileHandler
+
+ # ======================================================================================
+ # RUNTIME, LOGGING, PATCHES
+ # ======================================================================================
+
+ _original_get_type = gradio_client.utils.get_type
+ def _patched_get_type(schema):
+     if isinstance(schema, bool):
+         return "boolean"
+     return _original_get_type(schema)
+ gradio_client.utils.get_type = _patched_get_type
+
+ warnings.filterwarnings("ignore")
+ os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
+ torch.backends.cudnn.benchmark = False
+ torch.backends.cudnn.deterministic = True
+
+ LOG_DIR = "logs"
+ MP3_DIR = "mp3"
+ os.makedirs(LOG_DIR, exist_ok=True)
+ os.makedirs(MP3_DIR, exist_ok=True)
+
+ LOG_FILE = os.path.join(LOG_DIR, "ghostai_musicgen.log")
+ logger = logging.getLogger("ghostai-musicgen")
+ logger.setLevel(logging.DEBUG)
+ logger.handlers = []  # prevent duplicate handlers on hot-reload
+
+ file_handler = RotatingFileHandler(
+     LOG_FILE,
+     maxBytes=5 * 1024 * 1024,  # 5 MB cap
+     backupCount=0,             # single file only; truncate on rollover
+     encoding="utf-8",
+     delay=True
+ )
+ file_handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
+ stdout_handler = logging.StreamHandler(sys.stdout)
+ stdout_handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
+ logger.addHandler(file_handler)
+ logger.addHandler(stdout_handler)
+
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+ if DEVICE != "cuda":
+     logger.error("CUDA GPU is required. Exiting.")
+     sys.exit(1)
+ logger.info(f"GPU: {torch.cuda.get_device_name(0)}")
+
+ # ======================================================================================
+ # SETTINGS PERSISTENCE
+ # ======================================================================================
+
+ SETTINGS_FILE = "settings.json"
+ PROMPTS_INI = "prompts.ini"
+ STYLES_CSS = "styles.css"
+
+ DEFAULT_SETTINGS: Dict[str, Any] = {
+     "cfg_scale": 5.8,
+     "top_k": 250,
+     "top_p": 0.95,
+     "temperature": 0.90,
+     "total_duration": 60,
+     "bpm": 120,
+     "drum_beat": "none",
+     "synthesizer": "none",
+     "rhythmic_steps": "none",
+     "bass_style": "none",
+     "guitar_style": "none",
+     "target_volume": -23.0,
+     "preset": "default",
+     "max_steps": 1500,
+     "bitrate": "192k",
+     "output_sample_rate": "48000",
+     "bit_depth": "16",
+     "instrumental_prompt": ""
+ }
+
+ def load_settings() -> Dict[str, Any]:
+     try:
+         if os.path.exists(SETTINGS_FILE):
+             with open(SETTINGS_FILE, "r") as f:
+                 data = json.load(f)
+             for k, v in DEFAULT_SETTINGS.items():
+                 data.setdefault(k, v)
+             logger.info(f"Loaded settings from {SETTINGS_FILE}")
+             return data
+     except Exception as e:
+         logger.error(f"Settings load failed: {e}")
+     return DEFAULT_SETTINGS.copy()
+
+ def save_settings(s: Dict[str, Any]) -> None:
+     try:
+         with open(SETTINGS_FILE, "w") as f:
+             json.dump(s, f, indent=2)
+         logger.info(f"Saved settings to {SETTINGS_FILE}")
+     except Exception as e:
+         logger.error(f"Settings save failed: {e}")
+
+ SETTINGS = load_settings()
+
+ # ======================================================================================
+ # PROMPT CONFIG (prompts.ini)
+ # ======================================================================================
+
+ def _csv_list(s: str) -> List[str]:
+     if not s or s.strip().lower() == "none":
+         return []
+     return [x.strip() for x in s.split(",") if x.strip()]
+
+ PROMPT_CFG = configparser.ConfigParser()
+ if not os.path.exists(PROMPTS_INI):
+     PROMPT_CFG["metallica"] = {
+         "bpm_min": "90", "bpm_max": "140",
+         "drum_beat": "standard rock,techno kick",
+         "synthesizer": "none",
+         "rhythmic_steps": "steady steps,complex steps",
+         "bass_style": "deep bass,melodic bass",
+         "guitar_style": "distorted",
+         "api_name": "/set_classic_rock_prompt"
+     }
+     with open(PROMPTS_INI, "w") as f:
+         PROMPT_CFG.write(f)
+ PROMPT_CFG.read(PROMPTS_INI)
+
+ def list_genres() -> List[str]:
+     return PROMPT_CFG.sections()
+
+ def get_api_aliases() -> Dict[str, str]:
+     out = {}
+     for sec in PROMPT_CFG.sections():
+         api_name = PROMPT_CFG.get(sec, "api_name", fallback="").strip()
+         if api_name:
+             out[api_name] = sec
+     return out
+
+ def _humanize(name: str) -> str:
+     return name.replace("_", " ").title()
+
+ def build_prompt_from_section(
+     section: str,
+     bpm: Optional[int] = None,
+     drum_beat: Optional[str] = None,
+     synthesizer: Optional[str] = None,
+     rhythmic_steps: Optional[str] = None,
+     bass_style: Optional[str] = None,
+     guitar_style: Optional[str] = None
+ ) -> str:
+     if section not in PROMPT_CFG:
+         return "Instrumental track at 120 BPM."
+     cfg = PROMPT_CFG[section]
+     bpm_min = cfg.getint("bpm_min", fallback=100)
+     bpm_max = cfg.getint("bpm_max", fallback=130)
+     bpm = bpm if bpm else random.randint(bpm_min, bpm_max)
+     bpm = max(bpm_min, min(bpm_max, bpm))
+
+     def pick(value: Optional[str], pool_key: str) -> str:
+         pool = _csv_list(cfg.get(pool_key, fallback=""))
+         if not pool:
+             return "" if (not value or value == "none") else f", {value}"
+         if (not value) or value == "none" or value not in pool:
+             choice = random.choice(pool)
+             return "" if choice == "none" else f", {choice}"
+         return f", {value}"
+
+     drum = pick(drum_beat, "drum_beat")
+     synth = pick(synthesizer, "synthesizer")
+     steps = pick(rhythmic_steps, "rhythmic_steps")
+     bass = pick(bass_style, "bass_style")
+     guitar = pick(guitar_style, "guitar_style")
+
+     styles_csv = cfg.get("styles", fallback="").strip()
+     styles_str = ""
+     if styles_csv:
+         styles = _csv_list(styles_csv)
+         if styles:
+             styles_str = ", " + ", ".join(styles)
+
+     label = _humanize(section)
+     if "star_wars" in section or "classical" in section:
+         return (
+             f"Cinematic orchestral score{styles_str}{drum}{synth}{steps}{bass}{guitar}, "
+             f"space-opera energy, sweeping strings, heroic brass, bold timpani at {bpm} BPM."
+         )
+     return (
+         f"Instrumental {label}{guitar}{bass}{drum}{synth}{steps} at {bpm} BPM, "
+         f"dynamic sections (intro/verse/chorus), cohesive song flow."
+     )
+
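+ # Example (illustrative, not from a live run): for the default [metallica] section,
+ # build_prompt_from_section("metallica", bpm=120) can yield something like
+ #   "Instrumental Metallica, distorted, deep bass, standard rock, steady steps
+ #    at 120 BPM, dynamic sections (intro/verse/chorus), cohesive song flow."
+ # Fields not passed in fall back to a random pick from the section's pool.
+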
+ # ======================================================================================
+ # VRAM / DISK / CLEANUP
+ # ======================================================================================
+
+ def clean_memory() -> Optional[float]:
+     try:
+         torch.cuda.empty_cache()
+         gc.collect()
+         torch.cuda.ipc_collect()
+         torch.cuda.synchronize()
+         return torch.cuda.memory_allocated() / 1024**2
+     except Exception as e:
+         logger.error(f"clean_memory failed: {e}")
+         return None
+
+ def check_vram():
+     try:
+         r = subprocess.run(
+             ['nvidia-smi', '--query-gpu=memory.used,memory.total', '--format=csv'],
+             capture_output=True, text=True
+         )
+         lines = r.stdout.splitlines()
+         if len(lines) > 1:
+             used_mb, total_mb = map(int, re.findall(r'\d+', lines[1]))
+             free_mb = total_mb - used_mb
+             logger.info(f"VRAM: used {used_mb} MiB | free {free_mb} MiB | total {total_mb} MiB")
+             return free_mb
+     except Exception as e:
+         logger.error(f"check_vram failed: {e}")
+     return None
+
+ def check_disk_space(path=".") -> bool:
+     try:
+         stat = os.statvfs(path)
+         free_gb = stat.f_bavail * stat.f_frsize / (1024**3)
+         if free_gb < 1.0:
+             logger.warning(f"Low disk space: {free_gb:.2f} GB")
+         return free_gb >= 1.0
+     except Exception as e:
+         logger.error(f"Disk space check failed: {e}")
+         return False
+
+ # ======================================================================================
+ # MODEL LOAD
+ # ======================================================================================
+
+ try:
+     from audiocraft.models import MusicGen
+ except Exception as e:
+     logger.error("audiocraft is required. pip install audiocraft")
+     raise
+
+ def load_model():
+     free_vram = check_vram()
+     if free_vram is not None and free_vram < 5000:
+         logger.warning("Low free VRAM; consider closing other GPU apps.")
+     clean_memory()
+     local_model_path = "./models/musicgen-large"
+     if not os.path.exists(local_model_path):
+         logger.error(f"Missing weights at {local_model_path}")
+         sys.exit(1)
+     logger.info("Loading MusicGen (large)...")
+     with autocast(dtype=torch.float16):
+         model = MusicGen.get_pretrained(local_model_path, device=DEVICE)
+     model.set_generation_params(duration=30, two_step_cfg=False)
+     logger.info("MusicGen loaded.")
+     return model
+
+ musicgen_model = load_model()
+
+ # ======================================================================================
+ # AUDIO DSP
+ # ======================================================================================
+
+ def ensure_stereo(seg: AudioSegment, sample_rate=48000, sample_width=2) -> AudioSegment:
+     try:
+         if seg.channels != 2:
+             seg = seg.set_channels(2)
+         if seg.frame_rate != sample_rate:
+             seg = seg.set_frame_rate(sample_rate)
+         return seg
+     except Exception:
+         return seg
+
+ def calculate_rms(seg: AudioSegment) -> float:
+     try:
+         samples = np.array(seg.get_array_of_samples(), dtype=np.float32)
+         return float(np.sqrt(np.mean(samples**2)))
+     except Exception:
+         return 0.0
+
+ def hard_limit(seg: AudioSegment, limit_db=-3.0, sample_rate=48000) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, sample_rate, seg.sample_width)
+         limit = 10 ** (limit_db / 20.0) * (2**23 if seg.sample_width == 3 else 32767)
+         x = np.array(seg.get_array_of_samples(), dtype=np.float32)
+         x = np.clip(x, -limit, limit).astype(np.int32 if seg.sample_width == 3 else np.int16)
+         if len(x) % 2 != 0:
+             x = x[:-1]
+         return AudioSegment(x.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
+     except Exception:
+         return seg
+
+ def rms_normalize(seg: AudioSegment, target_rms_db=-23.0, peak_limit_db=-3.0, sample_rate=48000) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, sample_rate, seg.sample_width)
+         target = 10 ** (target_rms_db / 20) * (2**23 if seg.sample_width == 3 else 32767)
+         current = calculate_rms(seg)
+         if current > 0:
+             gain = target / current
+             seg = seg.apply_gain(20 * np.log10(max(gain, 1e-6)))
+         seg = hard_limit(seg, limit_db=peak_limit_db, sample_rate=sample_rate)
+         return seg
+     except Exception:
+         return seg
+
+ def balance_stereo(seg: AudioSegment, noise_threshold=-40, sample_rate=48000) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, sample_rate, seg.sample_width)
+         x = np.array(seg.get_array_of_samples(), dtype=np.float32)
+         stereo = x.reshape(-1, 2)
+         db = 20 * np.log10(np.abs(stereo) + 1e-10)
+         mask = db > noise_threshold
+         stereo = stereo * mask
+         L, R = stereo[:, 0], stereo[:, 1]
+         l_rms = np.sqrt(np.mean(L[L != 0] ** 2)) if np.any(L != 0) else 0
+         r_rms = np.sqrt(np.mean(R[R != 0] ** 2)) if np.any(R != 0) else 0
+         if l_rms > 0 and r_rms > 0:
+             avg = (l_rms + r_rms) / 2
+             stereo[:, 0] *= (avg / l_rms)
+             stereo[:, 1] *= (avg / r_rms)
+         out = stereo.flatten().astype(np.int32 if seg.sample_width == 3 else np.int16)
+         if len(out) % 2 != 0:
+             out = out[:-1]
+         return AudioSegment(out.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
+     except Exception:
+         return seg
+
+ def apply_noise_gate(seg: AudioSegment, threshold_db=-80, sample_rate=48000) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, sample_rate, seg.sample_width)
+         x = np.array(seg.get_array_of_samples(), dtype=np.float32)
+         stereo = x.reshape(-1, 2)
+         for _ in range(2):
+             db = 20 * np.log10(np.abs(stereo) + 1e-10)
+             mask = db > threshold_db
+             stereo = stereo * mask
+         out = stereo.flatten().astype(np.int32 if seg.sample_width == 3 else np.int16)
+         if len(out) % 2 != 0:
+             out = out[:-1]
+         return AudioSegment(out.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
+     except Exception:
+         return seg
+
+ def apply_eq(seg: AudioSegment, sample_rate=48000) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, sample_rate, seg.sample_width)
+         seg = seg.high_pass_filter(20).low_pass_filter(8000)
+         seg = seg - 3
+         seg = seg - 3
+         seg = seg - 10
+         return seg
+     except Exception:
+         return seg
+
+ def apply_fade(seg: AudioSegment, fade_in_ms=500, fade_out_ms=800) -> AudioSegment:
+     try:
+         seg = ensure_stereo(seg, seg.frame_rate, seg.sample_width)
+         return seg.fade_in(fade_in_ms).fade_out(fade_out_ms)
+     except Exception:
+         return seg
+
+ def _export_tensor_to_segment(audio: torch.Tensor, sr: int, bit_depth: int) -> Optional[AudioSegment]:
+     tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
+     tmp_path = tmp.name
+     tmp.close()
+     try:
+         torchaudio.save(tmp_path, audio, sr, bits_per_sample=bit_depth)
+         with open(tmp_path, "rb") as f:
+             mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
+             seg = AudioSegment.from_wav(tmp_path)
+             mm.close()
+         return seg
+     except Exception as e:
+         logger.error(f"export tensor -> segment failed: {e}")
+         return None
+     finally:
+         try:
+             if os.path.exists(tmp_path): os.unlink(tmp_path)
+         except OSError:
+             pass
+
+ def _crossfade(seg_a: AudioSegment, seg_b: AudioSegment, overlap_ms: int, sr: int, bit_depth: int) -> AudioSegment:
+     try:
+         seg_a = ensure_stereo(seg_a, sr, seg_a.sample_width)
+         seg_b = ensure_stereo(seg_b, sr, seg_b.sample_width)
+         if overlap_ms <= 0 or len(seg_a) < overlap_ms or len(seg_b) < overlap_ms:
+             return seg_a + seg_b
+
+         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as a_wav, \
+              tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as b_wav, \
+              tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as cf_wav:
+             a_path = a_wav.name
+             b_path = b_wav.name
+             cf_path = cf_wav.name
+
+         seg_a[-overlap_ms:].export(a_path, format="wav")
+         seg_b[:overlap_ms].export(b_path, format="wav")
+         a, sr_a = torchaudio.load(a_path)
+         b, sr_b = torchaudio.load(b_path)
+         if sr_a != sr:
+             a = torchaudio.functional.resample(a, sr_a, sr, lowpass_filter_width=64)
+         if sr_b != sr:
+             b = torchaudio.functional.resample(b, sr_b, sr, lowpass_filter_width=64)
+         n = min(a.shape[1], b.shape[1])
+         n = n - (n % 2)
+         if n <= 0:
+             for p in (a_path, b_path, cf_path):
+                 try:
+                     if os.path.exists(p): os.unlink(p)
+                 except OSError:
+                     pass
+             return seg_a + seg_b
+         aw = a[:, :n].to(torch.float32)
+         bw = b[:, :n].to(torch.float32)
+         # Complementary linear ramps: A's tail fades out while B's head fades in.
+         # (The earlier Hann-based blend used hann.flip(0), but a symmetric Hann
+         # window flipped equals itself, so both signals dipped to silence at the
+         # edges of the overlap instead of crossfading.)
+         ramp = torch.linspace(0.0, 1.0, n)
+         out = (aw * (1.0 - ramp) + bw * ramp).clamp(-1.0, 1.0)
+         scale = (2**23 if bit_depth == 24 else 32767)
+         out_i = (out * scale).to(torch.int32 if bit_depth == 24 else torch.int16)
+         torchaudio.save(cf_path, out_i, sr, bits_per_sample=bit_depth)
+         blended = AudioSegment.from_wav(cf_path)
+         res = seg_a[:-overlap_ms] + blended + seg_b[overlap_ms:]
+         for p in (a_path, b_path, cf_path):
+             try:
+                 if os.path.exists(p): os.unlink(p)
+             except OSError:
+                 pass
+         return res
+     except Exception as e:
+         logger.error(f"crossfade failed: {e}")
+         return seg_a + seg_b
+
+ # ======================================================================================
+ # GENERATION (30s chunks -> seamless)
+ # ======================================================================================
+
+ def generate_music(
+     instrumental_prompt: str,
+     cfg_scale: float,
+     top_k: int,
+     top_p: float,
+     temperature: float,
+     total_duration: int,
+     bpm: int,
+     drum_beat: str,
+     synthesizer: str,
+     rhythmic_steps: str,
+     bass_style: str,
+     guitar_style: str,
+     target_volume: float,
+     preset: str,
+     max_steps_ignored: str,
+     vram_status_text: str,
+     bitrate: str,
+     output_sample_rate: str,
+     bit_depth: str
+ ) -> Tuple[Optional[str], str, str]:
+     if not instrumental_prompt or not instrumental_prompt.strip():
+         return None, "⚠️ Enter a valid prompt.", vram_status_text
+
+     try:
+         out_sr = int(output_sample_rate)
+         bit_depth_int = int(bit_depth)
+         sample_width = 3 if bit_depth_int == 24 else 2
+     except Exception:
+         return None, "❌ Invalid output SR or bit depth.", vram_status_text
+
+     if not check_disk_space("."):
+         return None, "⚠️ Low disk space (<1GB).", vram_status_text
+
+     CHUNK = 30
+     total_duration = max(30, min(int(total_duration), 180))
+     chunks = math.ceil(total_duration / CHUNK)
+     PROCESS_SR = 48000
+     OVERLAP = 0.20
+
+     musicgen_model.set_generation_params(
+         duration=CHUNK,
+         use_sampling=True,
+         top_k=int(top_k),
+         top_p=float(top_p),
+         temperature=float(temperature),
+         cfg_coef=float(cfg_scale),
+         two_step_cfg=False,
+     )
+
+     vram_status_text = f"Start VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
+     segments: List[AudioSegment] = []
+
+     seed = random.randint(0, 2**31 - 1)
+     random.seed(seed); np.random.seed(seed)
+     torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
+
+     for i in range(chunks):
+         part = i + 1
+         dur = CHUNK if (i < chunks - 1) else (total_duration - CHUNK * (chunks - 1) or CHUNK)
+         logger.info(f"Generating chunk {part}/{chunks} ({dur}s)")
+         chunk_prompt = instrumental_prompt
+
+         try:
+             with torch.no_grad():
+                 with autocast(dtype=torch.float16):
+                     clean_memory()
+                     if i == 0:
+                         audio = musicgen_model.generate([chunk_prompt], progress=True)[0].cpu()
+                     else:
+                         prev = segments[-1]
+                         prev = apply_noise_gate(prev, -80, PROCESS_SR)
+                         prev = balance_stereo(prev, -40, PROCESS_SR)
+                         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tprev:
+                             prev_path = tprev.name
+                         prev.export(prev_path, format="wav")
+                         tail, sr_prev = torchaudio.load(prev_path)
+                         if sr_prev != PROCESS_SR:
+                             tail = torchaudio.functional.resample(tail, sr_prev, PROCESS_SR, lowpass_filter_width=64)
+                         if tail.shape[0] != 2:
+                             tail = tail.repeat(2, 1)[:, :tail.shape[1]]
+                         try:
+                             os.unlink(prev_path)
+                         except OSError:
+                             pass
+                         tail = tail.to(DEVICE)[:, -int(PROCESS_SR * OVERLAP):]
+                         audio = musicgen_model.generate_continuation(
+                             prompt=tail,
+                             prompt_sample_rate=PROCESS_SR,
+                             descriptions=[chunk_prompt],
+                             progress=True
+                         )[0].cpu()
+             clean_memory()
+         except Exception as e:
+             logger.error(f"Chunk {part} generation failed: {e}")
+             return None, f"❌ Failed to generate chunk {part}: {e}", vram_status_text
+
+         try:
+             if audio.shape[0] != 2:
+                 audio = audio.repeat(2, 1)[:, :audio.shape[1]]
+             audio = audio.to(torch.float32)
+             audio = torchaudio.functional.resample(audio, 32000, PROCESS_SR, lowpass_filter_width=64)
+             seg = _export_tensor_to_segment(audio, PROCESS_SR, bit_depth_int)
+             if seg is None:
+                 return None, f"❌ Audio conversion failed (chunk {part}).", vram_status_text
+             seg = ensure_stereo(seg, PROCESS_SR, sample_width)
+             seg = seg - 15
+             seg = apply_noise_gate(seg, -80, PROCESS_SR)
+             seg = balance_stereo(seg, -40, PROCESS_SR)
+             seg = rms_normalize(seg, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
+             seg = apply_eq(seg, PROCESS_SR)
+             seg = seg[:dur * 1000]
+             segments.append(seg)
+             del audio
+             vram_status_text = f"VRAM after chunk {part}: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
+         except Exception as e:
+             logger.error(f"Post-process failed (chunk {part}): {e}")
+             return None, f"❌ Processing error (chunk {part}).", vram_status_text
+
+     if not segments:
+         return None, "❌ No audio generated.", vram_status_text
+
+     logger.info("Combining chunks...")
+     out = segments[0]
+     overlap_ms = int(OVERLAP * 1000)
+     for k in range(1, len(segments)):
+         out = _crossfade(out, segments[k], overlap_ms, PROCESS_SR, bit_depth_int)
+
+     out = out[:total_duration * 1000]
+     out = apply_noise_gate(out, -80, PROCESS_SR)
+     out = balance_stereo(out, -40, PROCESS_SR)
+     out = rms_normalize(out, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
+     out = apply_eq(out, PROCESS_SR)
+     out = apply_fade(out, 500, 800)
+     out = (out - 10).set_frame_rate(out_sr)
+
+     mp3_path = os.path.join(MP3_DIR, f"ghostai_music_{int(time.time())}.mp3")
+     try:
+         clean_memory()
+         out.export(mp3_path, format="mp3", bitrate=bitrate, tags={"title": "GhostAI Instrumental", "artist": "GhostAI"})
+     except Exception as e:
+         logger.error(f"MP3 export failed: {e}")
+         fb = os.path.join(MP3_DIR, f"ghostai_music_fallback_{int(time.time())}.mp3")
+         try:
+             out.export(fb, format="mp3", bitrate="128k")
+             mp3_path = fb
+         except Exception as ee:
+             return None, f"❌ Export failed: {ee}", vram_status_text
+
+     vram_status_text = f"Final VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
+     return mp3_path, "✅ Done! Seamless unified track rendered.", vram_status_text
+
+ def generate_music_wrapper(*args):
+     try:
+         return generate_music(*args)
+     finally:
+         clean_memory()
+
+ # ======================================================================================
+ # FASTAPI - Status + Settings + Prompts + Render
+ # ======================================================================================
+
+ class RenderRequest(BaseModel):
+     instrumental_prompt: str
+     cfg_scale: Optional[float] = None
+     top_k: Optional[int] = None
+     top_p: Optional[float] = None
+     temperature: Optional[float] = None
+     total_duration: Optional[int] = None
+     bpm: Optional[int] = None
+     drum_beat: Optional[str] = None
+     synthesizer: Optional[str] = None
+     rhythmic_steps: Optional[str] = None
+     bass_style: Optional[str] = None
+     guitar_style: Optional[str] = None
+     target_volume: Optional[float] = None
+     preset: Optional[str] = None
+     max_steps: Optional[int] = None
+     bitrate: Optional[str] = None
+     output_sample_rate: Optional[str] = None
+     bit_depth: Optional[str] = None
+
+ class SettingsUpdate(BaseModel):
+     settings: Dict[str, Any]
+
+ BUSY_LOCK = threading.Lock()
+ BUSY_FLAG = False
+ CURRENT_JOB: Dict[str, Any] = {"id": None, "start": None}
+
+ def set_busy(val: bool, job_id: Optional[str] = None):
+     global BUSY_FLAG, CURRENT_JOB
+     with BUSY_LOCK:
+         BUSY_FLAG = val
+         if val:
+             CURRENT_JOB["id"] = job_id or f"job_{int(time.time())}"
+             CURRENT_JOB["start"] = time.time()
+         else:
+             CURRENT_JOB["id"] = None
+             CURRENT_JOB["start"] = None
+
+ def is_busy() -> bool:
+     with BUSY_LOCK:
+         return BUSY_FLAG
+
+ def job_elapsed() -> float:
+     with BUSY_LOCK:
+         if CURRENT_JOB["start"] is None:
+             return 0.0
+         return time.time() - CURRENT_JOB["start"]
+
+ fastapp = FastAPI(title="GhostAI Music Server", version="1.2")
+ fastapp.add_middleware(
+     CORSMiddleware, allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"]
+ )
+
+ @fastapp.get("/health")
+ def health():
+     return {"ok": True, "ts": int(time.time())}
+
+ @fastapp.get("/status")
+ def status():
+     return {"busy": is_busy(), "job_id": CURRENT_JOB["id"], "elapsed": job_elapsed()}
+
+ @fastapp.get("/config")
+ def get_config():
+     return {"defaults": SETTINGS}
+
+ @fastapp.post("/settings")
+ def set_settings(payload: SettingsUpdate):
+     try:
+         s = SETTINGS.copy()
+         s.update(payload.settings or {})
+         save_settings(s)
+         for k, v in s.items():
+             SETTINGS[k] = v
+         return {"ok": True, "saved": s}
+     except Exception as e:
+         raise HTTPException(status_code=400, detail=str(e))
+
+ @fastapp.get("/genres")
+ def api_genres():
+     return {"genres": list_genres()}
+
+ @fastapp.post("/reload_prompts")
+ def api_reload_prompts():
+     try:
+         PROMPT_CFG.read(PROMPTS_INI)
+         return {"ok": True, "genres": list_genres(), "aliases": get_api_aliases()}
+     except Exception as e:
+         raise HTTPException(status_code=500, detail=str(e))
+
+ @fastapp.get("/prompt/{name}")
+ def api_prompt(
+     name: str,
+     bpm: Optional[int] = Query(None),
+     drum_beat: Optional[str] = Query(None),
+     synthesizer: Optional[str] = Query(None),
+     rhythmic_steps: Optional[str] = Query(None),
+     bass_style: Optional[str] = Query(None),
+     guitar_style: Optional[str] = Query(None),
+ ):
+     if name not in PROMPT_CFG:
+         raise HTTPException(status_code=404, detail=f"Unknown genre '{name}'.")
+     prompt = build_prompt_from_section(name, bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style)
+     return {"name": name, "prompt": prompt}
+
+ @fastapp.post("/render")
+ def render(req: RenderRequest):
+     if is_busy():
+         raise HTTPException(status_code=409, detail="Server busy")
+     job_id = f"render_{int(time.time())}"
+     set_busy(True, job_id)
+     try:
+         s = SETTINGS.copy()
+         for k, v in req.dict().items():
+             if v is not None:
+                 s[k] = v
+         mp3, msg, vram = generate_music(
+             s.get("instrumental_prompt", req.instrumental_prompt),
+             float(s.get("cfg_scale", DEFAULT_SETTINGS["cfg_scale"])),
+             int(s.get("top_k", DEFAULT_SETTINGS["top_k"])),
+             float(s.get("top_p", DEFAULT_SETTINGS["top_p"])),
+             float(s.get("temperature", DEFAULT_SETTINGS["temperature"])),
+             int(s.get("total_duration", DEFAULT_SETTINGS["total_duration"])),
+             int(s.get("bpm", DEFAULT_SETTINGS["bpm"])),
+             str(s.get("drum_beat", DEFAULT_SETTINGS["drum_beat"])),
+             str(s.get("synthesizer", DEFAULT_SETTINGS["synthesizer"])),
+             str(s.get("rhythmic_steps", DEFAULT_SETTINGS["rhythmic_steps"])),
+             str(s.get("bass_style", DEFAULT_SETTINGS["bass_style"])),
+             str(s.get("guitar_style", DEFAULT_SETTINGS["guitar_style"])),
+             float(s.get("target_volume", DEFAULT_SETTINGS["target_volume"])),
+             str(s.get("preset", DEFAULT_SETTINGS["preset"])),
+             str(s.get("max_steps", DEFAULT_SETTINGS["max_steps"])),
+             "",
+             str(s.get("bitrate", DEFAULT_SETTINGS["bitrate"])),
+             str(s.get("output_sample_rate", DEFAULT_SETTINGS["output_sample_rate"])),
+             str(s.get("bit_depth", DEFAULT_SETTINGS["bit_depth"]))
+         )
+         if not mp3:
+             raise HTTPException(status_code=500, detail=msg)
+         return {"ok": True, "job_id": job_id, "path": mp3, "status": msg, "vram": vram}
+     finally:
+         set_busy(False, None)
+
+ for path, sec in get_api_aliases().items():
+     def _factory(section_name: str):
+         def _endpoint(
+             bpm: Optional[int] = Query(None),
+             drum_beat: Optional[str] = Query(None),
+             synthesizer: Optional[str] = Query(None),
+             rhythmic_steps: Optional[str] = Query(None),
+             bass_style: Optional[str] = Query(None),
+             guitar_style: Optional[str] = Query(None),
+         ):
+             prompt = build_prompt_from_section(section_name, bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style)
+             return {"name": section_name, "prompt": prompt}
+         return _endpoint
+     fastapp.add_api_route(path, _factory(sec), methods=["GET"])
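+
+ # Example (hypothetical client call): GET /set_classic_rock_prompt maps to the
+ # [metallica] section via prompts.ini, so
+ #   requests.get("http://localhost:8555/set_classic_rock_prompt").json()
+ # returns {"name": "metallica", "prompt": "..."}, the same shape as /prompt/metallica.
+ # The closure in _factory pins section_name per route, so each alias keeps its
+ # own genre even though they share one endpoint body.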
+
+ def _start_fastapi():
+     uvicorn.run(fastapp, host="0.0.0.0", port=8555, log_level="info")
+
+ api_thread = threading.Thread(target=_start_fastapi, daemon=True)
+ api_thread.start()
+ logger.info("FastAPI server on http://0.0.0.0:8555")
+
+ # ======================================================================================
+ # GRADIO UI
+ # ======================================================================================
+
+ def get_latest_log():
+     try:
+         if not os.path.exists(LOG_FILE):
+             return "No log file found."
+         with open(LOG_FILE, "r", encoding="utf-8", errors="ignore") as f:
+             return f.read()
+     except Exception as e:
+         return f"Error reading log file: {e}"
+
+ css_text = Path(STYLES_CSS).read_text(encoding="utf-8") if os.path.exists(STYLES_CSS) else ""
+
+ logger.info("Building Gradio UI...")
+ with gr.Blocks(css=css_text, analytics_enabled=False, title="GhostAI Music Generator") as demo:
+     gr.Markdown("# 🎵 GhostAI Music Generator")
+     gr.Markdown("Create instrumental tracks with fixed 30s chunking and seamless merges. Accessibility-first UI.")
+
+     with gr.Row():
+         with gr.Column():
+             instrumental_prompt = gr.Textbox(
+                 label="Instrumental Prompt",
+                 placeholder="Type your prompt or click a Genre button below",
+                 lines=4,
+                 value=SETTINGS.get("instrumental_prompt", ""),
+             )
+
+             genre_buttons = []
+             genre_sections = list_genres()
+             with gr.Group():
+                 gr.Markdown("### Genres (from prompts.ini)")
+                 for i in range(0, len(genre_sections), 4):
+                     with gr.Row():
+                         for sec in genre_sections[i:i+4]:
+                             btn = gr.Button(_humanize(sec))
+                             genre_buttons.append((btn, sec))
+
+             with gr.Group():
+                 gr.Markdown("### Generation Settings")
+                 cfg_scale = gr.Slider(1.0, 10.0, step=0.1, value=float(SETTINGS.get("cfg_scale", 5.8)), label="CFG Scale")
+                 top_k = gr.Slider(10, 500, step=10, value=int(SETTINGS.get("top_k", 250)), label="Top-K")
+                 top_p = gr.Slider(0.0, 1.0, step=0.01, value=float(SETTINGS.get("top_p", 0.95)), label="Top-P")
+                 temperature = gr.Slider(0.1, 2.0, step=0.01, value=float(SETTINGS.get("temperature", 0.90)), label="Temperature")
+                 total_duration = gr.Dropdown(choices=[30, 60, 90, 120, 180], value=int(SETTINGS.get("total_duration", 60)), label="Song Length (seconds)")
+
+                 bpm = gr.Slider(60, 180, step=1, value=int(SETTINGS.get("bpm", 120)), label="Tempo (BPM)")
+                 drum_beat = gr.Dropdown(choices=["none", "standard rock", "techno kick", "funk groove", "jazz swing", "orchestral percussion"], value=str(SETTINGS.get("drum_beat", "none")), label="Drum Beat")
+                 synthesizer = gr.Dropdown(choices=["none", "analog synth", "digital pad", "arpeggiated synth", "strings", "brass", "choir"], value=str(SETTINGS.get("synthesizer", "none")), label="Synthesizer / Section")
+                 rhythmic_steps = gr.Dropdown(choices=["none", "steady steps", "syncopated steps", "complex steps"], value=str(SETTINGS.get("rhythmic_steps", "none")), label="Rhythmic Steps")
+                 bass_style = gr.Dropdown(choices=["none", "slap bass", "deep bass", "melodic bass", "contrabass ostinato"], value=str(SETTINGS.get("bass_style", "none")), label="Bass Style")
+                 guitar_style = gr.Dropdown(choices=["none", "distorted", "clean", "jangle"], value=str(SETTINGS.get("guitar_style", "none")), label="Guitar Style")
+
+                 target_volume = gr.Slider(-30.0, -20.0, step=0.5, value=float(SETTINGS.get("target_volume", -23.0)), label="Target Loudness (dBFS RMS)")
+                 preset = gr.Dropdown(choices=["default", "rock", "techno", "grunge", "indie", "funk_rock"], value=str(SETTINGS.get("preset", "default")), label="Preset")
+                 max_steps = gr.Dropdown(choices=[1000, 1200, 1300, 1500], value=int(SETTINGS.get("max_steps", 1500)), label="Max Steps (info)")
+                 bitrate_state = gr.State(value=str(SETTINGS.get("bitrate", "192k")))
+                 sample_rate_state = gr.State(value=str(SETTINGS.get("output_sample_rate", "48000")))
+                 bit_depth_state = gr.State(value=str(SETTINGS.get("bit_depth", "16")))
+
+                 with gr.Row():
+                     br128 = gr.Button("Bitrate 128k")
+                     br192 = gr.Button("Bitrate 192k")
+                     br320 = gr.Button("Bitrate 320k")
+                 with gr.Row():
+                     sr22 = gr.Button("SR 22.05k")
+                     sr44 = gr.Button("SR 44.1k")
+                     sr48 = gr.Button("SR 48k")
+                 with gr.Row():
+                     bd16 = gr.Button("16-bit")
+                     bd24 = gr.Button("24-bit")
+
+             gen_btn = gr.Button("Generate Music 🚀")
+             clr_btn = gr.Button("Clear 🧹")
+             save_btn = gr.Button("Save Settings 💾")
+             load_btn = gr.Button("Load Settings 📂")
+             reset_btn = gr.Button("Reset Defaults ♻️")
+
+         with gr.Column():
+             gr.Markdown("### Output")
+             out_audio = gr.Audio(label="Generated Track (saved in ./mp3)", type="filepath")
+             status_box = gr.Textbox(label="Status", interactive=False)
+             vram_box = gr.Textbox(label="VRAM Usage", interactive=False, value="")
+             log_btn = gr.Button("View Log 📋")
+             log_output = gr.Textbox(label="Log Contents", lines=18, interactive=False)
+
+     def on_genre_click(sec, bpm_v, drum_v, synth_v, steps_v, bass_v, guitar_v):
+         return build_prompt_from_section(sec, bpm_v, drum_v, synth_v, steps_v, bass_v, guitar_v)
+
+     for btn, sec in genre_buttons:
+         btn.click(
+             on_genre_click,
+             inputs=[gr.State(sec), bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style],
+             outputs=instrumental_prompt
+         )
+
+     br128.click(lambda: "128k", outputs=bitrate_state)
+     br192.click(lambda: "192k", outputs=bitrate_state)
+     br320.click(lambda: "320k", outputs=bitrate_state)
+     sr22.click(lambda: "22050", outputs=sample_rate_state)
+     sr44.click(lambda: "44100", outputs=sample_rate_state)
+     sr48.click(lambda: "48000", outputs=sample_rate_state)
+     bd16.click(lambda: "16", outputs=bit_depth_state)
+     bd24.click(lambda: "24", outputs=bit_depth_state)
+
+     gen_btn.click(
+         generate_music_wrapper,
+         inputs=[
+             instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+             drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style,
+             target_volume, preset, max_steps, vram_box, bitrate_state, sample_rate_state, bit_depth_state
+         ],
+         outputs=[out_audio, status_box, vram_box]
+     )
+
+     def clear_inputs():
+         s = DEFAULT_SETTINGS.copy()
+         return (
+             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
+             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
+             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
+             s["bitrate"], s["output_sample_rate"], s["bit_depth"]
+         )
+     clr_btn.click(
+         clear_inputs,
+         outputs=[
+             instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+             drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+             preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state
+         ]
+     )
+
+     def _save_action(ip, cs, tk, tp, tt, dur, bpm_v, d, s, rs, b, g, tv, pr, ms, br, sr, bd):
+         data = {
+             "instrumental_prompt": ip, "cfg_scale": float(cs), "top_k": int(tk), "top_p": float(tp),
+             "temperature": float(tt), "total_duration": int(dur), "bpm": int(bpm_v),
+             "drum_beat": str(d), "synthesizer": str(s), "rhythmic_steps": str(rs),
+             "bass_style": str(b), "guitar_style": str(g), "target_volume": float(tv),
+             "preset": str(pr), "max_steps": int(ms), "bitrate": str(br),
+             "output_sample_rate": str(sr), "bit_depth": str(bd)
+         }
+         save_settings(data)
+         for k, v in data.items(): SETTINGS[k] = v
+         return "✅ Settings saved."
+     save_btn.click(
+         _save_action,
+         inputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+                 drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+                 preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state],
+         outputs=status_box
+     )
+
+     def _load_action():
+         s = load_settings()
+         for k, v in s.items(): SETTINGS[k] = v
+         return (
+             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
+             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
+             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
+             s["bitrate"], s["output_sample_rate"], s["bit_depth"],
+             "✅ Settings loaded."
+         )
+     load_btn.click(_load_action,
+         outputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+                  drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+                  preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box]
+     )
+
+     def _reset_action():
+         s = DEFAULT_SETTINGS.copy()
+         save_settings(s)
+         for k, v in s.items(): SETTINGS[k] = v
+         return (
+             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
+             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
+             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
+             s["bitrate"], s["output_sample_rate"], s["bit_depth"], "✅ Defaults restored."
+         )
+     reset_btn.click(_reset_action,
+         outputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+                  drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+                  preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box]
+     )
+
+     log_btn.click(get_latest_log, outputs=log_output)
+
+ logger.info("Launching Gradio UI at http://0.0.0.0:9999 ...")
+ try:
+     demo.launch(server_name="0.0.0.0", server_port=9999, share=False, inbrowser=False, show_error=True)
+ except Exception as e:
+     logger.error(f"Gradio launch failed: {e}")
+     logger.error(traceback.format_exc())
+     sys.exit(1)
public/settings.json ADDED
@@ -0,0 +1,20 @@
+ {
+   "instrumental_prompt": "Instrumental alternative rock by Red Hot Chili Peppers, energetic guitar riffs, funky slap bass, standard rock drums with funk fills, funk-rock energy at 126 BPM, dynamic intro and expressive verse.",
+   "cfg_scale": 5.8,
+   "top_k": 250,
+   "top_p": 0.95,
+   "temperature": 0.9,
+   "total_duration": 60,
+   "bpm": 120,
+   "drum_beat": "none",
+   "synthesizer": "none",
+   "rhythmic_steps": "none",
+   "bass_style": "deep bass",
+   "guitar_style": "jangle",
+   "target_volume": -23.0,
+   "preset": "default",
+   "max_steps": 1500,
+   "bitrate": "192k",
+   "output_sample_rate": "48000",
+   "bit_depth": "16"
+ }
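
These defaults seed the server's `SETTINGS` dict at startup; `publicapi.py` also accepts updates at runtime through `POST /settings`. A minimal sketch (assuming the API on `localhost:8555`):

```python
import requests

# Merge new defaults into settings.json via the running server
requests.post("http://localhost:8555/settings",
              json={"settings": {"temperature": 0.95, "bitrate": "320k"}})
```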
public/stablebeta.py ADDED
@@ -0,0 +1,1151 @@
+ import os
+ import torch
+ import torchaudio
+ import psutil
+ import time
+ import sys
+ import numpy as np
+ import gc
+ import gradio as gr
+ from pydub import AudioSegment
+ import soundfile as sf
+ import pyloudnorm as pyln
+ from audiocraft.models import MusicGen
+ from torch.amp import autocast
+ import json
+ import configparser
+ import random
+ import string
+ import uvicorn
+ from fastapi import FastAPI, HTTPException
+ from fastapi.responses import FileResponse
+ from pydantic import BaseModel
+ import multiprocessing
+ import re
+ import datetime
+ import warnings
+
+ # ==============================
+ # Warnings & Multiprocessing
+ # ==============================
+ warnings.filterwarnings("ignore", category=UserWarning)
+ multiprocessing.set_start_method('spawn', force=True)
+
+ # ==============================
+ # CUDA / PyTorch Runtime Settings
+ # ==============================
+ os.environ["TORCH_NN_UTILS_LOG_LEVEL"] = "0"
+ os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
+ os.environ["CUDA_MODULE_LOADING"] = "LAZY"
+ os.environ["TORCH_USE_CUDA_DSA"] = "1"
+ # Stronger allocator settings to reduce fragmentation and avoid small splits
+ os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128,garbage_collection_threshold:0.8,expandable_segments:True"
+ # Support a range of architectures (Turing/Ampere/Ada)
+ os.environ["TORCH_CUDA_ARCH_LIST"] = "7.5;8.0;8.6;8.9"
+
+ # Prefer TF32 on Ampere+ (perf) - also helps allocator behavior
+ try:
+     torch.backends.cuda.matmul.allow_tf32 = True
+     torch.backends.cudnn.benchmark = True
+ except Exception:
+     pass
+
+ # ==============================
+ # Version / Device Checks
+ # ==============================
+ def _parse_version_triplet(s: str):
+     m = re.findall(r"\d+", s)
+     m = [int(x) for x in m[:3]]
+     while len(m) < 3:
+         m.append(0)
+     return tuple(m)
+
+ if _parse_version_triplet(torch.__version__) < (2, 0, 0):
+     print(f"ERROR: PyTorch {torch.__version__} incompatible. Need >=2.0.0.")
+     sys.exit(1)
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+ if device != "cuda":
+     print("ERROR: CUDA required. CPU disabled.")
+     sys.exit(1)
+
+ cc_major, cc_minor = torch.cuda.get_device_capability(0)
+ if cc_major < 7:
+     print(f"ERROR: GPU Compute Capability {torch.cuda.get_device_capability(0)} unsupported. Need >=7.0.")
+     sys.exit(1)
+
+ gpu_name = torch.cuda.get_device_name(0)
+ print(f"Using GPU: {gpu_name} (CUDA {torch.version.cuda}, Compute Capability {(cc_major, cc_minor)})")
+
+ # Choose autocast dtype based on hardware support
+ try:
+     bf16_supported = torch.cuda.is_bf16_supported()
+ except Exception:
+     bf16_supported = False
+ AUTOCAST_DTYPE = torch.bfloat16 if bf16_supported and cc_major >= 8 else torch.float16
+
+ # ==============================
+ # Resource Monitoring
+ # ==============================
+ def print_resource_usage(stage: str):
+     try:
+         alloc = torch.cuda.memory_allocated() / (1024 ** 3)
+         reserved = torch.cuda.memory_reserved() / (1024 ** 3)
+     except Exception:
+         alloc, reserved = 0.0, 0.0
+     print(f"--- {stage} ---")
+     print(f"GPU Memory: {alloc:.2f} GB allocated, {reserved:.2f} GB reserved")
+     print(f"CPU: {psutil.cpu_percent()}% | Memory: {psutil.virtual_memory().percent}%")
+     print("---------------")
+
+ # ==============================
+ # Output & Metadata
+ # ==============================
+ output_dir = "mp3"
+ os.makedirs(output_dir, exist_ok=True)
+ metadata_file = os.path.join(output_dir, "songs_metadata.json")
+ api_status = "idle"
+
+ # ==============================
+ # Prompt Variables
+ # ==============================
+ prompt_variables = {
+     'style': [
+         'epic', 'gritty', 'smooth', 'lush', 'raw', 'intimate', 'driving', 'moody',
+         'psychedelic', 'uplifting', 'melancholic', 'aggressive', 'dreamy', 'retro',
+         'futuristic', 'energetic', 'brooding', 'euphoric', 'jazzy', 'cinematic',
+         'somber', 'triumphant', 'mystical', 'grunge', 'ethereal'
+     ],
+     'key': ['C major', 'D major', 'E minor', 'F minor', 'G major', 'A minor', 'B-flat major', 'G minor', 'D minor', 'F major'],
+     'bpm': [80, 90, 100, 110, 120, 124, 128, 130, 140, 150, 160, 170, 180],
+     'time_signature': ['4/4', '3/4', '6/8'],
+     'guitar_style': [
+         'raw distorted', 'melodic', 'fuzzy', 'crisp', 'jangly', 'clean', 'twangy',
+         'shimmering', 'grunge', 'bluesy', 'slide', 'wah-infused', 'chunky'
+     ],
+     'bass_style': [
+         'punchy', 'deep', 'groovy', 'melodic', 'throbbing', 'slappy', 'funky',
+         'walking', 'booming', 'resonant', 'subtle'
+     ],
+     'drum_style': [
+         'dynamic', 'minimal', 'hard-hitting', 'swinging', 'polyrhythmic', 'brushed',
+         'tight', 'loose', 'electronic', 'acoustic', 'retro', 'punchy'
+     ],
+     'drum_feature': [
+         'heavy snare', 'crisp cymbals', 'tight kicks', 'syncopated hits', 'rolling toms',
+         'ghost notes', 'blast beats'
+     ],
+     'organ_style': [
+         'subtle Hammond', 'swirling', 'warm Leslie', 'church', 'gritty', 'vintage',
+         'moody'
+     ],
+     'synth_style': [
+         'atmospheric', 'bright', 'eerie', 'soaring', 'chopped', 'arpeggiated',
+         'pulsing', 'glitchy', 'analog', 'digital', 'layered'
+     ],
+     'vocal_style': [
+         'chopped', 'soulful', 'haunting', 'melodic', 'harmonized', 'layered',
+         'ethereal', 'gruff', 'breathy'
+     ],
+     'hihat_style': [
+         'crisp', 'swinging', 'rapid', 'shuffling', 'open', 'tight', 'stuttered'
+     ],
+     'pad_style': [
+         'evolving', 'ambient', 'lush', 'dark', 'shimmering', 'warm', 'icy'
+     ],
+     'kick_style': [
+         'deep', 'four-on-the-floor', 'subtle', 'punchy', 'booming', 'clicky'
+     ],
+     'lead_style': [
+         'fluid', 'intricate', 'soaring', 'expressive', 'virtuosic', 'minimalist',
+         'bluesy', 'lyrical'
+     ],
+     'lead_instrument': [
+         'saxophone', 'trumpet', 'guitar', 'flute', 'violin', 'clarinet', 'trombone'
+     ],
+     'piano_style': [
+         'expressive Rhodes', 'rapid', 'smooth', 'dramatic', 'stride', 'ambient',
+         'classical', 'jazzy', 'sparse'
+     ],
+     'keyboard_style': [
+         'ornate', 'delicate', 'virtuosic', 'minimal', 'retro', 'spacey'
+     ],
+     'string_style': [
+         'sweeping', 'delicate', 'dramatic', 'lush', 'pizzicato', 'staccato',
+         'sustained'
+     ],
+     'brass_style': [
+         'bold', 'heroic', 'muted', 'fanfare', 'jazzy', 'smooth'
+     ],
+     'woodwind_style': [
+         'subtle', 'fluttering', 'melodic', 'airy', 'reedy', 'expressive'
+     ],
+     'flute_style': [
+         'fluttering', 'ornate', 'airy', 'breathy', 'trilling'
+     ],
+     'horn_style': [
+         'heroic', 'bold', 'soaring', 'mellow', 'stinging'
+     ],
+     'choir_style': [
+         'mystical', 'ethereal', 'dramatic', 'angelic', 'epic', 'somber'
+     ],
+     'sample_style': [
+         'jazzy', 'soulful', 'gritty', 'cinematic', 'vinyl', 'lo-fi', 'retro'
+     ],
+     'scratch_style': [
+         'crackling vinyl', 'sharp', 'rhythmic', 'chopped', 'transform'
+     ],
+     'snare_style': [
+         'crisp', 'booming', 'tight', 'snappy', 'rimshot', 'layered'
+     ],
+     'breakdown_style': [
+         'euphoric', 'stripped-down', 'intense', 'ambient', 'glitchy', 'dramatic'
+     ],
+     'intro_bars': [4, 8, 16],
+     'verse_bars': [8, 16, 32],
+     'chorus_bars': [8, 16],
+     'bridge_bars': [4, 8, 16],
+     'outro_bars': [8, 16],
+     'build_bars': [8, 16, 32],
+     'drop_bars': [16, 32],
+     'main_bars': [16, 32],
+     'breakdown_bars': [8, 16],
+     'head_bars': [16, 32],
+     'solo_bars': [8, 16, 32],
+     'fugue_bars': [16, 32],
+     'coda_bars': [8, 16],
+     'theme_bars': [16, 32],
+     'development_bars': [16, 32],
+     'climax_bars': [8, 16],
+     'groove_bars': [16, 32],
+     'vibe': [
+         'raw', 'energetic', 'melancholic', 'hypnotic', 'soulful', 'intimate',
+         'virtuosic', 'elegant', 'cinematic', 'gritty', 'nostalgic', 'dark',
+         'uplifting', 'bittersweet', 'heroic', 'dreamy', 'aggressive', 'relaxed',
+         'futuristic', 'retro', 'mystical', 'triumphant'
+     ],
+     'production_style': [
+         'lo-fi', 'warm analog', 'clean digital', 'lush', 'crisp acoustic',
+         'polished pop', 'grand orchestral', 'grunge', 'minimalist', 'industrial',
+         'vintage'
+     ]
+ }
233
+
234
+ # ==============================
+ # Default INI Creation
+ # ==============================
+ def create_default_genre_prompts_ini(ini_path):
+     default_config = configparser.ConfigParser()
+     default_config['Prompts'] = {
+         'nirvana': '{style} grunge with {guitar_style} guitar, {bass_style} bass, {drum_style} drums, {vibe} vibe in {key} at {bpm} BPM',
+         'classic_rock': '{style} classic rock with {guitar_style} guitar, {bass_style} bass, {drum_style} drums, {vibe} vibe in {key} at {bpm} BPM',
+         'detroit_techno': '{style} techno with {synth_style} synths, {kick_style} kick, {hihat_style} hi-hats, {vibe} vibe at {bpm} BPM',
+         'smooth_jazz': '{style} jazz with {piano_style} piano, {bass_style} bass, {drum_style} drums, {vibe} vibe in {key} at {bpm} BPM',
+         'alternative_rock': '{style} alternative rock with {guitar_style} guitar, {bass_style} bass, {drum_style} drums in {key} at {bpm} BPM',
+         'deep_house': '{style} deep house with {synth_style} synths, {kick_style} kick, {vibe} vibe at {bpm} BPM',
+         'bebop_jazz': '{style} bebop jazz with {piano_style} piano, {bass_style} bass, {drum_style} drums in {key} at {bpm} BPM',
+         'baroque_classical': '{style} baroque classical with {string_style} strings, {keyboard_style} harpsichord in {key} at {bpm} BPM',
+         'romantic_classical': '{style} romantic classical with {string_style} strings, {piano_style} piano in {key} at {bpm} BPM',
+         'boom_bap_hiphop': '{style} boom bap hip-hop with {sample_style} samples, {drum_style} drums, {scratch_style} scratches at {bpm} BPM',
+         'trap_hiphop': '{style} trap hip-hop with {synth_style} synths, {kick_style} kick, {snare_style} snare at {bpm} BPM',
+         'pop_rock': '{style} pop rock with {guitar_style} guitar, {bass_style} bass, {drum_style} drums in {key} at {bpm} BPM',
+         'fusion_jazz': '{style} fusion jazz with {piano_style} piano, {guitar_style} guitar, {drum_style} drums in {key} at {bpm} BPM',
+         'edm': '{style} EDM with {synth_style} synths, {kick_style} kick, {vibe} vibe at {bpm} BPM',
+         'indie_folk': '{style} indie folk with {guitar_style} guitar, {vocal_style} vocals, {drum_style} drums in {key} at {bpm} BPM',
+         'star_wars': '{style} epic orchestral with {brass_style} brass, {string_style} strings, {vibe} vibe in {key} at {bpm} BPM',
+         'star_wars_classical': '{style} classical orchestral with {string_style} strings, {horn_style} horns in {key} at {bpm} BPM',
+         'wutang': '{style} hip-hop with {sample_style} samples, {drum_style} drums, {scratch_style} scratches at {bpm} BPM',
+         'milesdavis': '{style} jazz with {lead_instrument} lead, {piano_style} piano, {bass_style} bass in {key} at {bpm} BPM'
+     }
+     default_config['BandNames'] = {
+         'nirvana': 'Nirvana, Soundgarden',
+         'classic_rock': 'Led Zeppelin, The Rolling Stones',
+         'detroit_techno': 'Underground Resistance, Jeff Mills',
+         'smooth_jazz': 'Pat Metheny, George Benson',
+         'alternative_rock': 'Radiohead, Smashing Pumpkins',
+         'deep_house': 'Moodymann, Theo Parrish',
+         'bebop_jazz': 'Charlie Parker, Dizzy Gillespie',
+         'baroque_classical': 'Bach, Vivaldi',
+         'romantic_classical': 'Chopin, Liszt',
+         'boom_bap_hiphop': 'A Tribe Called Quest, Pete Rock',
+         'trap_hiphop': 'Future, Metro Boomin',
+         'pop_rock': 'Coldplay, The Killers',
+         'fusion_jazz': 'Weather Report, Herbie Hancock',
+         'edm': 'Deadmau5, Skrillex',
+         'indie_folk': 'Fleet Foxes, Bon Iver',
+         'star_wars': 'John Williams',
+         'star_wars_classical': 'John Williams',
+         'wutang': 'Wu-Tang Clan',
+         'milesdavis': 'Miles Davis'
+     }
+     with open(ini_path, 'w') as f:
+         default_config.write(f)
+     print(f"Created default {ini_path}")
+
+ # ==============================
+ # CSS Load
+ # ==============================
+ css_path = "style.css"
+ try:
+     if not os.path.exists(css_path):
+         print(f"ERROR: {css_path} not found. Please create style.css with the required CSS content.")
+         sys.exit(1)
+     with open(css_path, 'r') as f:
+         css = f.read()
+ except Exception as e:
+     print(f"ERROR: Failed to read {css_path}: {e}. Please ensure style.css exists and is readable.")
+     sys.exit(1)
+
+ # ==============================
+ # INI Load
+ # ==============================
+ config = configparser.ConfigParser()
+ ini_path = "genre_prompts.ini"
+ try:
+     if not os.path.exists(ini_path):
+         print(f"WARNING: {ini_path} not found. Creating default INI file.")
+         create_default_genre_prompts_ini(ini_path)
+     config.read(ini_path)
+     if 'Prompts' not in config.sections() or 'BandNames' not in config.sections():
+         print(f"WARNING: Invalid {ini_path}. Creating default INI file.")
+         create_default_genre_prompts_ini(ini_path)
+         config.read(ini_path)
+ except Exception as e:
+     print(f"ERROR: Failed to read {ini_path}: {e}. Creating default INI file.")
+     create_default_genre_prompts_ini(ini_path)
+     config.read(ini_path)
+
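For orientation, here is a minimal standalone sketch of the INI-driven prompt flow the two sections above set up. It uses only the standard library; the keyword values passed to `format` are illustrative placeholders, not values the app would necessarily draw:

```python
# Minimal sketch of the INI-driven prompt flow (standard library only).
# The keyword values below are illustrative; the app samples them at random.
import configparser

cfg = configparser.ConfigParser()
cfg.read("genre_prompts.ini")

template = cfg["Prompts"]["detroit_techno"]
prompt = template.format(
    style="hypnotic",
    synth_style="analog",
    kick_style="four-on-the-floor",
    hihat_style="crisp",
    vibe="dark",
    bpm=128,
)
print(prompt)
# hypnotic techno with analog synths, four-on-the-floor kick, crisp hi-hats, dark vibe at 128 BPM
```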
+ # ==============================
+ # Model Load with Fallback
+ # ==============================
+ def load_musicgen_with_fallback():
+     model_paths = [
+         os.getenv("MUSICGEN_MODEL_PATH_LARGE", "/home/ubuntu/musicpack/models/musicgen-large"),
+         os.getenv("MUSICGEN_MODEL_PATH_MEDIUM", "/home/ubuntu/musicpack/models/musicgen-medium"),
+         os.getenv("MUSICGEN_MODEL_PATH_SMALL", "/home/ubuntu/musicpack/models/musicgen-small"),
+     ]
+     model_names = ["large", "medium", "small"]
+
+     last_error = None
+     for path, name in zip(model_paths, model_names):
+         if not path:
+             continue
+         if not os.path.exists(path):
+             print(f"NOTE: Model path not found: {path} (skipping {name})")
+             continue
+         try:
+             print(f"Loading MusicGen {name} model from {path} ...")
+             torch.cuda.empty_cache()
+             gc.collect()
+             with autocast('cuda', dtype=AUTOCAST_DTYPE):
+                 mdl = MusicGen.get_pretrained(path, device=device)
+             print(f"Loaded MusicGen {name}. Sample rate: {mdl.sample_rate}Hz")
+             return mdl, name
+         except Exception as e:
+             # RuntimeError (e.g. CUDA OOM) and any other load failure are handled
+             # the same way: log, free memory, and fall through to the next tier.
+             last_error = e
+             print(f"WARNING: Failed to load {name} model due to: {e}")
+             torch.cuda.empty_cache()
+             gc.collect()
+             continue
+     # Nothing loaded: every path was missing or every load failed.
+     print(f"ERROR: All model loads failed. Last error: {last_error}")
+     raise SystemExit(1)
+
+ try:
+     musicgen_model, loaded_model_name = load_musicgen_with_fallback()
+     # Conservative defaults; can be overridden per-call
+     musicgen_model.set_generation_params(
+         duration=10,
+         use_sampling=True,
+         top_k=50,
+         top_p=0.0,
+         temperature=0.8,
+         cfg_coef=3.0,
+         two_step_cfg=False
+     )
+     sample_rate = musicgen_model.sample_rate
+     print(f"Model active: {loaded_model_name}. Sample rate: {sample_rate}Hz")
+ except SystemExit:
+     sys.exit(1)
+
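Since the loader walks large, then medium, then small, and skips any path that does not exist, redirecting a tier is just a matter of environment variables. A hedged sketch (the paths are placeholders):

```python
# Illustrative: point the fallback tiers at local checkpoints before the
# script runs. Any tier whose path is missing is simply skipped.
import os

os.environ["MUSICGEN_MODEL_PATH_LARGE"] = "/opt/models/musicgen-large"    # tried first
os.environ["MUSICGEN_MODEL_PATH_MEDIUM"] = "/opt/models/musicgen-medium"  # fallback
os.environ["MUSICGEN_MODEL_PATH_SMALL"] = "/opt/models/musicgen-small"    # last resort
```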
+ # ==============================
+ # Audio Processing Helpers
+ # ==============================
+ def apply_eq(segment):
+     segment = segment.high_pass_filter(60)
+     segment = segment.low_pass_filter(12000)
+     segment = segment - 2.0
+     return segment
+
+ def apply_limiter(segment, max_db=-6.0, target_lufs=-16.0):
+     samples = np.array(segment.get_array_of_samples(), dtype=np.float32) / (2**15)
+     if segment.channels == 2:
+         samples = samples.reshape(-1, 2)
+     meter = pyln.Meter(segment.frame_rate)
+     loudness = meter.integrated_loudness(samples)
+     normalized_samples = pyln.normalize.loudness(samples, loudness, target_lufs)
+     if np.max(np.abs(normalized_samples)) > (10 ** (max_db / 20)):
+         normalized_samples *= (10 ** (max_db / 20)) / np.max(np.abs(normalized_samples))
+     normalized_samples = (normalized_samples * (2**15)).astype(np.int16)
+     segment = AudioSegment(
+         normalized_samples.tobytes(),
+         frame_rate=segment.frame_rate,
+         sample_width=2,
+         channels=segment.channels
+     )
+     del samples, normalized_samples
+     gc.collect()
+     return segment
+
+ def apply_fade(segment, fade_in_duration=1000, fade_out_duration=1000):
+     segment = segment.fade_in(fade_in_duration)
+     segment = segment.fade_out(fade_out_duration)
+     return segment
+
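Taken together, the three helpers form a fixed post chain: EQ, loudness normalization with a peak ceiling, then fades. A hedged usage sketch, run in the same module as the helpers (the input filename and fade lengths are placeholders):

```python
# Illustrative post chain, in the same module as the helpers above.
from pydub import AudioSegment

seg = AudioSegment.from_wav("demo.wav")                    # placeholder input
seg = apply_eq(seg)                                        # 60 Hz HPF, 12 kHz LPF, -2 dB trim
seg = apply_limiter(seg, max_db=-6.0, target_lufs=-16.0)   # LUFS normalize, then peak ceiling
seg = apply_fade(seg, fade_in_duration=500, fade_out_duration=1500)
seg.export("demo_mastered.mp3", format="mp3", bitrate="64k")
```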
+ # ==============================
+ # Names & Metadata
+ # ==============================
+ made_up_names = [
+     'blazepulse', 'shadowrift', 'neonquest', 'thunderclash', 'stargroove',
+     'mysticvibe', 'ironspark', 'ghostsurge', 'velvetstorm', 'crimsonrush',
+     'duskblitz', 'solarflame', 'nightdrift', 'frostsaga', 'emberwave',
+     'coolriff', 'wildpulse', 'echoslash', 'moontide', 'skydive'
+ ]
+
+ def extract_song_keyword(prompt):
+     if not prompt:
+         return random.choice(made_up_names)
+     words = re.findall(r'\b\w+\b', prompt.lower())
+     for word in words:
+         if len(word) <= 15 and word.isalnum():
+             return word
+     return random.choice(made_up_names)
+
+ def generate_unique_title(existing_titles, genre, song_keyword, style):
+     letters = string.ascii_uppercase
+     numbers = string.digits
+     max_attempts = 100
+     # Draw a fresh two-character base each attempt; collisions simply retry.
+     for _ in range(max_attempts):
+         title_base = f"{random.choice(letters)}{random.choice(numbers)}"
+         band_names = config['BandNames'].get(genre, "nirvana").split(',')
+         band_name = random.choice([name.strip() for name in band_names])
+         existing_count = sum(1 for t in existing_titles if t.startswith(title_base) and song_keyword in t and style in t and band_name in t)
+         if existing_count == 0:
+             return title_base, band_name
+     raise ValueError("Failed to generate unique title after maximum attempts")
+
+ def update_metadata_storage(metadata):
+     try:
+         songs_metadata = []
+         if os.path.exists(metadata_file):
+             with open(metadata_file, 'r') as f:
+                 songs_metadata = json.load(f)
+         songs_metadata.append({
+             "title": metadata["title"],
+             "filename": metadata["filename"],
+             "prompt": metadata.get("prompt", ""),
+             "duration": metadata.get("duration", 30),
+             "volume_db": metadata.get("volume_db", -24.0),
+             "target_lufs": metadata.get("target_lufs", -16.0),
+             "timestamp": metadata.get("timestamp", datetime.datetime.now().strftime("%Y%m%d_%H%M%S")),
+             "file_path": metadata.get("file_path", ""),
+             "sample_rate": metadata.get("sample_rate", musicgen_model.sample_rate),
+             "style": metadata.get("style", ""),
+             "band_name": metadata.get("band_name", ""),
+             "chunk_index": metadata.get("chunk_index", 0)
+         })
+         with open(metadata_file, 'w') as f:
+             json.dump(songs_metadata, f, indent=4)
+     except Exception as e:
+         print(f"ERROR: Failed to update metadata storage: {e}")
+
+ def load_renders():
+     if not os.path.exists(metadata_file):
+         return [], "No renders found."
+     try:
+         with open(metadata_file, 'r') as f:
+             songs_metadata = json.load(f)
+         renders = [
+             {
+                 "Title": entry["title"],
+                 "Filename": entry["filename"],
+                 "Prompt": entry["prompt"],
+                 "Duration (s)": entry["duration"],
+                 "Timestamp": entry["timestamp"],
+                 "Audio": entry["file_path"],
+                 "Download": f'<a href="/get-song/{entry["filename"]}" download><button class="download-btn" aria-label="Download {entry["title"]}">⬇️</button></a>',
+                 "Chunk": entry["chunk_index"]
+             }
+             for entry in songs_metadata
+         ]
+         return renders, "Renders loaded successfully."
+     except Exception as e:
+         return [], f"Error loading renders: {e}"
+
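Because the render log is plain JSON, it can also be inspected outside the app. A small sketch that reuses the module-level `metadata_file` path defined earlier in the script:

```python
# Illustrative: summarize renders straight from the JSON sidecar.
import json

with open(metadata_file, "r") as f:
    for entry in json.load(f):
        print(f'{entry["timestamp"]}  {entry["title"]:<20} {entry["duration"]:>4}s  {entry["filename"]}')
```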
+ # ==============================
+ # Prompt Builder
+ # ==============================
+ def get_genre_prompt(genre):
+     base_prompt = config['Prompts'].get(genre, "")
+     if not base_prompt:
+         base_prompt = "{style} grunge with {guitar_style} guitar, {bass_style} bass, {drum_style} drums, {vibe} vibe in {key} at {bpm} BPM"
+     # Sample one option for every knob in prompt_variables; each template then
+     # picks out only the placeholders it actually names.
+     prompt_dict = {k: random.choice(v) for k, v in prompt_variables.items()}
+     try:
+         formatted_prompt = base_prompt.format(**prompt_dict)
+         words = re.findall(r'\b\w+\b', formatted_prompt.lower())
+         val_list = []
+         for v in prompt_variables.values():
+             if isinstance(v, list):
+                 val_list.extend(v)
+         if not any(word in val_list for word in words):
+             formatted_prompt = f"{prompt_dict['style']} music with {prompt_dict['guitar_style']} guitar, {prompt_dict['bass_style']} bass, {prompt_dict['drum_style']} drums in {prompt_dict['key']} at {prompt_dict['bpm']} BPM"
+     except KeyError:
+         formatted_prompt = f"{prompt_dict['style']} music with {prompt_dict['guitar_style']} guitar, {prompt_dict['bass_style']} bass, {prompt_dict['drum_style']} drums in {prompt_dict['key']} at {prompt_dict['bpm']} BPM"
+     return formatted_prompt, prompt_dict['style']
+
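A quick example of what the builder returns; the exact wording varies per call, since every placeholder is drawn fresh (the values shown are illustrative):

```python
# Output varies per run: every knob is drawn with random.choice.
prompt, style = get_genre_prompt("smooth_jazz")
print(style)   # e.g. "intimate" (one entry from prompt_variables['style'])
print(prompt)  # e.g. "intimate jazz with expressive Rhodes piano, melodic bass,
               #       tight drums, soulful vibe in D minor at 92 BPM"
```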
+ # ==============================
+ # Adaptive Chunk Generation (OOM-safe)
+ # ==============================
+ def generate_chunk_oom_safe(model, text_prompt, continuation_prompt, cfg_scale, top_k, top_p, temperature, target_duration):
+     # Retry ladder: start at the requested duration, then step down through
+     # strictly shorter chunks whenever CUDA runs out of memory.
+     ladder = [20, 15, 12, 10, 8, 6, 4, 3, 2]
+     durations_to_try = [target_duration] + [d for d in ladder if d < target_duration]
+     for dur in durations_to_try:
+         try:
+             torch.cuda.synchronize()
+             torch.cuda.empty_cache()
+             model.set_generation_params(
+                 duration=dur,
+                 use_sampling=True,
+                 top_k=int(top_k),
+                 top_p=float(top_p),
+                 temperature=float(temperature),
+                 cfg_coef=float(cfg_scale),
+                 two_step_cfg=False
+             )
+             with torch.no_grad():
+                 with autocast('cuda', dtype=AUTOCAST_DTYPE):
+                     if continuation_prompt is None:
+                         # progress=False lowers overhead
+                         audio_chunk = model.generate([text_prompt], progress=False)[0]
+                     else:
+                         audio_chunk = model.generate_continuation(
+                             continuation_prompt, model.sample_rate, [text_prompt], progress=False
+                         )[0]
+             return audio_chunk, dur
+         except RuntimeError as e:
+             msg = str(e).lower()
+             if "out of memory" in msg or "cuda error" in msg:
+                 print(f"OOM at duration {dur}s; retrying with a smaller chunk...")
+                 torch.cuda.empty_cache()
+                 gc.collect()
+                 continue
+             raise
+     raise RuntimeError("Failed to generate audio chunk without CUDA OOM.")
+
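Usage is one call per chunk. A hedged sketch against the module-level model loaded above (the prompt text is illustrative):

```python
# Illustrative: render a single chunk, shrinking automatically on CUDA OOM.
chunk, got_dur = generate_chunk_oom_safe(
    musicgen_model,
    text_prompt="raw grunge with distorted guitar",
    continuation_prompt=None,   # None starts fresh; pass a (2, N) tail tensor to continue
    cfg_scale=3.0, top_k=50, top_p=0.0, temperature=0.8,
    target_duration=30,
)
print(f"Rendered {got_dur}s, tensor shape {tuple(chunk.shape)}")
```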
+ # ==============================
+ # Generation
+ # ==============================
+ def generate_music(instrumental_prompt: str, cfg_scale: float, top_k: int, top_p: float, temperature: float, total_duration: int, volume_db: float, genre: str = None):
+     global musicgen_model
+     global api_status
+     api_status = "rendering"
+
+     if not instrumental_prompt.strip() and not genre:
+         instrumental_prompt, style = get_genre_prompt("nirvana")
+     elif not instrumental_prompt.strip():
+         instrumental_prompt, style = get_genre_prompt(genre)
+     else:
+         words = re.findall(r'\b\w+\b', instrumental_prompt.lower())
+         val_list = []
+         for v in prompt_variables.values():
+             if isinstance(v, list):
+                 val_list.extend(v)
+         if not any(word in val_list for word in words):
+             instrumental_prompt, style = get_genre_prompt("nirvana")
+         else:
+             ek = extract_song_keyword(instrumental_prompt)
+             style = ek if ek in prompt_variables['style'] else random.choice(prompt_variables['style'])
+
+     try:
+         start_time = time.time()
+         base_chunk_target = 30  # target chunk length; the OOM-safe generator shrinks it if needed
+         total_duration = max(total_duration, 30)
+         remaining = total_duration
+         audio_chunks = []
+         chunk_paths = []
+         continuation_prompt = None
+         chunk_index = 0
+
+         # Titles
+         existing_titles = []
+         if os.path.exists(metadata_file):
+             with open(metadata_file, 'r') as f:
+                 songs_metadata = json.load(f)
+             existing_titles = [entry["title"] for entry in songs_metadata]
+         song_keyword = extract_song_keyword(instrumental_prompt)
+         title_base, band_name = generate_unique_title(existing_titles, genre if genre else "nirvana", song_keyword, style)
+
+         # Render total_duration seconds in adaptive chunks
+         while remaining > 0:
+             target = min(base_chunk_target, remaining)
+             print_resource_usage(f"Before Chunk {chunk_index + 1}")
+             try:
+                 audio_chunk, actual_dur = generate_chunk_oom_safe(
+                     musicgen_model, instrumental_prompt, continuation_prompt, cfg_scale, top_k, top_p, temperature, target
+                 )
+                 audio_chunk = audio_chunk.cpu().to(dtype=torch.float32)
+                 # Coerce the model output into (2, samples) stereo
+                 if audio_chunk.dim() == 1:
+                     audio_chunk = torch.stack([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() == 2 and audio_chunk.shape[0] == 1:
+                     audio_chunk = torch.cat([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() == 2 and audio_chunk.shape[0] != 2:
+                     audio_chunk = audio_chunk[:1, :]
+                     audio_chunk = torch.cat([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() > 2:
+                     audio_chunk = audio_chunk.view(2, -1)
+                 if audio_chunk.shape[0] != 2:
+                     raise ValueError(f"Expected stereo audio with shape (2, samples), got {audio_chunk.shape}")
+
+                 # Update continuation prompt (up to the last 2 seconds, if available)
+                 samples_per_second = musicgen_model.sample_rate
+                 tail_sec = 2
+                 tail_samples = min(int(tail_sec * samples_per_second), audio_chunk.shape[1] - 1 if audio_chunk.shape[1] > 1 else 1)
+                 if tail_samples > 0:
+                     continuation_prompt = audio_chunk[:, -tail_samples:].cpu()
+                 else:
+                     continuation_prompt = None
+
+                 # Save to a temp WAV and convert
+                 temp_wav_path = os.path.join(output_dir, f"temp_{random.randint(100, 999)}_{chunk_index}.wav")
+                 try:
+                     torchaudio.save(temp_wav_path, audio_chunk, musicgen_model.sample_rate, bits_per_sample=16)
+                     final_segment = AudioSegment.from_wav(temp_wav_path)
+                 finally:
+                     if os.path.exists(temp_wav_path):
+                         os.remove(temp_wav_path)
+                     del audio_chunk
+                     gc.collect()
+
+                 # Post FX
+                 print(f"Post-processing chunk {chunk_index + 1} (duration ~{actual_dur}s)...")
+                 final_segment = apply_eq(final_segment)
+                 final_segment = apply_limiter(final_segment, max_db=volume_db, target_lufs=-16.0)
+                 if chunk_index == 0:
+                     final_segment = final_segment.fade_in(1000)
+                 # Fade out only the final chunk; intermediate joins get a crossfade below
+                 if remaining - actual_dur <= 0:
+                     final_segment = final_segment.fade_out(1000)
+
+                 # Export
+                 mp3_filename = f"{title_base.lower()}_{song_keyword}_{style}_{band_name}_chunk{chunk_index + 1}.mp3"
+                 mp3_path = os.path.join(output_dir, mp3_filename)
+                 final_segment.export(
+                     mp3_path,
+                     format="mp3",
+                     bitrate="64k",
+                     tags={"title": f"{title_base}_Chunk{chunk_index + 1}", "artist": "GhostAI"}
+                 )
+                 print(f"Saved chunk {chunk_index + 1} to {mp3_path}")
+                 audio_chunks.append(final_segment)
+                 chunk_paths.append(mp3_path)
+
+                 # Metadata
+                 metadata = {
+                     "title": f"{title_base}_Chunk{chunk_index + 1}",
+                     "filename": mp3_filename,
+                     "prompt": instrumental_prompt,
+                     "duration": actual_dur,
+                     "volume_db": volume_db,
+                     "target_lufs": -16.0,
+                     "timestamp": datetime.datetime.now().strftime("%Y%m%d_%H%M%S"),
+                     "file_path": mp3_path,
+                     "sample_rate": musicgen_model.sample_rate,
+                     "style": style,
+                     "band_name": band_name,
+                     "chunk_index": chunk_index + 1
+                 }
+                 update_metadata_storage(metadata)
+
+                 chunk_index += 1
+                 remaining -= actual_dur
+                 torch.cuda.empty_cache()
+                 gc.collect()
+                 print_resource_usage(f"After Chunk {chunk_index}")
+             except Exception as e:
+                 print(f"ERROR: Failed to process chunk {chunk_index + 1}: {e}")
+                 api_status = "idle"
+                 raise
+
+         # Combine chunks if more than one
+         if len(audio_chunks) > 1:
+             combined_segment = audio_chunks[0]
+             for segment in audio_chunks[1:]:
+                 combined_segment = combined_segment.append(segment, crossfade=500)
+             combined_mp3_filename = f"{title_base.lower()}_{song_keyword}_{style}_{band_name}_combined.mp3"
+             combined_mp3_path = os.path.join(output_dir, combined_mp3_filename)
+             combined_segment.export(
+                 combined_mp3_path,
+                 format="mp3",
+                 bitrate="64k",
+                 tags={"title": title_base, "artist": "GhostAI"}
+             )
+             print(f"Saved combined audio to {combined_mp3_path}")
+             metadata = {
+                 "title": title_base,
+                 "filename": combined_mp3_filename,
+                 "prompt": instrumental_prompt,
+                 "duration": total_duration,
+                 "volume_db": volume_db,
+                 "target_lufs": -16.0,
+                 "timestamp": datetime.datetime.now().strftime("%Y%m%d_%H%M%S"),
+                 "file_path": combined_mp3_path,
+                 "sample_rate": musicgen_model.sample_rate,
+                 "style": style,
+                 "band_name": band_name,
+                 "chunk_index": 0
+             }
+             update_metadata_storage(metadata)
+             del combined_segment, audio_chunks
+             gc.collect()
+             api_status = "idle"
+             return combined_mp3_path, "✅ Done!", False, gr.update(value=load_renders()[0])
+         else:
+             # Single chunk only
+             print(f"Saved metadata to {metadata_file}")
+             del audio_chunks
+             gc.collect()
+             api_status = "idle"
+             return chunk_paths[0], "✅ Done!", False, gr.update(value=load_renders()[0])
+
+     except Exception as e:
+         print(f"❌ Failed: {e}")
+         api_status = "idle"
+         return None, f"❌ Failed: {e}", False, gr.update(value=load_renders()[0])
+     finally:
+         torch.cuda.synchronize()
+         torch.cuda.empty_cache()
+         gc.collect()
+
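One timing subtlety in the combine step above: `append(..., crossfade=500)` overlaps adjacent chunks by 500 ms, so an N-chunk render comes out roughly (N-1) x 0.5 s shorter than the sum of its chunk durations. A quick check:

```python
# Back-of-envelope length after the 500 ms crossfade joins (illustrative).
chunk_seconds = [30, 30, 30]
crossfade_s = 0.5
total = sum(chunk_seconds) - crossfade_s * (len(chunk_seconds) - 1)
print(total)  # 89.0: three 30 s chunks join to about 89 s, not 90 s
```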
+ def clear_inputs():
+     return "", 3.0, 50, 0.0, 0.8, 30, -24.0, False
+
+ def show_render_wheel():
+     return True
+
+ def set_genre_prompt(genre: str):
+     prompt, _ = get_genre_prompt(genre)
+     return prompt
+
+ # ==============================
+ # Gradio UI
+ # ==============================
+ with gr.Blocks(css=css) as demo:
+     gr.Markdown("""
+     <div class="header-container" role="banner" aria-label="GhostAI Music Generator">
+         <h1>GhostAI Music Generator</h1>
+         <p>Create Professional Instrumental Tracks</p>
+     </div>
+     """)
+     with gr.Tabs():
+         with gr.Tab("Generate", id="generate"):
+             with gr.Column(elem_classes="input-container"):
+                 gr.Markdown("### Instrumental Prompt")
+                 instrumental_prompt = gr.Textbox(
+                     label="Instrumental Prompt",
+                     placeholder="Select a genre or enter a custom prompt (e.g., 'coolriff grunge')",
+                     lines=4,
+                     elem_classes="textbox"
+                 )
+                 with gr.Row(elem_classes="genre-buttons"):
+                     classic_rock_btn = gr.Button("Classic Rock", elem_classes="genre-btn")
+                     alternative_rock_btn = gr.Button("Alternative Rock", elem_classes="genre-btn")
+                     detroit_techno_btn = gr.Button("Detroit Techno", elem_classes="genre-btn")
+                     deep_house_btn = gr.Button("Deep House", elem_classes="genre-btn")
+                     smooth_jazz_btn = gr.Button("Smooth Jazz", elem_classes="genre-btn")
+                     bebop_jazz_btn = gr.Button("Bebop Jazz", elem_classes="genre-btn")
+                     baroque_classical_btn = gr.Button("Baroque Classical", elem_classes="genre-btn")
+                     romantic_classical_btn = gr.Button("Romantic Classical", elem_classes="genre-btn")
+                     boom_bap_hiphop_btn = gr.Button("Boom Bap Hip-Hop", elem_classes="genre-btn")
+                     trap_hiphop_btn = gr.Button("Trap Hip-Hop", elem_classes="genre-btn")
+                     pop_rock_btn = gr.Button("Pop Rock", elem_classes="genre-btn")
+                     fusion_jazz_btn = gr.Button("Fusion Jazz", elem_classes="genre-btn")
+                     edm_btn = gr.Button("EDM", elem_classes="genre-btn")
+                     indie_folk_btn = gr.Button("Indie Folk", elem_classes="genre-btn")
+                     star_wars_btn = gr.Button("Star Wars Epic", elem_classes="genre-btn")
+                     star_wars_classical_btn = gr.Button("Star Wars Classical", elem_classes="genre-btn")
+                     nirvana_btn = gr.Button("Nirvana", elem_classes="genre-btn")
+                     wutang_btn = gr.Button("Wu-Tang", elem_classes="genre-btn")
+                     milesdavis_btn = gr.Button("Miles Davis", elem_classes="genre-btn")
+             with gr.Column(elem_classes="settings-container"):
+                 gr.Markdown("### Generation Settings")
+                 cfg_scale = gr.Slider(label="Guidance Scale (CFG)", minimum=1.0, maximum=10.0, value=3.0, step=0.1)
+                 top_k = gr.Slider(label="Top-K Sampling", minimum=10, maximum=500, value=50, step=10)
+                 top_p = gr.Slider(label="Top-P Sampling", minimum=0.0, maximum=1.0, value=0.0, step=0.1)
+                 temperature = gr.Slider(label="Temperature", minimum=0.1, maximum=2.0, value=0.8, step=0.1)
+                 total_duration = gr.Slider(label="Duration (seconds)", minimum=30, maximum=300, value=30, step=10)
+                 volume_db = gr.Slider(label="Output Volume (dBFS)", minimum=-30.0, maximum=0.0, value=-24.0, step=0.1)
+                 with gr.Row(elem_classes="action-buttons"):
+                     gen_btn = gr.Button("Generate Music")
+                     clr_btn = gr.Button("Clear Inputs")
+             with gr.Column(elem_classes="output-container"):
+                 gr.Markdown("### Output")
+                 render_wheel = gr.HTML('<div class="render-wheel" aria-live="polite">Generating...</div>', label="Rendering Status")
+                 render_state = gr.State(value=False)
+                 out_audio = gr.Audio(label="Generated Track", type="filepath", interactive=True, elem_classes="audio-container")
+                 status = gr.Textbox(label="Status", interactive=False)
+         with gr.Tab("Renders", id="renders"):
+             with gr.Column(elem_classes="renders-container"):
+                 gr.Markdown("### Browse Renders")
+                 renders_table = gr.DataFrame(
+                     headers=["Title", "Filename", "Prompt", "Duration (s)", "Timestamp", "Audio", "Download", "Chunk"],
+                     datatype=["str", "str", "str", "number", "str", "audio", "html", "number"],
+                     interactive=False,
+                     value=load_renders()[0],
+                     elem_classes="renders-table"
+                 )
+                 renders_status = gr.Textbox(label="Renders Status", interactive=False, value=load_renders()[1])
+
+     # Button bindings
+     classic_rock_btn.click(set_genre_prompt, inputs=[gr.State(value="classic_rock")], outputs=[instrumental_prompt])
+     alternative_rock_btn.click(set_genre_prompt, inputs=[gr.State(value="alternative_rock")], outputs=[instrumental_prompt])
+     detroit_techno_btn.click(set_genre_prompt, inputs=[gr.State(value="detroit_techno")], outputs=[instrumental_prompt])
+     deep_house_btn.click(set_genre_prompt, inputs=[gr.State(value="deep_house")], outputs=[instrumental_prompt])
+     smooth_jazz_btn.click(set_genre_prompt, inputs=[gr.State(value="smooth_jazz")], outputs=[instrumental_prompt])
+     bebop_jazz_btn.click(set_genre_prompt, inputs=[gr.State(value="bebop_jazz")], outputs=[instrumental_prompt])
+     baroque_classical_btn.click(set_genre_prompt, inputs=[gr.State(value="baroque_classical")], outputs=[instrumental_prompt])
+     romantic_classical_btn.click(set_genre_prompt, inputs=[gr.State(value="romantic_classical")], outputs=[instrumental_prompt])
+     boom_bap_hiphop_btn.click(set_genre_prompt, inputs=[gr.State(value="boom_bap_hiphop")], outputs=[instrumental_prompt])
+     trap_hiphop_btn.click(set_genre_prompt, inputs=[gr.State(value="trap_hiphop")], outputs=[instrumental_prompt])
+     pop_rock_btn.click(set_genre_prompt, inputs=[gr.State(value="pop_rock")], outputs=[instrumental_prompt])
+     fusion_jazz_btn.click(set_genre_prompt, inputs=[gr.State(value="fusion_jazz")], outputs=[instrumental_prompt])
+     edm_btn.click(set_genre_prompt, inputs=[gr.State(value="edm")], outputs=[instrumental_prompt])
+     indie_folk_btn.click(set_genre_prompt, inputs=[gr.State(value="indie_folk")], outputs=[instrumental_prompt])
+     star_wars_btn.click(set_genre_prompt, inputs=[gr.State(value="star_wars")], outputs=[instrumental_prompt])
+     star_wars_classical_btn.click(set_genre_prompt, inputs=[gr.State(value="star_wars_classical")], outputs=[instrumental_prompt])
+     nirvana_btn.click(set_genre_prompt, inputs=[gr.State(value="nirvana")], outputs=[instrumental_prompt])
+     wutang_btn.click(set_genre_prompt, inputs=[gr.State(value="wutang")], outputs=[instrumental_prompt])
+     milesdavis_btn.click(set_genre_prompt, inputs=[gr.State(value="milesdavis")], outputs=[instrumental_prompt])
+     gen_btn.click(
+         fn=show_render_wheel,
+         inputs=None,
+         outputs=[render_state],
+     ).then(
+         fn=generate_music,
+         inputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, volume_db, gr.State(None)],
+         outputs=[out_audio, status, render_state, renders_table],
+         show_progress="full"
+     )
+     clr_btn.click(
+         fn=clear_inputs,
+         inputs=None,
+         outputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, volume_db, render_state]
+     )
+
+ # ==============================
+ # FastAPI
+ # ==============================
+ from typing import Optional  # for the optional request fields below
+
+ app = FastAPI()
+
+ class MusicRequest(BaseModel):
+     prompt: Optional[str] = None
+     duration: int = 30
+     volume_db: float = -24.0
+     genre: Optional[str] = None
+
+ @app.get("/prompts/")
+ async def get_prompts():
+     global api_status
+     try:
+         prompts = list(config['Prompts'].keys())
+         return {"status": api_status, "prompts": prompts}
+     except Exception as e:
+         print(f"Error fetching prompts: {e}")
+         raise HTTPException(status_code=500, detail=f"Error fetching prompts: {e}")
+
+ @app.post("/generate-music/")
+ async def api_generate_music(request: MusicRequest):
+     global api_status
+     api_status = "rendering"
+     try:
+         # Draw the genre prompt once so the prompt text and its style stay in sync.
+         if request.genre:
+             instrumental_prompt, style = get_genre_prompt(request.genre)
+         elif request.prompt:
+             instrumental_prompt = request.prompt
+             keyword = extract_song_keyword(request.prompt)
+             style = keyword if keyword in prompt_variables['style'] else get_genre_prompt("nirvana")[1]
+         else:
+             instrumental_prompt, style = get_genre_prompt("nirvana")
+         if not instrumental_prompt.strip():
+             api_status = "idle"
+             raise HTTPException(status_code=400, detail="Invalid prompt or genre")
+
+         total_duration = max(request.duration, 30)
+         remaining = total_duration
+         audio_chunks = []
+         chunk_paths = []
+         continuation_prompt = None
+         chunk_index = 0
+
+         existing_titles = []
+         if os.path.exists(metadata_file):
+             with open(metadata_file, 'r') as f:
+                 songs_metadata = json.load(f)
+             existing_titles = [entry["title"] for entry in songs_metadata]
+         song_keyword = extract_song_keyword(request.prompt if request.prompt else instrumental_prompt)
+         title_base, band_name = generate_unique_title(existing_titles, request.genre if request.genre else "nirvana", song_keyword, style)
+
+         while remaining > 0:
+             target = min(30, remaining)
+             print_resource_usage(f"Before API Chunk {chunk_index + 1}")
+             try:
+                 audio_chunk, actual_dur = generate_chunk_oom_safe(
+                     musicgen_model, instrumental_prompt, continuation_prompt, 3.0, 50, 0.0, 0.8, target
+                 )
+                 audio_chunk = audio_chunk.cpu().to(dtype=torch.float32)
+                 # Coerce the model output into (2, samples) stereo
+                 if audio_chunk.dim() == 1:
+                     audio_chunk = torch.stack([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() == 2 and audio_chunk.shape[0] == 1:
+                     audio_chunk = torch.cat([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() == 2 and audio_chunk.shape[0] != 2:
+                     audio_chunk = audio_chunk[:1, :]
+                     audio_chunk = torch.cat([audio_chunk, audio_chunk], dim=0)
+                 elif audio_chunk.dim() > 2:
+                     audio_chunk = audio_chunk.view(2, -1)
+                 if audio_chunk.shape[0] != 2:
+                     raise ValueError(f"Expected stereo audio with shape (2, samples), got {audio_chunk.shape}")
+
+                 samples_per_second = musicgen_model.sample_rate
+                 tail_sec = 2
+                 tail_samples = min(int(tail_sec * samples_per_second), audio_chunk.shape[1] - 1 if audio_chunk.shape[1] > 1 else 1)
+                 continuation_prompt = audio_chunk[:, -tail_samples:].cpu() if tail_samples > 0 else None
+
+                 temp_wav_path = os.path.join(output_dir, f"temp_{random.randint(100, 999)}_{chunk_index}.wav")
+                 try:
+                     torchaudio.save(temp_wav_path, audio_chunk, musicgen_model.sample_rate, bits_per_sample=16)
+                     final_segment = AudioSegment.from_wav(temp_wav_path)
+                 finally:
+                     if os.path.exists(temp_wav_path):
+                         os.remove(temp_wav_path)
+                     del audio_chunk
+                     gc.collect()
+
+                 final_segment = apply_eq(final_segment)
+                 final_segment = apply_limiter(final_segment, max_db=request.volume_db, target_lufs=-16.0)
+                 if chunk_index == 0:
+                     final_segment = final_segment.fade_in(1000)
+                 if remaining - actual_dur <= 0:
+                     final_segment = final_segment.fade_out(1000)
+
+                 mp3_filename = f"{title_base.lower()}_{song_keyword}_{style}_{band_name}_chunk{chunk_index + 1}.mp3"
+                 mp3_path = os.path.join(output_dir, mp3_filename)
+                 final_segment.export(
+                     mp3_path,
+                     format="mp3",
+                     bitrate="64k",
+                     tags={"title": f"{title_base}_Chunk{chunk_index + 1}", "artist": "GhostAI"}
+                 )
+                 print(f"Saved API chunk {chunk_index + 1} to {mp3_path}")
+                 audio_chunks.append(final_segment)
+                 chunk_paths.append(mp3_path)
+
+                 metadata = {
+                     "title": f"{title_base}_Chunk{chunk_index + 1}",
+                     "filename": mp3_filename,
+                     "prompt": instrumental_prompt,
+                     "duration": actual_dur,
+                     "volume_db": request.volume_db,
+                     "target_lufs": -16.0,
+                     "timestamp": datetime.datetime.now().strftime("%Y%m%d_%H%M%S"),
+                     "file_path": mp3_path,
+                     "sample_rate": musicgen_model.sample_rate,
+                     "style": style,
+                     "band_name": band_name,
+                     "chunk_index": chunk_index + 1
+                 }
+                 update_metadata_storage(metadata)
+
+                 chunk_index += 1
+                 remaining -= actual_dur
+                 torch.cuda.empty_cache()
+                 gc.collect()
+                 print_resource_usage(f"After API Chunk {chunk_index}")
+             except Exception as e:
+                 print(f"ERROR: Failed to process API chunk {chunk_index + 1}: {e}")
+                 api_status = "idle"
+                 raise
+
+         if len(audio_chunks) > 1:
+             combined_segment = audio_chunks[0]
+             for segment in audio_chunks[1:]:
+                 combined_segment = combined_segment.append(segment, crossfade=500)
+             combined_mp3_filename = f"{title_base.lower()}_{song_keyword}_{style}_{band_name}_combined.mp3"
+             combined_mp3_path = os.path.join(output_dir, combined_mp3_filename)
+             combined_segment.export(
+                 combined_mp3_path,
+                 format="mp3",
+                 bitrate="64k",
+                 tags={"title": title_base, "artist": "GhostAI"}
+             )
+             print(f"Saved combined audio to {combined_mp3_path}")
+             metadata = {
+                 "title": title_base,
+                 "filename": combined_mp3_filename,
+                 "prompt": instrumental_prompt,
+                 "duration": total_duration,
+                 "volume_db": request.volume_db,
+                 "target_lufs": -16.0,
+                 "timestamp": datetime.datetime.now().strftime("%Y%m%d_%H%M%S"),
+                 "file_path": combined_mp3_path,
+                 "sample_rate": musicgen_model.sample_rate,
+                 "style": style,
+                 "band_name": band_name,
+                 "chunk_index": 0
+             }
+             update_metadata_storage(metadata)
+             del combined_segment, audio_chunks
+             gc.collect()
+             api_status = "idle"
+             return FileResponse(combined_mp3_path, media_type="audio/mpeg")
+         else:
+             print(f"Saved metadata to {metadata_file}")
+             del audio_chunks
+             gc.collect()
+             api_status = "idle"
+             return FileResponse(chunk_paths[0], media_type="audio/mpeg")
+     except HTTPException:
+         # Pass client errors (e.g. the 400 above) through unwrapped.
+         api_status = "idle"
+         raise
+     except Exception as e:
+         print(f"Error generating music: {e}")
+         api_status = "idle"
+         raise HTTPException(status_code=500, detail=f"Error generating music: {e}")
+     finally:
+         torch.cuda.synchronize()
+         torch.cuda.empty_cache()
+         gc.collect()
+
+ @app.get("/get-song/{filename}")
+ async def get_song(filename: str):
+     global api_status
+     file_path = os.path.join(output_dir, filename)
+     if not os.path.exists(file_path):
+         print(f"Error: Song file {filename} not found")
+         raise HTTPException(status_code=404, detail="Song file not found")
+     print(f"Serving file: {filename}")
+     return FileResponse(file_path, media_type="audio/mpeg", filename=filename)
+
+ @app.get("/status/")
+ async def get_status():
+     global api_status
+     return {"status": api_status}
+
+ def run_fastapi():
+     uvicorn.run(app, host="0.0.0.0", port=8000)
+
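For completeness, a hedged client sketch for the endpoints above (assumes the third-party `requests` package and the service listening on localhost:8000):

```python
# Illustrative client for the FastAPI service (pip install requests).
import requests

BASE = "http://localhost:8000"

# List the genre keys available in genre_prompts.ini
print(requests.get(f"{BASE}/prompts/").json())

# Render 60 seconds of techno and save the MP3 the server streams back
resp = requests.post(
    f"{BASE}/generate-music/",
    json={"genre": "detroit_techno", "duration": 60, "volume_db": -24.0},
    timeout=3600,  # generation is slow; allow a generous timeout
)
resp.raise_for_status()
with open("render.mp3", "wb") as f:
    f.write(resp.content)
```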
+ # ==============================
+ # Main
+ # ==============================
+ if __name__ == "__main__":
+     # Serve the API from a daemon thread rather than a forked process: the
+     # CUDA-backed model loaded above cannot be reused from a forked child.
+     import threading
+     fastapi_thread = threading.Thread(target=run_fastapi, daemon=True)
+     fastapi_thread.start()
+     try:
+         demo.launch(server_name="0.0.0.0", server_port=9999, share=False, inbrowser=True, show_error=True)
+     except Exception as e:
+         print(f"ERROR: Failed to launch Gradio: {e}")
+         sys.exit(1)
public/styles.css ADDED
@@ -0,0 +1,58 @@
+ /* styles.css */
+ /* High-contrast, accessible theme (no inline HTML in Python). ADA-focused: focus rings, large targets, readable contrast. */
+
+ :root { color-scheme: dark; }
+
+ * { box-sizing: border-box; }
+
+ body, .gradio-container {
+   background: #0B0B0D !important;
+   color: #FFFFFF !important;
+   font-family: ui-sans-serif, system-ui, -apple-system, Segoe UI, Roboto, Helvetica, Arial, "Apple Color Emoji", "Segoe UI Emoji";
+   line-height: 1.4;
+ }
+
+ h1, h2, h3, h4, h5, h6, label, p, span {
+   color: #FFFFFF !important;
+ }
+
+ .block, .panel, .wrap, .tabs, .tabitem, .form, .group {
+   background: #0B0B0D !important;
+ }
+
+ input, textarea, select {
+   background: #15151A !important;
+   color: #FFFFFF !important;
+   border: 1px solid #2B2B33 !important;
+   border-radius: 10px !important;
+   padding: 10px 12px !important;
+ }
+
+ button {
+   background: #1F6FEB !important;
+   color: #FFFFFF !important;
+   border: 2px solid transparent !important;
+   border-radius: 10px !important;
+   padding: 12px 14px !important;
+   font-weight: 700 !important;
+   min-height: 44px; /* touch target */
+ }
+
+ button:hover { background: #2D7BFF !important; }
+ button:focus { outline: 3px solid #00C853 !important; outline-offset: 2px; }
+
+ .group > * + * { margin-top: 8px; }
+ .row { gap: 8px; }
+
+ .audio-wrap, .audio-display, .output-html {
+   border: 1px solid #2B2B33 !important;
+   border-radius: 10px !important;
+ }
+
+ .slider > input[type="range"] { accent-color: #FFD600 !important; }
+
+ /* Large labels for readability */
+ label { font-size: 16px !important; font-weight: 700 !important; }
+
+ /* Subtle card border around control groups */
+ .group { border: 1px solid #2B2B33; border-radius: 12px; padding: 12px; }