ghostai1 committed on
Commit 1f59f51 · verified · 1 Parent(s): acf4dec

Upload 8 files
public/2requirements.txt ADDED
@@ -0,0 +1,10 @@
+ torch==2.1.0
+ torchaudio==2.1.0
+ audiocraft==0.0.1
+ gradio==4.44.0
+ gradio_client==1.3.0
+ fastapi==0.115.0
+ uvicorn==0.30.6
+ pydub==0.25.1
+ colorama==0.4.6
+ numpy==1.26.4
public/example.md ADDED
@@ -0,0 +1,46 @@
+ # 🎵 GhostAI Music Generator — Examples & Links
+
+ **Release:** v0.9.5-rc2
+
+ ## Quick Start
+ - Outputs save to **`./mp3/`**
+ - Logs live at **`./logs/ghostai.log`** (single file, auto-truncates at 5 MB)
+ - API server auto-starts at **`http://0.0.0.0:8555`**
+ - UI runs at **`http://0.0.0.0:9999`**
+
+ ## Endpoints
+ - Health: `GET /health`
+ - Status: `GET /status`
+ - Config: `GET /config`
+ - Render: `POST /render`
+ - Style prompts (examples):
+   - Metallica: `GET /set_classic_rock_prompt`
+   - Nirvana: `GET /set_nirvana_grunge_prompt`
+   - Pearl Jam: `GET /set_pearl_jam_grunge_prompt`
+   - Soundgarden: `GET /set_soundgarden_grunge_prompt`
+   - Foo Fighters: `GET /set_foo_fighters_prompt`
+   - RHCP: `GET /set_rhcp_prompt`
+   - Smashing Pumpkins: `GET /set_smashing_pumpkins_prompt`
+   - Radiohead: `GET /set_radiohead_prompt`
+   - Alt Rock: `GET /set_alternative_rock_prompt`
+   - Post-Punk: `GET /set_post_punk_prompt`
+   - Indie Rock: `GET /set_indie_rock_prompt`
+   - Funk Rock: `GET /set_funk_rock_prompt`
+   - Detroit Techno: `GET /set_detroit_techno_prompt`
+   - Deep House: `GET /set_deep_house_prompt`
+   - **Star Opera (Cinematic / “Star-Wars-style”)**: `GET /set_classical_star_opera_prompt`
+
+ > Call a style endpoint to get a **fresh, varied prompt** for that genre.
+ > Then pass it into `/render` along with any overrides (BPM, duration, etc.).
+
+ ## Links
+ - Main HF repo: https://huggingface.co/ghostai1/GHOSTSONAFB
+ - README: https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/README.md
+ - MusicGen Large: https://huggingface.co/facebook/musicgen-large
+ - MusicGen Medium: https://huggingface.co/facebook/musicgen-medium
+
+ ## Notes
+ - Built for 30xx GPUs with **12 GB+ VRAM** (SM80).
+ - **Chunking**: renders in 30-second segments with seamless crossfades.
+ - **DSP chain**: gate → stereo balance → RMS normalize (-23 dBFS) → EQ → fades → final SR.
+ - **Bitrate/SR/bit-depth** quick-buttons in the UI.
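The fetch-then-render flow in the note above can be sketched in Python. Only the `instrumental_prompt` body field is documented in these files; the `duration` override and the exact shape of the style-endpoint response are assumptions.

```python
import json
from urllib import request

API = "http://0.0.0.0:8555"  # API address from the Quick Start above

def build_render_request(instrumental_prompt: str, **overrides) -> request.Request:
    """Build a POST /render request. Only 'instrumental_prompt' is documented;
    override field names (duration, bpm, ...) are assumptions."""
    body = {"instrumental_prompt": instrumental_prompt, **overrides}
    return request.Request(
        f"{API}/render",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Typical flow (requires the server to be running):
#   prompt = request.urlopen(f"{API}/set_classic_rock_prompt").read().decode()
#   resp = request.urlopen(build_render_request(prompt, duration=60))
req = build_render_request("Instrumental thrash metal at 120 BPM.", duration=60)
```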
public/example_page.md CHANGED
@@ -1,25 +1,22 @@
- <!-- docs/example_page.md -->
- # GhostAI Music Generator — Quick Links
-
-
- <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/6421b1c68adc8881b974a89d/h9EL4d5bvjKFnC0v39386.mpga"></audio>
-
- - **MusicGen Large (Meta):** https://huggingface.co/facebook/musicgen-large
- - **GhostAI assets & scripts:**
-   - Repo hub: https://huggingface.co/ghostai1/GHOSTSONAFB
-   - Stable 12GB build (example): https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/STABLE12gb3060.py
-   - 30s large script: https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/stable12gblg30sec.py
-
- ## Notes
- - GPU: CUDA-capable, 12GB+ VRAM recommended.
- - The app exposes an API on `:8555`:
-   - `GET /genres` — list available presets from `prompts.ini`
-   - `GET /prompt/{name}` — generate a prompt string (query params: `bpm`, `drum_beat`, `synthesizer`, `rhythmic_steps`, `bass_style`, `guitar_style`)
- - **Aliases from INI** (examples):
-   - `/set_classic_rock_prompt` → Metallica
-   - `/set_nirvana_grunge_prompt` → Nirvana
-   - `/set_pearl_jam_grunge_prompt` → Pearl Jam
-   - `/set_soundgarden_grunge_prompt` Soundgarden
-   - `/set_foo_fighters_prompt` → Foo Fighters
-   - `/set_star_wars_prompt` → Cinematic Star Wars-style orchestral
- - `POST /render` — render an MP3. Body includes `instrumental_prompt` and optional overrides (duration, temperature, etc.).
+ <!-- docs/example_page.md -->
+ # GhostAI Music Generator — Quick Links
+
+ - **MusicGen Large (Meta):** https://huggingface.co/facebook/musicgen-large
+ - **GhostAI assets & scripts:**
+   - Repo hub: https://huggingface.co/ghostai1/GHOSTSONAFB
+   - Stable 12GB build (example): https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/STABLE12gb3060.py
+   - 30s large script: https://huggingface.co/ghostai1/GHOSTSONAFB/blob/main/stable12gblg30sec.py
+
+ ## Notes
+ - GPU: CUDA-capable, 12GB+ VRAM recommended.
+ - The app exposes an API on `:8555`:
+   - `GET /genres` — list available presets from `prompts.ini`
+   - `GET /prompt/{name}` — generate a prompt string (query params: `bpm`, `drum_beat`, `synthesizer`, `rhythmic_steps`, `bass_style`, `guitar_style`)
+ - **Aliases from INI** (examples):
+   - `/set_classic_rock_prompt` → Metallica
+   - `/set_nirvana_grunge_prompt` → Nirvana
+   - `/set_pearl_jam_grunge_prompt` → Pearl Jam
+   - `/set_soundgarden_grunge_prompt` → Soundgarden
+   - `/set_foo_fighters_prompt` → Foo Fighters
+   - `/set_star_wars_prompt` → Cinematic Star Wars-style orchestral
+ - `POST /render` — render an MP3. Body includes `instrumental_prompt` and optional overrides (duration, temperature, etc.).
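`GET /prompt/{name}` takes its options as query parameters, so building a request URL is a one-liner. Parameter names come from the notes above; the genre name `metallica` matches a `prompts.ini` section.

```python
from urllib.parse import urlencode

API = "http://0.0.0.0:8555"

def prompt_url(name: str, **params: str) -> str:
    """Build a GET /prompt/{name} URL from the documented query parameters
    (bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style)."""
    qs = urlencode(params)
    return f"{API}/prompt/{name}" + (f"?{qs}" if qs else "")

url = prompt_url("metallica", bpm="120", guitar_style="distorted")
```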
public/prompts.ini CHANGED
@@ -1,18 +1,22 @@
- # prompts.ini
- # Centralized prompt knobs for buttons + API aliases
- # Add/adjust sections; the app auto-loads buttons and endpoints.
+ ; All styles live here. Buttons + API endpoints are auto-generated from this file.
+ ; Add/modify sections and restart app to reflect changes.
 
  [metallica]
+ label=Metallica (Thrash)
  bpm_min=90
  bpm_max=140
  drum_beat=standard rock,techno kick
  synthesizer=none
  rhythmic_steps=steady steps,complex steps
  bass_style=deep bass,melodic bass
- guitar_style=distorted
+ guitar_style=distorted,downpicked,thrash riffing
+ mood=aggressive,driving,epic
+ structure=intro,verse,chorus,solo,outro
  api_name=/set_classic_rock_prompt
+ prompt_template=Instrumental thrash metal by Metallica{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
 
  [nirvana]
+ label=Nirvana (Grunge)
  bpm_min=100
  bpm_max=130
  drum_beat=standard rock
@@ -20,9 +24,13 @@ synthesizer=none
  rhythmic_steps=steady steps
  bass_style=deep bass
  guitar_style=distorted,clean
+ mood=raw,lo-fi,urgent
+ structure=intro,verse,chorus,bridge,outro
  api_name=/set_nirvana_grunge_prompt
+ prompt_template=Instrumental grunge by Nirvana{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
 
  [pearl_jam]
+ label=Pearl Jam (Grunge)
  bpm_min=100
  bpm_max=140
  drum_beat=standard rock
@@ -30,9 +38,13 @@ synthesizer=none
  rhythmic_steps=steady steps,syncopated steps
  bass_style=melodic bass
  guitar_style=clean,distorted
+ mood=emotional,anthemic,nostalgic
+ structure=intro,verse,chorus,solo,outro
  api_name=/set_pearl_jam_grunge_prompt
+ prompt_template=Instrumental grunge by Pearl Jam{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
 
  [soundgarden]
+ label=Soundgarden (Grunge/Alt Metal)
  bpm_min=90
  bpm_max=130
  drum_beat=standard rock
@@ -40,9 +52,13 @@ synthesizer=none
  rhythmic_steps=complex steps
  bass_style=deep bass
  guitar_style=distorted
+ mood=heavy,psychedelic,sludgy
+ structure=intro,riff,verse,chorus,outro
  api_name=/set_soundgarden_grunge_prompt
+ prompt_template=Instrumental grunge with heavy metal influences by Soundgarden{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
 
  [foo_fighters]
+ label=Foo Fighters (Alt Rock)
  bpm_min=110
  bpm_max=150
  drum_beat=standard rock
@@ -50,17 +66,147 @@ synthesizer=none
  rhythmic_steps=steady steps
  bass_style=melodic bass
  guitar_style=distorted,clean
+ mood=anthemic,energetic,hooky
+ structure=intro,verse,pre-chorus,chorus,outro
  api_name=/set_foo_fighters_prompt
+ prompt_template=Instrumental alternative rock with post-grunge influences by Foo Fighters{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [red_hot_chili_peppers]
+ label=Red Hot Chili Peppers (Funk Rock)
+ bpm_min=90
+ bpm_max=130
+ drum_beat=standard rock,funk groove
+ synthesizer=none
+ rhythmic_steps=steady steps,syncopated steps
+ bass_style=slap bass,melodic bass
+ guitar_style=clean,distorted
+ mood=funky,playful,energetic
+ structure=intro,verse,chorus,breakdown,outro
+ api_name=/set_rhcp_prompt
+ prompt_template=Instrumental funk rock by Red Hot Chili Peppers{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [smashing_pumpkins]
+ label=Smashing Pumpkins (Alt)
+ bpm_min=90
+ bpm_max=130
+ drum_beat=standard rock
+ synthesizer=lush synths,analog synth
+ rhythmic_steps=steady steps
+ bass_style=melodic bass
+ guitar_style=dreamy,clean,distorted
+ mood=dreamy,layered,melancholic
+ structure=intro,verse,chorus,bridge,outro
+ api_name=/set_smashing_pumpkins_prompt
+ prompt_template=Instrumental alternative rock by Smashing Pumpkins{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [radiohead]
+ label=Radiohead (Experimental)
+ bpm_min=80
+ bpm_max=120
+ drum_beat=standard rock
+ synthesizer=atmospheric synths,digital pad
+ rhythmic_steps=syncopated steps,steady steps
+ bass_style=hypnotic bass,melodic bass
+ guitar_style=clean,experimental
+ mood=moody,atmospheric,introspective
+ structure=intro,texture build,groove,release,outro
+ api_name=/set_radiohead_prompt
+ prompt_template=Instrumental experimental rock by Radiohead{synth}{bass}{drum}{guitar}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [alternative_rock]
+ label=Alternative Rock (Pixies)
+ bpm_min=100
+ bpm_max=140
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=steady steps
+ bass_style=melodic bass
+ guitar_style=distorted,clean
+ mood=dynamic,loud-quiet-loud,edgy
+ structure=intro,verse,chorus,quiet break,outro
+ api_name=/set_alternative_rock_prompt
+ prompt_template=Instrumental alternative rock by Pixies{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [post_punk]
+ label=Post-Punk (Joy Division)
+ bpm_min=90
+ bpm_max=130
+ drum_beat=precise drums,standard rock
+ synthesizer=analog synth
+ rhythmic_steps=steady steps
+ bass_style=driving bass,melodic bass
+ guitar_style=jangle,clean
+ mood=dark,minimal,propulsive
+ structure=intro,groove,build,climax,outro
+ api_name=/set_post_punk_prompt
+ prompt_template=Instrumental post-punk by Joy Division{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [indie_rock]
+ label=Indie Rock (Arctic Monkeys)
+ bpm_min=100
+ bpm_max=140
+ drum_beat=standard rock
+ synthesizer=none
+ rhythmic_steps=steady steps,syncopated steps
+ bass_style=groovy bass,melodic bass
+ guitar_style=jangle,clean,distorted
+ mood=swagger,catchy,modern
+ structure=intro,verse,chorus,bridge,outro
+ api_name=/set_indie_rock_prompt
+ prompt_template=Instrumental indie rock by Arctic Monkeys{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [funk_rock]
+ label=Funk Rock (RATM)
+ bpm_min=90
+ bpm_max=120
+ drum_beat=heavy drums,funk groove
+ synthesizer=none
+ rhythmic_steps=syncopated steps
+ bass_style=slap bass,deep bass
+ guitar_style=funky,distorted
+ mood=aggressive,groovy,radical
+ structure=intro,riff,verse,drop,chorus,outro
+ api_name=/set_funk_rock_prompt
+ prompt_template=Instrumental funk rock by Rage Against the Machine{guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [detroit_techno]
+ label=Detroit Techno
+ bpm_min=120
+ bpm_max=135
+ drum_beat=four-on-the-floor,techno kick
+ synthesizer=pulsing synths,analog synth,arpeggiated synth
+ rhythmic_steps=steady steps
+ bass_style=driving bass,deep bass
+ guitar_style=none
+ mood=machine-soul,minimal,hypnotic
+ structure=intro,groove,filter rise,peak,ride out
+ api_name=/set_detroit_techno_prompt
+ prompt_template=Instrumental Detroit techno by Juan Atkins{synth}{bass}{drum}{rhythm}, {mood} {section} at {bpm} BPM.
+
+ [deep_house]
+ label=Deep House
+ bpm_min=118
+ bpm_max=125
+ drum_beat=steady kick,four-on-the-floor
+ synthesizer=warm synths,analog pad
+ rhythmic_steps=steady steps
+ bass_style=deep bass,subby bass
+ guitar_style=none
+ mood=warm,late-night,groovy
+ structure=intro,groove,breakdown,drop,outro
+ api_name=/set_deep_house_prompt
+ prompt_template=Instrumental deep house by Larry Heard{synth}{bass}{drum}{rhythm}, {mood} {section} at {bpm} BPM.
 
- # New: Cinematic / Star Wars-inspired classical
- # Optional 'styles' enhances descriptive tags for orchestral color.
- [star_wars_classical]
+ [classical_star_wars]
+ label=Classical (Star Wars Suite)
  bpm_min=84
- bpm_max=126
- drum_beat=orchestral percussion
+ bpm_max=132
+ drum_beat=orchestral percussion,tympani
  synthesizer=none
- rhythmic_steps=steady steps,complex steps
- bass_style=contrabass ostinato
+ rhythmic_steps=martial march,staccato ostinato,triplet swells
+ bass_style=low brass,cellos,double basses
  guitar_style=none
- styles=heroic brass,sweeping strings,soaring horns,timpani rolls,choir pads
- api_name=/set_star_wars_prompt
+ mood=heroic,epic,cinematic
+ structure=fanfare,motif development,crescendo,reprise,coda
+ api_name=/set_classical_star_wars_prompt
+ prompt_template=Cinematic orchestral score{drum}{rhythm}, {mood} {section} with brass fanfares, soaring strings, woodwinds and timpani; no vocals; tempo reference {bpm} BPM.
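Each section now carries a `prompt_template` with `{...}` placeholders. The app's expansion code is not shown in this diff; a minimal sketch using `configparser` and `str.format`, with the filled-in values chosen here purely for illustration:

```python
import configparser
import random

# A trimmed copy of the [metallica] section above.
INI = """
[metallica]
bpm_min=90
bpm_max=140
mood=aggressive,driving,epic
structure=intro,verse,chorus,solo,outro
prompt_template=Instrumental thrash metal by Metallica{guitar}, {mood} {section} at {bpm} BPM.
"""

cfg = configparser.ConfigParser()
cfg.read_string(INI)
sec = cfg["metallica"]

random.seed(0)  # deterministic for the example
bpm = random.randint(sec.getint("bpm_min"), sec.getint("bpm_max"))
mood = random.choice(sec["mood"].split(","))
section = random.choice(sec["structure"].split(","))

# The ", value" fragment style for {guitar} etc. is an assumption.
prompt = sec["prompt_template"].format(
    guitar=", distorted guitars",
    mood=mood,
    section=section,
    bpm=bpm,
)
```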
public/publicapi.py CHANGED
@@ -1,4 +1,3 @@
- # app.py
  #!/usr/bin/env python3
  # -*- coding: utf-8 -*-
 
@@ -8,24 +7,25 @@ import gc
  import re
  import json
  import time
- import math
  import mmap
- import torch
  import random
  import logging
  import warnings
  import traceback
  import subprocess
- import tempfile
  import numpy as np
  import torchaudio
  import gradio as gr
  import gradio_client.utils
- import configparser
  from pydub import AudioSegment
  from datetime import datetime
  from pathlib import Path
- from typing import Optional, Tuple, Dict, Any, List
  from torch.cuda.amp import autocast
 
  from fastapi import FastAPI, HTTPException, Query
@@ -33,10 +33,29 @@ from fastapi.middleware.cors import CORSMiddleware
  from pydantic import BaseModel
  import uvicorn
  import threading
- from logging.handlers import RotatingFileHandler
 
  # ======================================================================================
- # RUNTIME, LOGGING, PATCHES
  # ======================================================================================
 
  _original_get_type = gradio_client.utils.get_type
@@ -51,43 +70,85 @@ os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
  torch.backends.cudnn.benchmark = False
  torch.backends.cudnn.deterministic = True
 
- LOG_DIR = "logs"
- MP3_DIR = "mp3"
- os.makedirs(LOG_DIR, exist_ok=True)
- os.makedirs(MP3_DIR, exist_ok=True)
 
- LOG_FILE = os.path.join(LOG_DIR, "ghostai_musicgen.log")
- logger = logging.getLogger("ghostai-musicgen")
- logger.setLevel(logging.DEBUG)
- logger.handlers = []  # prevent duplicate handlers on hot-reload
-
- file_handler = RotatingFileHandler(
-     LOG_FILE,
-     maxBytes=5 * 1024 * 1024,  # 5 MB cap
-     backupCount=0,             # single file only; truncate on rollover
-     encoding="utf-8",
-     delay=True
- )
  file_handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
- stdout_handler = logging.StreamHandler(sys.stdout)
- stdout_handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
- logger.addHandler(file_handler)
- logger.addHandler(stdout_handler)
 
  DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
  if DEVICE != "cuda":
-     logger.error("CUDA GPU is required. Exiting.")
      sys.exit(1)
  logger.info(f"GPU: {torch.cuda.get_device_name(0)}")
 
  # ======================================================================================
  # SETTINGS PERSISTENCE
  # ======================================================================================
 
- SETTINGS_FILE = "settings.json"
- PROMPTS_INI = "prompts.ini"
- STYLES_CSS = "styles.css"
-
  DEFAULT_SETTINGS: Dict[str, Any] = {
      "cfg_scale": 5.8,
      "top_k": 250,
@@ -109,119 +170,31 @@ DEFAULT_SETTINGS: Dict[str, Any] = {
      "instrumental_prompt": ""
  }
 
- def load_settings() -> Dict[str, Any]:
      try:
-         if os.path.exists(SETTINGS_FILE):
-             with open(SETTINGS_FILE, "r") as f:
                  data = json.load(f)
              for k, v in DEFAULT_SETTINGS.items():
                  data.setdefault(k, v)
              logger.info(f"Loaded settings from {SETTINGS_FILE}")
              return data
      except Exception as e:
-         logger.error(f"Settings load failed: {e}")
      return DEFAULT_SETTINGS.copy()
 
- def save_settings(s: Dict[str, Any]) -> None:
      try:
-         with open(SETTINGS_FILE, "w") as f:
-             json.dump(s, f, indent=2)
          logger.info(f"Saved settings to {SETTINGS_FILE}")
      except Exception as e:
-         logger.error(f"Settings save failed: {e}")
-
- SETTINGS = load_settings()
-
- # ======================================================================================
- # PROMPT CONFIG (prompts.ini)
- # ======================================================================================
 
- def _csv_list(s: str) -> List[str]:
-     if not s or s.strip().lower() == "none":
-         return []
-     return [x.strip() for x in s.split(",") if x.strip()]
-
- PROMPT_CFG = configparser.ConfigParser()
- if not os.path.exists(PROMPTS_INI):
-     PROMPT_CFG["metallica"] = {
-         "bpm_min": "90", "bpm_max": "140",
-         "drum_beat": "standard rock,techno kick",
-         "synthesizer": "none",
-         "rhythmic_steps": "steady steps,complex steps",
-         "bass_style": "deep bass,melodic bass",
-         "guitar_style": "distorted",
-         "api_name": "/set_classic_rock_prompt"
-     }
-     with open(PROMPTS_INI, "w") as f:
-         PROMPT_CFG.write(f)
- PROMPT_CFG.read(PROMPTS_INI)
-
- def list_genres() -> List[str]:
-     return PROMPT_CFG.sections()
-
- def get_api_aliases() -> Dict[str, str]:
-     out = {}
-     for sec in PROMPT_CFG.sections():
-         api_name = PROMPT_CFG.get(sec, "api_name", fallback="").strip()
-         if api_name:
-             out[api_name] = sec
-     return out
-
- def _humanize(name: str) -> str:
-     return name.replace("_", " ").title()
-
- def build_prompt_from_section(
-     section: str,
-     bpm: Optional[int] = None,
-     drum_beat: Optional[str] = None,
-     synthesizer: Optional[str] = None,
-     rhythmic_steps: Optional[str] = None,
-     bass_style: Optional[str] = None,
-     guitar_style: Optional[str] = None
- ) -> str:
-     if section not in PROMPT_CFG:
-         return f"Instrumental track at 120 BPM."
-     cfg = PROMPT_CFG[section]
-     bpm_min = cfg.getint("bpm_min", fallback=100)
-     bpm_max = cfg.getint("bpm_max", fallback=130)
-     bpm = bpm if bpm else random.randint(bpm_min, bpm_max)
-     bpm = max(bpm_min, min(bpm_max, bpm))
-
-     def pick(value: Optional[str], pool_key: str) -> str:
-         pool = _csv_list(cfg.get(pool_key, fallback=""))
-         if not pool:
-             return "" if (not value or value == "none") else f", {value}"
-         if (not value) or value == "none" or value not in pool:
-             choice = random.choice(pool)
-             return "" if choice == "none" else f", {choice}"
-         return f", {value}"
-
-     drum = pick(drum_beat, "drum_beat")
-     synth = pick(synthesizer, "synthesizer")
-     steps = pick(rhythmic_steps, "rhythmic_steps")
-     bass = pick(bass_style, "bass_style")
-     guitar = pick(guitar_style, "guitar_style")
-
-     styles_csv = cfg.get("styles", fallback="").strip()
-     styles_str = ""
-     if styles_csv:
-         styles = _csv_list(styles_csv)
-         if styles:
-             styles_str = ", " + ", ".join(styles)
-
-     label = _humanize(section)
-     if "star_wars" in section or "classical" in section:
-         return (
-             f"Cinematic orchestral score{styles_str}{drum}{synth}{steps}{bass}{guitar}, "
-             f"space-opera energy, sweeping strings, heroic brass, bold timpani at {bpm} BPM."
-         )
-     return (
-         f"Instrumental {label}{guitar}{bass}{drum}{synth}{steps} at {bpm} BPM, "
-         f"dynamic sections (intro/verse/chorus), cohesive song flow."
-     )
 
  # ======================================================================================
- # VRAM / DISK / CLEANUP
  # ======================================================================================
 
  def clean_memory() -> Optional[float]:
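The removed `build_prompt_from_section` above composes a prompt by picking each element from the section's comma-separated pool. Its `pick` helper behaves roughly like this standalone sketch (names mirror the removed code; the real helper reads the pool from the INI section):

```python
import random
from typing import List, Optional

def _csv_list(s: str) -> List[str]:
    # Empty or "none" means there is no pool, as in the removed helper.
    if not s or s.strip().lower() == "none":
        return []
    return [x.strip() for x in s.split(",") if x.strip()]

def pick(value: Optional[str], pool_csv: str) -> str:
    """Return ', value' if value is in the pool; otherwise a random pool entry."""
    pool = _csv_list(pool_csv)
    if not pool:
        return "" if (not value or value == "none") else f", {value}"
    if (not value) or value == "none" or value not in pool:
        choice = random.choice(pool)
        return "" if choice == "none" else f", {choice}"
    return f", {value}"
```

So an explicit, valid override is kept verbatim, while a missing or out-of-pool value falls back to a random choice, which is what gives repeated calls their variety.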
@@ -230,9 +203,12 @@ def clean_memory() -> Optional[float]:
          gc.collect()
          torch.cuda.ipc_collect()
          torch.cuda.synchronize()
-         return torch.cuda.memory_allocated() / 1024**2
      except Exception as e:
          logger.error(f"clean_memory failed: {e}")
          return None
 
  def check_vram():
@@ -246,6 +222,13 @@ def check_vram():
          used_mb, total_mb = map(int, re.findall(r'\d+', lines[1]))
          free_mb = total_mb - used_mb
          logger.info(f"VRAM: used {used_mb} MiB | free {free_mb} MiB | total {total_mb} MiB")
          return free_mb
      except Exception as e:
          logger.error(f"check_vram failed: {e}")
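`check_vram` shells out to `nvidia-smi` and scrapes the numbers with a regex. The parsing step in isolation, run here on a canned sample of the output (the real function obtains it via `subprocess`; the exact query flags are an assumption):

```python
import re

# Canned sample of `nvidia-smi --query-gpu=memory.used,memory.total
# --format=csv` output; in the app this comes from subprocess.
sample = "memory.used [MiB], memory.total [MiB]\n3211 MiB, 12288 MiB"

lines = sample.splitlines()
# lines[1] is the data row; pull both integers, as the app does.
used_mb, total_mb = map(int, re.findall(r"\d+", lines[1]))
free_mb = total_mb - used_mb
```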
@@ -263,209 +246,336 @@ def check_disk_space(path=".") -> bool:
263
  return False
264
 
265
  # ======================================================================================
266
- # MODEL LOAD
267
- # ======================================================================================
268
-
269
- try:
270
- from audiocraft.models import MusicGen
271
- except Exception as e:
272
- logger.error("audiocraft is required. pip install audiocraft")
273
- raise
274
-
275
- def load_model():
276
- free_vram = check_vram()
277
- if free_vram is not None and free_vram < 5000:
278
- logger.warning("Low free VRAM; consider closing other GPU apps.")
279
- clean_memory()
280
- local_model_path = "./models/musicgen-large"
281
- if not os.path.exists(local_model_path):
282
- logger.error(f"Missing weights at {local_model_path}")
283
- sys.exit(1)
284
- logger.info("Loading MusicGen (large)...")
285
- with autocast(dtype=torch.float16):
286
- model = MusicGen.get_pretrained(local_model_path, device=DEVICE)
287
- model.set_generation_params(duration=30, two_step_cfg=False)
288
- logger.info("MusicGen loaded.")
289
- return model
290
-
291
- musicgen_model = load_model()
292
-
293
- # ======================================================================================
294
- # AUDIO DSP
295
  # ======================================================================================
296
 
297
- def ensure_stereo(seg: AudioSegment, sample_rate=48000, sample_width=2) -> AudioSegment:
298
  try:
299
- if seg.channels != 2:
300
- seg = seg.set_channels(2)
301
- if seg.frame_rate != sample_rate:
302
- seg = seg.set_frame_rate(sample_rate)
303
- return seg
304
- except Exception:
305
- return seg
 
306
 
307
- def calculate_rms(seg: AudioSegment) -> float:
308
  try:
309
- samples = np.array(seg.get_array_of_samples(), dtype=np.float32)
310
  return float(np.sqrt(np.mean(samples**2)))
311
- except Exception:
 
312
  return 0.0
313
 
314
- def hard_limit(seg: AudioSegment, limit_db=-3.0, sample_rate=48000) -> AudioSegment:
315
  try:
316
- seg = ensure_stereo(seg, sample_rate, seg.sample_width)
317
- limit = 10 ** (limit_db / 20.0) * (2**23 if seg.sample_width == 3 else 32767)
318
- x = np.array(seg.get_array_of_samples(), dtype=np.float32)
319
- x = np.clip(x, -limit, limit).astype(np.int32 if seg.sample_width == 3 else np.int16)
320
- if len(x) % 2 != 0:
321
- x = x[:-1]
322
- return AudioSegment(x.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
323
- except Exception:
324
- return seg
 
 
 
 
 
 
325
 
326
- def rms_normalize(seg: AudioSegment, target_rms_db=-23.0, peak_limit_db=-3.0, sample_rate=48000) -> AudioSegment:
327
  try:
328
- seg = ensure_stereo(seg, sample_rate, seg.sample_width)
329
- target = 10 ** (target_rms_db / 20) * (2**23 if seg.sample_width == 3 else 32767)
330
- current = calculate_rms(seg)
331
- if current > 0:
332
- gain = target / current
333
- seg = seg.apply_gain(20 * np.log10(max(gain, 1e-6)))
334
- seg = hard_limit(seg, limit_db=peak_limit_db, sample_rate=sample_rate)
335
- return seg
336
- except Exception:
337
- return seg
 
338
 
339
- def balance_stereo(seg: AudioSegment, noise_threshold=-40, sample_rate=48000) -> AudioSegment:
340
  try:
341
- seg = ensure_stereo(seg, sample_rate, seg.sample_width)
342
- x = np.array(seg.get_array_of_samples(), dtype=np.float32)
343
- stereo = x.reshape(-1, 2)
 
 
344
  db = 20 * np.log10(np.abs(stereo) + 1e-10)
345
  mask = db > noise_threshold
346
  stereo = stereo * mask
347
- L, R = stereo[:, 0], stereo[:, 1]
348
- l_rms = np.sqrt(np.mean(L[L != 0] ** 2)) if np.any(L != 0) else 0
349
- r_rms = np.sqrt(np.mean(R[R != 0] ** 2)) if np.any(R != 0) else 0
 
350
  if l_rms > 0 and r_rms > 0:
351
  avg = (l_rms + r_rms) / 2
352
  stereo[:, 0] *= (avg / l_rms)
353
  stereo[:, 1] *= (avg / r_rms)
354
- out = stereo.flatten().astype(np.int32 if seg.sample_width == 3 else np.int16)
355
  if len(out) % 2 != 0:
356
  out = out[:-1]
357
- return AudioSegment(out.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
358
- except Exception:
359
- return seg
 
 
 
 
 
 
360
 
361
- def apply_noise_gate(seg: AudioSegment, threshold_db=-80, sample_rate=48000) -> AudioSegment:
362
  try:
363
- seg = ensure_stereo(seg, sample_rate, seg.sample_width)
364
- x = np.array(seg.get_array_of_samples(), dtype=np.float32)
365
- stereo = x.reshape(-1, 2)
 
 
366
  for _ in range(2):
367
  db = 20 * np.log10(np.abs(stereo) + 1e-10)
368
  mask = db > threshold_db
369
  stereo = stereo * mask
370
- out = stereo.flatten().astype(np.int32 if seg.sample_width == 3 else np.int16)
371
  if len(out) % 2 != 0:
372
  out = out[:-1]
373
- return AudioSegment(out.tobytes(), frame_rate=sample_rate, sample_width=seg.sample_width, channels=2)
374
- except Exception:
375
- return seg
 
 
 
 
 
 
376
 
377
- def apply_eq(seg: AudioSegment, sample_rate=48000) -> AudioSegment:
378
  try:
379
- seg = ensure_stereo(seg, sample_rate, seg.sample_width)
380
- seg = seg.high_pass_filter(20).low_pass_filter(8000)
381
- seg = seg - 3
382
- seg = seg - 3
383
- seg = seg - 10
384
- return seg
385
- except Exception:
386
- return seg
 
 
387
 
388
- def apply_fade(seg: AudioSegment, fade_in_ms=500, fade_out_ms=800) -> AudioSegment:
389
  try:
390
- seg = ensure_stereo(seg, seg.frame_rate, seg.sample_width)
391
- return seg.fade_in(fade_in_ms).fade_out(fade_out_ms)
392
- except Exception:
393
- return seg
 
 
394
 
395
- def _export_tensor_to_segment(audio: torch.Tensor, sr: int, bit_depth: int) -> Optional[AudioSegment]:
396
- tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
397
- tmp_path = tmp.name
398
- tmp.close()
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
399
  try:
400
- torchaudio.save(tmp_path, audio, sr, bits_per_sample=bit_depth)
401
  with open(tmp_path, "rb") as f:
402
  mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
403
  seg = AudioSegment.from_wav(tmp_path)
404
  mm.close()
405
  return seg
      except Exception as e:
-         logger.error(f"export tensor -> segment failed: {e}")
          return None
      finally:
          try:
-             if os.path.exists(tmp_path): os.unlink(tmp_path)
          except OSError:
              pass

- def _crossfade(seg_a: AudioSegment, seg_b: AudioSegment, overlap_ms: int, sr: int, bit_depth: int) -> AudioSegment:
      try:
-         seg_a = ensure_stereo(seg_a, sr, seg_a.sample_width)
-         seg_b = ensure_stereo(seg_b, sr, seg_b.sample_width)
          if overlap_ms <= 0 or len(seg_a) < overlap_ms or len(seg_b) < overlap_ms:
              return seg_a + seg_b

-         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as a_wav, \
-              tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as b_wav, \
-              tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as cf_wav:
-             a_path = a_wav.name
-             b_path = b_wav.name
-             cf_path = cf_wav.name
-
-         seg_a[-overlap_ms:].export(a_path, format="wav")
-         seg_b[:overlap_ms].export(b_path, format="wav")
-         a, sr_a = torchaudio.load(a_path)
-         b, sr_b = torchaudio.load(b_path)
-         if sr_a != sr:
-             a = torchaudio.functional.resample(a, sr_a, sr, lowpass_filter_width=64)
-         if sr_b != sr:
-             b = torchaudio.functional.resample(b, sr_b, sr, lowpass_filter_width=64)
-         n = min(a.shape[1], b.shape[1])
-         n = n - (n % 2)
-         if n <= 0:
-             for p in (a_path, b_path, cf_path):
                  try:
-                     if os.path.exists(p): os.unlink(p)
                  except OSError:
                      pass
-             return seg_a + seg_b
-         aw = a[:, :n].to(torch.float32)
-         bw = b[:, :n].to(torch.float32)
-         hann = torch.hann_window(n, periodic=False)
-         out = (aw * hann.flip(0) + bw * hann).clamp(-1.0, 1.0)
-         scale = (2**23 if bit_depth == 24 else 32767)
-         out_i = (out * scale).to(torch.int32 if bit_depth == 24 else torch.int16)
-         torchaudio.save(cf_path, out_i, sr, bits_per_sample=bit_depth)
-         blended = AudioSegment.from_wav(cf_path)
-         res = seg_a[:-overlap_ms] + blended + seg_b[overlap_ms:]
-         for p in (a_path, b_path, cf_path):
-             try:
-                 if os.path.exists(p): os.unlink(p)
-             except OSError:
-                 pass
-         return res
      except Exception as e:
-         logger.error(f"crossfade failed: {e}")
          return seg_a + seg_b
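The removed `_crossfade` blends the tail of one chunk into the head of the next with cosine-shaped gain ramps so no seam is audible. A minimal, dependency-free sketch of the same idea on plain float sample lists (a hypothetical `crossfade` helper, not the pydub/torchaudio pipeline used above):

```python
import math

def crossfade(a, b, overlap):
    """Blend the last `overlap` samples of `a` into the first `overlap` of `b`.

    Uses complementary raised-cosine (Hann-style) fade-out/fade-in ramps,
    so the two gains always sum to 1.0 across the overlap region.
    """
    if overlap <= 0 or len(a) < overlap or len(b) < overlap:
        return a + b  # too short to blend; just concatenate
    blended = []
    for i in range(overlap):
        # fade_in rises 0 -> 1; fade_out is its complement (1 -> 0)
        fade_in = 0.5 - 0.5 * math.cos(math.pi * i / (overlap - 1)) if overlap > 1 else 1.0
        fade_out = 1.0 - fade_in
        blended.append(a[len(a) - overlap + i] * fade_out + b[i] * fade_in)
    return a[:-overlap] + blended + b[overlap:]
```

Because the ramps are complementary, crossfading two constant-level signals keeps a constant level through the joint, which is the property the chunk merger relies on.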

- # ======================================================================================
- # GENERATION (30s chunks -> seamless)
- # ======================================================================================
-
  def generate_music(
      instrumental_prompt: str,
      cfg_scale: float,
@@ -481,178 +591,196 @@ def generate_music(
      guitar_style: str,
      target_volume: float,
      preset: str,
-     max_steps_ignored: str,
      vram_status_text: str,
      bitrate: str,
      output_sample_rate: str,
      bit_depth: str
  ) -> Tuple[Optional[str], str, str]:
      if not instrumental_prompt or not instrumental_prompt.strip():
-         return None, "⚠️ Enter a valid prompt.", vram_status_text

      try:
-         out_sr = int(output_sample_rate)
-         bit_depth_int = int(bit_depth)
-         sample_width = 3 if bit_depth_int == 24 else 2
-     except Exception:
-         return None, "❌ Invalid output SR or bit depth.", vram_status_text
-
-     if not check_disk_space("."):
-         return None, "⚠️ Low disk space (<1GB).", vram_status_text
-
-     CHUNK = 30
-     total_duration = max(30, min(int(total_duration), 180))
-     chunks = math.ceil(total_duration / CHUNK)
-     PROCESS_SR = 48000
-     OVERLAP = 0.20
-
-     musicgen_model.set_generation_params(
-         duration=CHUNK,
-         use_sampling=True,
-         top_k=int(top_k),
-         top_p=float(top_p),
-         temperature=float(temperature),
-         cfg_coef=float(cfg_scale),
-         two_step_cfg=False,
-     )

-     vram_status_text = f"Start VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
-     segments: List[AudioSegment] = []

-     seed = random.randint(0, 2**31 - 1)
-     random.seed(seed); np.random.seed(seed)
-     torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)

-     for i in range(chunks):
-         part = i + 1
-         dur = CHUNK if (i < chunks - 1) else (total_duration - CHUNK * (chunks - 1) or CHUNK)
-         logger.info(f"Generating chunk {part}/{chunks} ({dur}s)")
-         chunk_prompt = instrumental_prompt

-         try:
-             with torch.no_grad():
-                 with autocast(dtype=torch.float16):
-                     clean_memory()
-                     if i == 0:
-                         audio = musicgen_model.generate([chunk_prompt], progress=True)[0].cpu()
-                     else:
-                         prev = segments[-1]
-                         prev = apply_noise_gate(prev, -80, PROCESS_SR)
-                         prev = balance_stereo(prev, -40, PROCESS_SR)
-                         with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tprev:
-                             prev_path = tprev.name
-                         prev.export(prev_path, format="wav")
-                         tail, sr_prev = torchaudio.load(prev_path)
-                         if sr_prev != PROCESS_SR:
-                             tail = torchaudio.functional.resample(tail, sr_prev, PROCESS_SR, lowpass_filter_width=64)
-                         if tail.shape[0] != 2:
-                             tail = tail.repeat(2, 1)[:, :tail.shape[1]]
-                         try:
-                             os.unlink(prev_path)
-                         except OSError:
-                             pass
-                         tail = tail.to(DEVICE)[:, -int(PROCESS_SR * OVERLAP):]
-                         audio = musicgen_model.generate_continuation(
-                             prompt=tail,
-                             prompt_sample_rate=PROCESS_SR,
-                             descriptions=[chunk_prompt],
-                             progress=True
-                         )[0].cpu()
-             clean_memory()
-         except Exception as e:
-             logger.error(f"Chunk {part} generation failed: {e}")
-             return None, f"❌ Failed to generate chunk {part}: {e}", vram_status_text

          try:
-             if audio.shape[0] != 2:
-                 audio = audio.repeat(2, 1)[:, :audio.shape[1]]
-             audio = audio.to(torch.float32)
-             audio = torchaudio.functional.resample(audio, 32000, PROCESS_SR, lowpass_filter_width=64)
-             seg = _export_tensor_to_segment(audio, PROCESS_SR, bit_depth_int)
-             if seg is None:
-                 return None, f"❌ Audio conversion failed (chunk {part}).", vram_status_text
-             seg = ensure_stereo(seg, PROCESS_SR, sample_width)
-             seg = seg - 15
-             seg = apply_noise_gate(seg, -80, PROCESS_SR)
-             seg = balance_stereo(seg, -40, PROCESS_SR)
-             seg = rms_normalize(seg, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
-             seg = apply_eq(seg, PROCESS_SR)
-             seg = seg[:dur * 1000]
-             segments.append(seg)
-             del audio
-             vram_status_text = f"VRAM after chunk {part}: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
          except Exception as e:
-             logger.error(f"Post-process failed (chunk {part}): {e}")
-             return None, f"❌ Processing error (chunk {part}).", vram_status_text
-
-     if not segments:
-         return None, "❌ No audio generated.", vram_status_text
-
-     logger.info("Combining chunks...")
-     out = segments[0]
-     overlap_ms = int(OVERLAP * 1000)
-     for k in range(1, len(segments)):
-         out = _crossfade(out, segments[k], overlap_ms, PROCESS_SR, bit_depth_int)
-
-     out = out[:total_duration * 1000]
-     out = apply_noise_gate(out, -80, PROCESS_SR)
-     out = balance_stereo(out, -40, PROCESS_SR)
-     out = rms_normalize(out, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
-     out = apply_eq(out, PROCESS_SR)
-     out = apply_fade(out, 500, 800)
-     out = (out - 10).set_frame_rate(out_sr)
-
-     mp3_path = os.path.join(MP3_DIR, f"ghostai_music_{int(time.time())}.mp3")
-     try:
-         clean_memory()
-         out.export(mp3_path, format="mp3", bitrate=bitrate, tags={"title": "GhostAI Instrumental", "artist": "GhostAI"})
-     except Exception as e:
-         logger.error(f"MP3 export failed: {e}")
-         fb = os.path.join(MP3_DIR, f"ghostai_music_fallback_{int(time.time())}.mp3")
-         try:
-             out.export(fb, format="mp3", bitrate="128k")
-             mp3_path = fb
-         except Exception as ee:
-             return None, f"❌ Export failed: {ee}", vram_status_text

-     vram_status_text = f"Final VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
-     return mp3_path, "✅ Done! Seamless unified track rendered.", vram_status_text

- def generate_music_wrapper(*args):
-     try:
-         return generate_music(*args)
      finally:
          clean_memory()
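The loop above splits the requested length into fixed 30 s chunks, with the final chunk carrying the remainder (the `or CHUNK` fallback covers the exact-multiple case, where the remainder expression evaluates to 0). The same arithmetic in isolation, as a sketch:

```python
import math

CHUNK = 30  # seconds generated per MusicGen call

def chunk_durations(total_duration: int) -> list:
    """Clamp the request to [30, 180] s and split it into 30 s chunks;
    the last chunk takes the remainder (or a full chunk on exact multiples)."""
    total = max(30, min(int(total_duration), 180))
    chunks = math.ceil(total / CHUNK)
    durs = []
    for i in range(chunks):
        if i < chunks - 1:
            durs.append(CHUNK)
        else:
            # remainder; `or CHUNK` kicks in when total is an exact multiple of 30
            durs.append(total - CHUNK * (chunks - 1) or CHUNK)
    return durs
```

So a 75 s request becomes `[30, 30, 15]`, while out-of-range values are clamped before splitting.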

  # ======================================================================================
- # FASTAPI Status + Settings + Prompts + Render
  # ======================================================================================

- class RenderRequest(BaseModel):
-     instrumental_prompt: str
-     cfg_scale: Optional[float] = None
-     top_k: Optional[int] = None
-     top_p: Optional[float] = None
-     temperature: Optional[float] = None
-     total_duration: Optional[int] = None
-     bpm: Optional[int] = None
-     drum_beat: Optional[str] = None
-     synthesizer: Optional[str] = None
-     rhythmic_steps: Optional[str] = None
-     bass_style: Optional[str] = None
-     guitar_style: Optional[str] = None
-     target_volume: Optional[float] = None
-     preset: Optional[str] = None
-     max_steps: Optional[int] = None
-     bitrate: Optional[str] = None
-     output_sample_rate: Optional[str] = None
-     bit_depth: Optional[str] = None
-
- class SettingsUpdate(BaseModel):
-     settings: Dict[str, Any]
-
  BUSY_LOCK = threading.Lock()
  BUSY_FLAG = False
  CURRENT_JOB: Dict[str, Any] = {"id": None, "start": None}

  def set_busy(val: bool, job_id: Optional[str] = None):
@@ -662,9 +790,18 @@ def set_busy(val: bool, job_id: Optional[str] = None):
          if val:
              CURRENT_JOB["id"] = job_id or f"job_{int(time.time())}"
              CURRENT_JOB["start"] = time.time()
          else:
              CURRENT_JOB["id"] = None
              CURRENT_JOB["start"] = None

  def is_busy() -> bool:
      with BUSY_LOCK:
@@ -676,61 +813,89 @@ def job_elapsed() -> float:
              return 0.0
          return time.time() - CURRENT_JOB["start"]
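`BUSY_FLAG` and `CURRENT_JOB` are shared between the API thread and render calls, so every read and write goes through `BUSY_LOCK`. A stripped-down sketch of that single-job gate as a class (hypothetical names, same locking pattern):

```python
import threading
import time

class BusyGate:
    """Admit at most one job at a time; every state change holds the lock."""
    def __init__(self):
        self._lock = threading.Lock()
        self._busy = False
        self._job_id = None
        self._start = None

    def try_acquire(self, job_id: str) -> bool:
        with self._lock:
            if self._busy:
                return False  # a job is already running
            self._busy, self._job_id, self._start = True, job_id, time.time()
            return True

    def release(self):
        with self._lock:
            self._busy, self._job_id, self._start = False, None, None

    def status(self) -> dict:
        with self._lock:
            elapsed = time.time() - self._start if self._busy else 0.0
            return {"busy": self._busy, "job_id": self._job_id, "elapsed": elapsed}
```

The app uses module-level globals plus a lock instead of a class, but the invariant is the same: status reads and busy transitions are never interleaved.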

- fastapp = FastAPI(title="GhostAI Music Server", version="1.2")
  fastapp.add_middleware(
-     CORSMiddleware, allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"]
  )

  @fastapp.get("/health")
  def health():
-     return {"ok": True, "ts": int(time.time())}

  @fastapp.get("/status")
  def status():
-     return {"busy": is_busy(), "job_id": CURRENT_JOB["id"], "elapsed": job_elapsed()}

  @fastapp.get("/config")
  def get_config():
-     return {"defaults": SETTINGS}

  @fastapp.post("/settings")
  def set_settings(payload: SettingsUpdate):
      try:
-         s = SETTINGS.copy()
          s.update(payload.settings or {})
-         save_settings(s)
          for k, v in s.items():
-             SETTINGS[k] = v
          return {"ok": True, "saved": s}
      except Exception as e:
          raise HTTPException(status_code=400, detail=str(e))

- @fastapp.get("/genres")
- def api_genres():
-     return {"genres": list_genres()}

- @fastapp.post("/reload_prompts")
- def api_reload_prompts():
-     try:
-         PROMPT_CFG.read(PROMPTS_INI)
-         return {"ok": True, "genres": list_genres(), "aliases": get_api_aliases()}
-     except Exception as e:
-         raise HTTPException(status_code=500, detail=str(e))
-
- @fastapp.get("/prompt/{name}")
- def api_prompt(
-     name: str,
-     bpm: Optional[int] = Query(None),
-     drum_beat: Optional[str] = Query(None),
-     synthesizer: Optional[str] = Query(None),
-     rhythmic_steps: Optional[str] = Query(None),
-     bass_style: Optional[str] = Query(None),
-     guitar_style: Optional[str] = Query(None),
- ):
-     if name not in PROMPT_CFG:
-         raise HTTPException(status_code=404, detail=f"Unknown genre '{name}'.")
-     prompt = build_prompt_from_section(name, bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style)
-     return {"name": name, "prompt": prompt}

  @fastapp.post("/render")
  def render(req: RenderRequest):
@@ -739,7 +904,7 @@ def render(req: RenderRequest):
      job_id = f"render_{int(time.time())}"
      set_busy(True, job_id)
      try:
-         s = SETTINGS.copy()
          for k, v in req.dict().items():
              if v is not None:
                  s[k] = v
@@ -766,223 +931,283 @@ def render(req: RenderRequest):
          )
          if not mp3:
              raise HTTPException(status_code=500, detail=msg)
-         return {"ok": True, "job_id": job_id, "path": mp3, "status": msg, "vram": vram}
      finally:
          set_busy(False, None)
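`/render` overlays the request body on the saved defaults, treating `None` fields as "keep the default". That merge can be sketched without a running server (hypothetical dicts standing in for `SETTINGS` and the parsed `RenderRequest`):

```python
def merge_render_settings(defaults: dict, request: dict) -> dict:
    """Overlay non-None request fields on a copy of the saved defaults."""
    merged = defaults.copy()
    for key, value in request.items():
        if value is not None:
            merged[key] = value
    return merged

# Illustrative values only — not the app's actual defaults.
defaults = {"cfg_scale": 5.8, "total_duration": 60, "bitrate": "192k"}
request = {"cfg_scale": None, "total_duration": 120, "instrumental_prompt": "warm deep house"}
merged = merge_render_settings(defaults, request)
```

Copying the defaults first keeps the global settings untouched, so one render's overrides never leak into the next request.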

- for path, sec in get_api_aliases().items():
-     def _factory(section_name: str):
-         def _endpoint(
-             bpm: Optional[int] = Query(None),
-             drum_beat: Optional[str] = Query(None),
-             synthesizer: Optional[str] = Query(None),
-             rhythmic_steps: Optional[str] = Query(None),
-             bass_style: Optional[str] = Query(None),
-             guitar_style: Optional[str] = Query(None),
-         ):
-             prompt = build_prompt_from_section(section_name, bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style)
-             return {"name": section_name, "prompt": prompt}
-         return _endpoint
-     fastapp.add_api_route(path, _factory(sec), methods=["GET"])
-
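The `_factory` indirection in the removed alias registration matters: closures created directly in a loop capture the loop variable by reference (late binding), so every alias route would serve the last genre. Wrapping creation in a factory freezes each value per call. A minimal demonstration of the pitfall and the fix (hypothetical helpers, Python's standard closure semantics):

```python
def make_handlers_broken(names):
    # BUG: every lambda closes over the same `name` variable (late binding),
    # so after the loop they all see its final value.
    return [lambda: name for name in names]

def make_handlers_fixed(names):
    def factory(frozen):
        return lambda: frozen  # `frozen` is a fresh binding per factory call
    return [factory(name) for name in names]

broken = [f() for f in make_handlers_broken(["rock", "techno", "house"])]
fixed = [f() for f in make_handlers_fixed(["rock", "techno", "house"])]
```

This is why the route table passes `sec` through `_factory(sec)` instead of referencing the loop variable inside `_endpoint` directly.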
  def _start_fastapi():
      uvicorn.run(fastapp, host="0.0.0.0", port=8555, log_level="info")

  api_thread = threading.Thread(target=_start_fastapi, daemon=True)
  api_thread.start()
- logger.info("FastAPI server on http://0.0.0.0:8555")

  # ======================================================================================
- # GRADIO UI
  # ======================================================================================

- def get_latest_log():
      try:
-         if not os.path.exists(LOG_FILE):
-             return "No log file found."
-         with open(LOG_FILE, "r", encoding="utf-8", errors="ignore") as f:
-             return f.read()
      except Exception as e:
-         return f"Error reading log file: {e}"

- css_text = Path(STYLES_CSS).read_text(encoding="utf-8") if os.path.exists(STYLES_CSS) else ""

- logger.info("Building Gradio UI...")
- with gr.Blocks(css=css_text, analytics_enabled=False, title="GhostAI Music Generator") as demo:
-     gr.Markdown("# 🎵 GhostAI Music Generator")
-     gr.Markdown("Create instrumental tracks with fixed 30s chunking and seamless merges. Accessibility-first UI.")

-     with gr.Row():
-         with gr.Column():
              instrumental_prompt = gr.Textbox(
                  label="Instrumental Prompt",
-                 placeholder="Type your prompt or click a Genre button below",
                  lines=4,
-                 value=SETTINGS.get("instrumental_prompt", ""),
              )

-             genre_buttons = []
-             genre_sections = list_genres()
-             with gr.Group():
-                 gr.Markdown("### Genres (from prompts.ini)")
-                 for i in range(0, len(genre_sections), 4):
-                     with gr.Row():
-                         for sec in genre_sections[i:i+4]:
-                             btn = gr.Button(_humanize(sec))
-                             genre_buttons.append((btn, sec))
-
-             with gr.Group():
-                 gr.Markdown("### Generation Settings")
-                 cfg_scale = gr.Slider(1.0, 10.0, step=0.1, value=float(SETTINGS.get("cfg_scale", 5.8)), label="CFG Scale")
-                 top_k = gr.Slider(10, 500, step=10, value=int(SETTINGS.get("top_k", 250)), label="Top-K")
-                 top_p = gr.Slider(0.0, 1.0, step=0.01, value=float(SETTINGS.get("top_p", 0.95)), label="Top-P")
-                 temperature = gr.Slider(0.1, 2.0, step=0.01, value=float(SETTINGS.get("temperature", 0.90)), label="Temperature")
-                 total_duration = gr.Dropdown(choices=[30, 60, 90, 120, 180], value=int(SETTINGS.get("total_duration", 60)), label="Song Length (seconds)")
-
-                 bpm = gr.Slider(60, 180, step=1, value=int(SETTINGS.get("bpm", 120)), label="Tempo (BPM)")
-                 drum_beat = gr.Dropdown(choices=["none", "standard rock", "techno kick", "funk groove", "jazz swing", "orchestral percussion"], value=str(SETTINGS.get("drum_beat", "none")), label="Drum Beat")
-                 synthesizer = gr.Dropdown(choices=["none", "analog synth", "digital pad", "arpeggiated synth", "strings", "brass", "choir"], value=str(SETTINGS.get("synthesizer", "none")), label="Synthesizer / Section")
-                 rhythmic_steps = gr.Dropdown(choices=["none", "steady steps", "syncopated steps", "complex steps"], value=str(SETTINGS.get("rhythmic_steps", "none")), label="Rhythmic Steps")
-                 bass_style = gr.Dropdown(choices=["none", "slap bass", "deep bass", "melodic bass", "contrabass ostinato"], value=str(SETTINGS.get("bass_style", "none")), label="Bass Style")
-                 guitar_style = gr.Dropdown(choices=["none", "distorted", "clean", "jangle"], value=str(SETTINGS.get("guitar_style", "none")), label="Guitar Style")
-
-                 target_volume = gr.Slider(-30.0, -20.0, step=0.5, value=float(SETTINGS.get("target_volume", -23.0)), label="Target Loudness (dBFS RMS)")
-                 preset = gr.Dropdown(choices=["default", "rock", "techno", "grunge", "indie", "funk_rock"], value=str(SETTINGS.get("preset", "default")), label="Preset")
-                 max_steps = gr.Dropdown(choices=[1000, 1200, 1300, 1500], value=int(SETTINGS.get("max_steps", 1500)), label="Max Steps (info)")
-                 bitrate_state = gr.State(value=str(SETTINGS.get("bitrate", "192k")))
-                 sample_rate_state = gr.State(value=str(SETTINGS.get("output_sample_rate", "48000")))
-                 bit_depth_state = gr.State(value=str(SETTINGS.get("bit_depth", "16")))

              with gr.Row():
-                 br128 = gr.Button("Bitrate 128k")
-                 br192 = gr.Button("Bitrate 192k")
-                 br320 = gr.Button("Bitrate 320k")
              with gr.Row():
-                 sr22 = gr.Button("SR 22.05k")
-                 sr44 = gr.Button("SR 44.1k")
-                 sr48 = gr.Button("SR 48k")
              with gr.Row():
-                 bd16 = gr.Button("16-bit")
-                 bd24 = gr.Button("24-bit")

              gen_btn = gr.Button("Generate Music 🚀")
              clr_btn = gr.Button("Clear 🧹")
              save_btn = gr.Button("Save Settings 💾")
              load_btn = gr.Button("Load Settings 📂")
              reset_btn = gr.Button("Reset Defaults ♻️")

-         with gr.Column():
              gr.Markdown("### Output")
-             out_audio = gr.Audio(label="Generated Track (saved in ./mp3)", type="filepath")
              status_box = gr.Textbox(label="Status", interactive=False)
              vram_box = gr.Textbox(label="VRAM Usage", interactive=False, value="")
-             log_btn = gr.Button("View Log 📋")
-             log_output = gr.Textbox(label="Log Contents", lines=18, interactive=False)

-     def on_genre_click(sec, bpm_v, drum_v, synth_v, steps_v, bass_v, guitar_v):
-         return build_prompt_from_section(sec, bpm_v, drum_v, synth_v, steps_v, bass_v, guitar_v)

-     for btn, sec in genre_buttons:
-         btn.click(
-             on_genre_click,
-             inputs=[gr.State(sec), bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style],
-             outputs=instrumental_prompt
          )

-     br128.click(lambda: "128k", outputs=bitrate_state)
-     br192.click(lambda: "192k", outputs=bitrate_state)
-     br320.click(lambda: "320k", outputs=bitrate_state)
-     sr22.click(lambda: "22050", outputs=sample_rate_state)
-     sr44.click(lambda: "44100", outputs=sample_rate_state)
-     sr48.click(lambda: "48000", outputs=sample_rate_state)
-     bd16.click(lambda: "16", outputs=bit_depth_state)
-     bd24.click(lambda: "24", outputs=bit_depth_state)
-
-     gen_btn.click(
-         generate_music_wrapper,
-         inputs=[
-             instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
-             drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style,
-             target_volume, preset, max_steps, vram_box, bitrate_state, sample_rate_state, bit_depth_state
-         ],
-         outputs=[out_audio, status_box, vram_box]
-     )
-
-     def clear_inputs():
-         s = DEFAULT_SETTINGS.copy()
-         return (
-             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
-             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
-             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
-             s["bitrate"], s["output_sample_rate"], s["bit_depth"]
          )
-     clr_btn.click(
-         clear_inputs,
-         outputs=[
-             instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
-             drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
-             preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state
-         ]
-     )

-     def _save_action(ip, cs, tk, tp, tt, dur, bpm_v, d, s, rs, b, g, tv, pr, ms, br, sr, bd):
-         data = {
-             "instrumental_prompt": ip, "cfg_scale": float(cs), "top_k": int(tk), "top_p": float(tp),
-             "temperature": float(tt), "total_duration": int(dur), "bpm": int(bpm_v),
-             "drum_beat": str(d), "synthesizer": str(s), "rhythmic_steps": str(rs),
-             "bass_style": str(b), "guitar_style": str(g), "target_volume": float(tv),
-             "preset": str(pr), "max_steps": int(ms), "bitrate": str(br),
-             "output_sample_rate": str(sr), "bit_depth": str(bd)
-         }
-         save_settings(data)
-         for k, v in data.items(): SETTINGS[k] = v
-         return "✅ Settings saved."
-     save_btn.click(
-         _save_action,
-         inputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
                  drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
-                 preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state],
-         outputs=status_box
-     )

-     def _load_action():
-         s = load_settings()
-         for k, v in s.items(): SETTINGS[k] = v
-         return (
-             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
-             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
-             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
-             s["bitrate"], s["output_sample_rate"], s["bit_depth"],
-             "✅ Settings loaded."
          )
-     load_btn.click(_load_action,
-         outputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
-                  drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
-                  preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box]
-     )

-     def _reset_action():
-         s = DEFAULT_SETTINGS.copy()
-         save_settings(s)
-         for k, v in s.items(): SETTINGS[k] = v
-         return (
-             s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
-             s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
-             s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
-             s["bitrate"], s["output_sample_rate"], s["bit_depth"], "✅ Defaults restored."
          )
-     reset_btn.click(_reset_action,
-         outputs=[instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
-                  drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
-                  preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box]
-     )

-     log_btn.click(get_latest_log, outputs=log_output)

- logger.info("Launching Gradio UI at http://0.0.0.0:9999 ...")
  try:
-     demo.launch(server_name="0.0.0.0", server_port=9999, share=False, inbrowser=False, show_error=True)
  except Exception as e:
-     logger.error(f"Gradio launch failed: {e}")
      logger.error(traceback.format_exc())
      sys.exit(1)

  #!/usr/bin/env python3
  # -*- coding: utf-8 -*-

  import re
  import json
  import time
  import mmap
+ import math
+ import tempfile
  import random
  import logging
  import warnings
  import traceback
  import subprocess
+ import configparser
+ from typing import Optional, Tuple, Dict, Any, List
+
  import numpy as np
+ import torch
  import torchaudio
  import gradio as gr
  import gradio_client.utils

  from pydub import AudioSegment
  from datetime import datetime
  from pathlib import Path

  from torch.cuda.amp import autocast

  from fastapi import FastAPI, HTTPException, Query

  from pydantic import BaseModel
  import uvicorn
  import threading
+
+ from colorama import init as colorama_init, Fore, Style
+
+ # ======================================================================================
+ # RELEASE / PATHS
+ # ======================================================================================
+
+ RELEASE = "v1.7.0"
+ APP_TITLE = f"GhostAI Music Generator • {RELEASE}"
+
+ BASE_DIR = Path(__file__).parent.resolve()
+ LOG_DIR = BASE_DIR / "logs"
+ MP3_DIR = BASE_DIR / "mp3"
+ CSS_FILE = BASE_DIR / "styles.css"
+ PROMPTS_FILE = BASE_DIR / "prompts.ini"
+ EXAMPLE_MD = BASE_DIR / "example.md"
+ SETTINGS_FILE = BASE_DIR / "settings.json"
+
+ LOG_DIR.mkdir(parents=True, exist_ok=True)
+ MP3_DIR.mkdir(parents=True, exist_ok=True)

  # ======================================================================================
+ # PATCHES & RUNTIME SETUP
  # ======================================================================================

  _original_get_type = gradio_client.utils.get_type

  torch.backends.cudnn.benchmark = False
  torch.backends.cudnn.deterministic = True

+ # ======================================================================================
+ # LOGGING (SINGLE FILE, MAX 5MB, AUTO-TRIM)
+ # ======================================================================================
+
+ colorama_init(autoreset=True)

+ LOG_FILE = LOG_DIR / "musicgen.log"
+ MAX_LOG_BYTES = 5 * 1024 * 1024  # 5 MB
+
+ class TrimmingFileHandler(logging.FileHandler):
+     def emit(self, record):
+         try:
+             super().emit(record)
+             self._trim_if_needed()
+         except Exception:
+             pass
+
+     def _trim_if_needed(self):
+         try:
+             if self.stream:
+                 self.stream.flush()
+             size = LOG_FILE.stat().st_size if LOG_FILE.exists() else 0
+             if size <= MAX_LOG_BYTES:
+                 return
+             keep = int(1.5 * 1024 * 1024)
+             with open(LOG_FILE, "rb") as f:
+                 if size > keep:
+                     f.seek(-keep, 2)
+                     tail = f.read()
+                 else:
+                     tail = f.read()
+             with open(LOG_FILE, "wb") as f:
+                 f.write(b"[log trimmed]\n")
+                 f.write(tail)
+         except Exception:
+             pass
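The handler keeps a single log file bounded by rewriting it with only the most recent ~1.5 MB once it passes 5 MB, prefixed with a marker so readers know earlier lines were dropped. The tail-keeping byte logic, extracted as a pure function for illustration (a hypothetical `trim_tail`, not part of the app itself):

```python
def trim_tail(data: bytes, max_bytes: int, keep: int) -> bytes:
    """Return data unchanged while under max_bytes; otherwise keep only the
    last `keep` bytes, prefixed with a trim marker."""
    if len(data) <= max_bytes:
        return data
    tail = data[-keep:] if len(data) > keep else data
    return b"[log trimmed]\n" + tail
```

Trimming from the front rather than rotating files means there is always exactly one log file to tail, at the cost of losing the oldest entries.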
+
+ class ColorFormatter(logging.Formatter):
+     COLORS = {
+         "DEBUG": Fore.BLUE,
+         "INFO": Fore.GREEN,
+         "WARNING": Fore.YELLOW,
+         "ERROR": Fore.RED,
+         "CRITICAL": Fore.RED + Style.BRIGHT,
+     }
+     def format(self, record):
+         levelname = record.levelname
+         color = self.COLORS.get(levelname, "")
+         reset = Style.RESET_ALL
+         record.levelname = f"{color}{levelname}{reset}"
+         return super().format(record)
+
+ console_handler = logging.StreamHandler(sys.stdout)
+ console_handler.setLevel(logging.DEBUG)
+ console_handler.setFormatter(ColorFormatter("%(asctime)s [%(levelname)s] %(message)s"))
+
+ file_handler = TrimmingFileHandler(LOG_FILE, mode="a", encoding="utf-8", delay=False)
+ file_handler.setLevel(logging.DEBUG)
  file_handler.setFormatter(logging.Formatter("%(asctime)s [%(levelname)s] %(message)s"))
+
+ logging.basicConfig(level=logging.DEBUG, handlers=[console_handler, file_handler])
+ logger = logging.getLogger("ghostai-musicgen")
+ logger.info(f"Starting GhostAI Music Generator {RELEASE}")
+
+ # ======================================================================================
+ # DEVICE
+ # ======================================================================================

  DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
  if DEVICE != "cuda":
+     logger.error("CUDA is required. Exiting.")
      sys.exit(1)
  logger.info(f"GPU: {torch.cuda.get_device_name(0)}")
+ logger.info("Precision: fp16 model, fp32 CPU audio ops")

  # ======================================================================================
  # SETTINGS PERSISTENCE
  # ======================================================================================

  DEFAULT_SETTINGS: Dict[str, Any] = {
      "cfg_scale": 5.8,
      "top_k": 250,

      "instrumental_prompt": ""
  }

+ def load_settings_from_file() -> Dict[str, Any]:
      try:
+         if SETTINGS_FILE.exists():
+             with open(SETTINGS_FILE, "r", encoding="utf-8") as f:
                  data = json.load(f)
              for k, v in DEFAULT_SETTINGS.items():
                  data.setdefault(k, v)
              logger.info(f"Loaded settings from {SETTINGS_FILE}")
              return data
      except Exception as e:
+         logger.error(f"Failed reading {SETTINGS_FILE}: {e}")
      return DEFAULT_SETTINGS.copy()

+ def save_settings_to_file(settings: Dict[str, Any]) -> None:
      try:
+         with open(SETTINGS_FILE, "w", encoding="utf-8") as f:
+             json.dump(settings, f, indent=2)
          logger.info(f"Saved settings to {SETTINGS_FILE}")
      except Exception as e:
+         logger.error(f"Failed saving {SETTINGS_FILE}: {e}")

+ CURRENT_SETTINGS = load_settings_from_file()

  # ======================================================================================
+ # VRAM / DISK / MEMORY
  # ======================================================================================

  def clean_memory() -> Optional[float]:

      gc.collect()
      torch.cuda.ipc_collect()
      torch.cuda.synchronize()
+     vram_mb = torch.cuda.memory_allocated() / 1024**2
+     logger.debug(f"Memory cleaned. VRAM={vram_mb:.2f} MB")
+     return vram_mb
  except Exception as e:
      logger.error(f"clean_memory failed: {e}")
+     logger.error(traceback.format_exc())
      return None

  def check_vram():

      used_mb, total_mb = map(int, re.findall(r'\d+', lines[1]))
      free_mb = total_mb - used_mb
      logger.info(f"VRAM: used {used_mb} MiB | free {free_mb} MiB | total {total_mb} MiB")
+     if free_mb < 5000:
+         logger.warning(f"Low free VRAM ({free_mb} MiB). Running processes:")
+         procs = subprocess.run(
+             ['nvidia-smi', '--query-compute-apps=pid,used_memory', '--format=csv'],
+             capture_output=True, text=True
+         )
+         logger.info(f"\n{procs.stdout}")
      return free_mb
  except Exception as e:
      logger.error(f"check_vram failed: {e}")

      return False

  # ======================================================================================
+ # AUDIO UTILS (CPU)
  # ======================================================================================

+ def ensure_stereo(audio_segment: AudioSegment, sample_rate=48000, sample_width=2) -> AudioSegment:
      try:
+         if audio_segment.channels != 2:
+             audio_segment = audio_segment.set_channels(2)
+         if audio_segment.frame_rate != sample_rate:
+             audio_segment = audio_segment.set_frame_rate(sample_rate)
+         return audio_segment
+     except Exception as e:
+         logger.error(f"ensure_stereo failed: {e}")
+         return audio_segment

+ def calculate_rms(segment: AudioSegment) -> float:
      try:
+         samples = np.array(segment.get_array_of_samples(), dtype=np.float32)
          return float(np.sqrt(np.mean(samples**2)))
+     except Exception as e:
+         logger.error(f"calculate_rms failed: {e}")
          return 0.0

+ def hard_limit(audio_segment: AudioSegment, limit_db=-3.0, sample_rate=48000) -> AudioSegment:
      try:
+         audio_segment = ensure_stereo(audio_segment, sample_rate, audio_segment.sample_width)
+         limit = 10 ** (limit_db / 20.0) * (2**23 if audio_segment.sample_width == 3 else 32767)
+         samples = np.array(audio_segment.get_array_of_samples(), dtype=np.float32)
+         samples = np.clip(samples, -limit, limit).astype(np.int32 if audio_segment.sample_width == 3 else np.int16)
+         if len(samples) % 2 != 0:
+             samples = samples[:-1]
+         return AudioSegment(
+             samples.tobytes(),
+             frame_rate=sample_rate,
+             sample_width=audio_segment.sample_width,
+             channels=2
+         )
+     except Exception as e:
+         logger.error(f"hard_limit failed: {e}")
+         return audio_segment

+ def rms_normalize(segment: AudioSegment, target_rms_db=-23.0, peak_limit_db=-3.0, sample_rate=48000) -> AudioSegment:
      try:
+         segment = ensure_stereo(segment, sample_rate, segment.sample_width)
+         target_rms = 10 ** (target_rms_db / 20) * (2**23 if segment.sample_width == 3 else 32767)
+         current_rms = calculate_rms(segment)
+         if current_rms > 0:
+             gain_factor = target_rms / current_rms
+             segment = segment.apply_gain(20 * np.log10(max(gain_factor, 1e-6)))
+         segment = hard_limit(segment, limit_db=peak_limit_db, sample_rate=sample_rate)
+         return segment
+     except Exception as e:
+         logger.error(f"rms_normalize failed: {e}")
+         return segment
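`rms_normalize` converts the target dBFS level into a linear RMS amplitude against full scale, then applies the ratio to the current RMS as a gain in dB. The dB arithmetic on its own, assuming 16-bit full scale (a sketch alongside the pydub path above, not a replacement for it):

```python
import math

FULL_SCALE_16BIT = 32767  # peak amplitude of signed 16-bit audio

def rms_gain_db(current_rms: float, target_rms_db: float) -> float:
    """Gain in dB needed to move current_rms to target_rms_db (dBFS RMS)."""
    target_rms = 10 ** (target_rms_db / 20) * FULL_SCALE_16BIT
    return 20 * math.log10(target_rms / current_rms)
```

A signal already at full-scale RMS needs exactly `target_rms_db` of attenuation; quiet signals get a positive (boosting) gain, which is why the pipeline follows normalization with a hard limiter at -3 dB.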
302
 
def balance_stereo(audio_segment: AudioSegment, noise_threshold=-40, sample_rate=48000) -> AudioSegment:
    try:
        audio_segment = ensure_stereo(audio_segment, sample_rate, audio_segment.sample_width)
        samples = np.array(audio_segment.get_array_of_samples(), dtype=np.float32)
        if audio_segment.channels != 2:
            return audio_segment
        stereo = samples.reshape(-1, 2)
        db = 20 * np.log10(np.abs(stereo) + 1e-10)
        mask = db > noise_threshold
        stereo = stereo * mask
        left = stereo[:, 0]
        right = stereo[:, 1]
        l_rms = np.sqrt(np.mean(left[left != 0] ** 2)) if np.any(left != 0) else 0
        r_rms = np.sqrt(np.mean(right[right != 0] ** 2)) if np.any(right != 0) else 0
        if l_rms > 0 and r_rms > 0:
            avg = (l_rms + r_rms) / 2
            stereo[:, 0] *= (avg / l_rms)
            stereo[:, 1] *= (avg / r_rms)
        out = stereo.flatten().astype(np.int32 if audio_segment.sample_width == 3 else np.int16)
        if len(out) % 2 != 0:
            out = out[:-1]
        return AudioSegment(
            out.tobytes(),
            frame_rate=sample_rate,
            sample_width=audio_segment.sample_width,
            channels=2
        )
    except Exception as e:
        logger.error(f"balance_stereo failed: {e}")
        return audio_segment

def apply_noise_gate(audio_segment: AudioSegment, threshold_db=-80, sample_rate=48000) -> AudioSegment:
    try:
        audio_segment = ensure_stereo(audio_segment, sample_rate, audio_segment.sample_width)
        samples = np.array(audio_segment.get_array_of_samples(), dtype=np.float32)
        if audio_segment.channels != 2:
            return audio_segment
        stereo = samples.reshape(-1, 2)
        for _ in range(2):
            db = 20 * np.log10(np.abs(stereo) + 1e-10)
            mask = db > threshold_db
            stereo = stereo * mask
        out = stereo.flatten().astype(np.int32 if audio_segment.sample_width == 3 else np.int16)
        if len(out) % 2 != 0:
            out = out[:-1]
        return AudioSegment(
            out.tobytes(),
            frame_rate=sample_rate,
            sample_width=audio_segment.sample_width,
            channels=2
        )
    except Exception as e:
        logger.error(f"apply_noise_gate failed: {e}")
        return audio_segment

def apply_eq(segment: AudioSegment, sample_rate=48000) -> AudioSegment:
    try:
        segment = ensure_stereo(segment, sample_rate, segment.sample_width)
        segment = segment.high_pass_filter(20)
        segment = segment.low_pass_filter(8000)
        segment = segment - 3
        segment = segment - 3
        segment = segment - 10
        return segment
    except Exception as e:
        logger.error(f"apply_eq failed: {e}")
        return segment

def apply_fade(segment: AudioSegment, fade_in_duration=500, fade_out_duration=800) -> AudioSegment:
    try:
        segment = ensure_stereo(segment, segment.frame_rate, segment.sample_width)
        segment = segment.fade_in(fade_in_duration).fade_out(fade_out_duration)
        return segment
    except Exception as e:
        logger.error(f"apply_fade failed: {e}")
        return segment

# ======================================================================================
# PROMPTS.INI LOADING / VARIABLE PROMPT BUILDER
# ======================================================================================

def _csv(v: str) -> List[str]:
    if not v or v.strip().lower() == "none":
        return []
    return [x.strip() for x in v.split(",") if x.strip()]

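# Illustrative behaviour of _csv (hypothetical inputs):
#   _csv("kick, snare , ") -> ["kick", "snare"]
#   _csv("none")           -> []
#   _csv("")               -> []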
def load_profiles_from_ini(prompts_file: Path) -> Dict[str, Dict[str, Any]]:
    if not prompts_file.exists():
        raise FileNotFoundError(f"Required prompts file missing: {prompts_file}")
    cfg = configparser.ConfigParser()
    cfg.read(prompts_file, encoding="utf-8")
    profiles: Dict[str, Dict[str, Any]] = {}
    for sect in cfg.sections():
        s = cfg[sect]
        profiles[sect] = {
            "label": s.get("label", sect.replace("_", " ").title()),
            "bpm_min": s.getint("bpm_min", 100),
            "bpm_max": s.getint("bpm_max", 140),
            "drum_beat": _csv(s.get("drum_beat", "none")),
            "synthesizer": _csv(s.get("synthesizer", "none")),
            "rhythmic_steps": _csv(s.get("rhythmic_steps", "steady steps")),
            "bass_style": _csv(s.get("bass_style", "melodic bass")),
            "guitar_style": _csv(s.get("guitar_style", "clean")),
            "mood": _csv(s.get("mood", "energetic")),
            "structure": _csv(s.get("structure", "intro,verse,chorus,outro")),
            "api_name": s.get("api_name", f"/set_{sect}_prompt"),
            "prompt_template": s.get(
                "prompt_template",
                "Instrumental track {guitar}{bass}{drum}{synth}{rhythm}, {mood} {section} at {bpm} BPM."
            ),
        }
    if not profiles:
        raise RuntimeError("No profiles found in prompts.ini")
    return profiles

def rand_choice(lst: List[str], fallback: str = "") -> str:
    if not lst:
        return fallback
    return random.choice(lst)

def assemble_prompt(profiles: Dict[str, Dict[str, Any]], style_key: str, bpm_hint: int, chunk_idx: int) -> str:
    prof = profiles.get(style_key)
    if not prof:
        return "Instrumental track, energetic, intro at 120 BPM."
    bpm_min, bpm_max = prof["bpm_min"], prof["bpm_max"]
    bpm = bpm_hint if bpm_hint != 120 else random.randint(bpm_min, bpm_max)
    drum = rand_choice(prof["drum_beat"])
    synth = rand_choice(prof["synthesizer"])
    rhythm = rand_choice(prof["rhythmic_steps"])
    bass = rand_choice(prof["bass_style"])
    guitar = rand_choice(prof["guitar_style"])
    mood = rand_choice(prof["mood"], "dynamic")

    struct = prof["structure"] or ["intro", "verse", "chorus", "outro"]
    if chunk_idx <= 1:
        section = struct[0] if struct else "intro"
    else:
        section = rand_choice(struct[1:]) if len(struct) > 1 else "chorus"

    def fmt(val, suffix=""):
        if not val or val == "none":
            return ""
        return f", {val}{suffix}"

    template = prof["prompt_template"]
    prompt = template.format(
        bpm=bpm,
        drum=fmt(drum, " drums"),
        synth=fmt(synth),
        rhythm=fmt(rhythm),
        bass=fmt(bass + " bass" if bass and "bass" not in bass else bass),
        guitar=fmt(guitar + " guitar" if guitar and "guitar" not in guitar else guitar),
        mood=mood,
        section=section,
    )
    return prompt

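# Illustrative result (the actual text depends on prompts.ini and the random
# choices above), e.g. for a rock-like profile with the default template:
#   "Instrumental track, clean guitar, melodic bass, standard rock drums,
#    energetic chorus at 128 BPM."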
# ======================================================================================
# PRESETS
# ======================================================================================

PRESETS = {
    "default": {"cfg_scale": 5.8, "top_k": 250, "top_p": 0.95, "temperature": 0.90},
    "rock": {"cfg_scale": 5.8, "top_k": 250, "top_p": 0.95, "temperature": 0.90},
    "techno": {"cfg_scale": 5.2, "top_k": 300, "top_p": 0.96, "temperature": 0.95},
    "grunge": {"cfg_scale": 6.2, "top_k": 220, "top_p": 0.94, "temperature": 0.90},
    "indie": {"cfg_scale": 5.5, "top_k": 240, "top_p": 0.95, "temperature": 0.92},
    "funk_rock": {"cfg_scale": 5.8, "top_k": 260, "top_p": 0.96, "temperature": 0.94},
}

# ======================================================================================
# MODEL LOAD
# ======================================================================================

try:
    from audiocraft.models import MusicGen
except Exception:
    logger.error("audiocraft is required. pip install audiocraft")
    raise

def load_model():
    free_vram = check_vram()
    if free_vram is not None and free_vram < 5000:
        logger.warning("Low free VRAM; consider closing other apps.")
    clean_memory()
    local_model_path = str(BASE_DIR / "models" / "musicgen-large")
    if not os.path.exists(local_model_path):
        logger.error(f"Model path missing: {local_model_path}")
        sys.exit(1)
    logger.info("Loading MusicGen (large)...")
    with autocast(dtype=torch.float16):
        model = MusicGen.get_pretrained(local_model_path, device=DEVICE)
        model.set_generation_params(duration=30, two_step_cfg=False)
    logger.info("MusicGen loaded.")
    return model

musicgen_model = load_model()

# ======================================================================================
# GENERATION PIPELINE
# ======================================================================================

def _export_torch_to_segment(audio_tensor: torch.Tensor, sample_rate: int, bit_depth_int: int) -> Optional[AudioSegment]:
    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
        tmp_path = tmp.name
    try:
        torchaudio.save(tmp_path, audio_tensor, sample_rate, bits_per_sample=bit_depth_int)
        seg = AudioSegment.from_wav(tmp_path)
        return seg
    except Exception as e:
        logger.error(f"_export_torch_to_segment failed: {e}")
        logger.error(traceback.format_exc())
        return None
    finally:
        try:
            os.remove(tmp_path)
        except OSError:
            pass

def _crossfade_segments(seg_a: AudioSegment, seg_b: AudioSegment, overlap_ms: int, sample_rate: int, bit_depth_int: int) -> AudioSegment:
    try:
        seg_a = ensure_stereo(seg_a, sample_rate, seg_a.sample_width)
        seg_b = ensure_stereo(seg_b, sample_rate, seg_b.sample_width)
        if overlap_ms <= 0 or len(seg_a) < overlap_ms or len(seg_b) < overlap_ms:
            return seg_a + seg_b

        with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as prev_wav, \
             tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as curr_wav:
            prev_path, curr_path = prev_wav.name, curr_wav.name

        try:
            seg_a[-overlap_ms:].export(prev_path, format="wav")
            seg_b[:overlap_ms].export(curr_path, format="wav")
            a_audio, sr_a = torchaudio.load(prev_path)
            b_audio, sr_b = torchaudio.load(curr_path)
            if sr_a != sample_rate:
                a_audio = torchaudio.functional.resample(a_audio, sr_a, sample_rate, lowpass_filter_width=64)
            if sr_b != sample_rate:
                b_audio = torchaudio.functional.resample(b_audio, sr_b, sample_rate, lowpass_filter_width=64)
            n = min(a_audio.shape[1], b_audio.shape[1])
            n = n - (n % 2)
            if n <= 0:
                return seg_a + seg_b
            a = a_audio[:, :n]
            b = b_audio[:, :n]
            hann = torch.hann_window(n, periodic=False)
            fade_in = hann
            fade_out = hann.flip(0)
            blended = (a * fade_out + b * fade_in).to(torch.float32)
            blended = torch.clamp(blended, -1.0, 1.0)

            scale = (2**23 if bit_depth_int == 24 else 32767)
            blended_i = (blended * scale).to(torch.int32 if bit_depth_int == 24 else torch.int16)

            with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_x:
                temp_x = tmp_x.name
            torchaudio.save(temp_x, blended_i, sample_rate, bits_per_sample=bit_depth_int)
            blended_seg = AudioSegment.from_wav(temp_x)
            blended_seg = ensure_stereo(blended_seg, sample_rate, blended_seg.sample_width)

            result = seg_a[:-overlap_ms] + blended_seg + seg_b[overlap_ms:]
            return result
        finally:
            for p in [prev_path, curr_path, locals().get("temp_x", None)]:
                try:
                    if p and os.path.exists(p):
                        os.remove(p)
                except OSError:
                    pass
    except Exception as e:
        logger.error(f"_crossfade_segments failed: {e}")
        return seg_a + seg_b

def generate_music(
    instrumental_prompt: str,
    cfg_scale: float,
    top_k: int,
    top_p: float,
    temperature: float,
    total_duration: int,
    bpm: int,
    drum_beat: str,
    synthesizer: str,
    rhythmic_steps: str,
    bass_style: str,
    guitar_style: str,
    target_volume: float,
    preset: str,
    max_steps: int,
    vram_status_text: str,
    bitrate: str,
    output_sample_rate: str,
    bit_depth: str
) -> Tuple[Optional[str], str, str]:
    global musicgen_model

    if not instrumental_prompt or not instrumental_prompt.strip():
        return None, "⚠️ Please enter a valid instrumental prompt!", vram_status_text

    try:
        if preset != "default":
            p = PRESETS.get(preset, PRESETS["default"])
            cfg_scale, top_k, top_p, temperature = p["cfg_scale"], p["top_k"], p["top_p"], p["temperature"]
            logger.info(f"Preset '{preset}' applied: cfg={cfg_scale} top_k={top_k} top_p={top_p} temp={temperature}")

        try:
            output_sr_int = int(output_sample_rate)
        except (TypeError, ValueError):
            return None, "❌ Invalid output sampling rate; choose 22050/44100/48000", vram_status_text
        try:
            bit_depth_int = int(bit_depth)
            sample_width = 3 if bit_depth_int == 24 else 2
        except (TypeError, ValueError):
            return None, "❌ Invalid bit depth; choose 16 or 24", vram_status_text

        if not check_disk_space():
            return None, "⚠️ Low disk space (<1GB).", vram_status_text

        CHUNK_SEC = 30
        total_duration = max(30, min(int(total_duration), 120))
        num_chunks = math.ceil(total_duration / CHUNK_SEC)

        PROCESS_SR = 48000
        OVERLAP_SEC = 0.20

        seed = random.randint(0, 2**31 - 1)
        random.seed(seed)
        torch.manual_seed(seed)
        np.random.seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)

        musicgen_model.set_generation_params(
            duration=CHUNK_SEC,
            use_sampling=True,
            top_k=int(top_k),
            top_p=float(top_p),
            temperature=float(temperature),
            cfg_coef=float(cfg_scale),
            two_step_cfg=False,
        )

        vram_status_text = f"Start VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"

        segments: List[AudioSegment] = []
        start_time = time.time()

        for idx in range(num_chunks):
            chunk_idx = idx + 1
            dur = CHUNK_SEC if (idx < num_chunks - 1) else (total_duration - CHUNK_SEC * (num_chunks - 1) or CHUNK_SEC)
            logger.info(f"Generating chunk {chunk_idx}/{num_chunks} ({dur}s)")

            prompt_text = instrumental_prompt  # UI sends a fully assembled prompt (or manual text)

            try:
                with torch.no_grad():
                    with autocast(dtype=torch.float16):
                        clean_memory()
                        if idx == 0:
                            audio = musicgen_model.generate([prompt_text], progress=True)[0].cpu()
                        else:
                            prev_seg = segments[-1]
                            prev_seg = apply_noise_gate(prev_seg, threshold_db=-80, sample_rate=PROCESS_SR)
                            prev_seg = balance_stereo(prev_seg, noise_threshold=-40, sample_rate=PROCESS_SR)
                            with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_prev:
                                temp_prev = tmp_prev.name
                            try:
                                prev_seg.export(temp_prev, format="wav")
                                prev_audio, prev_sr = torchaudio.load(temp_prev)
                                if prev_sr != PROCESS_SR:
                                    prev_audio = torchaudio.functional.resample(prev_audio, prev_sr, PROCESS_SR, lowpass_filter_width=64)
                                if prev_audio.shape[0] != 2:
                                    prev_audio = prev_audio.repeat(2, 1)[:, :prev_audio.shape[1]]
                                prev_audio = prev_audio.to(DEVICE)
                                tail = prev_audio[:, -int(PROCESS_SR * OVERLAP_SEC):]

                                audio = musicgen_model.generate_continuation(
                                    prompt=tail,
                                    prompt_sample_rate=PROCESS_SR,
                                    descriptions=[prompt_text],
                                    progress=True
                                )[0].cpu()
                                del prev_audio, tail
                            finally:
                                try:
                                    if os.path.exists(temp_prev):
                                        os.remove(temp_prev)
                                except OSError:
                                    pass
                clean_memory()
            except Exception as e:
                logger.error(f"Chunk {chunk_idx} generation failed: {e}")
                logger.error(traceback.format_exc())
                return None, f"❌ Failed to generate chunk {chunk_idx}: {e}", vram_status_text

            try:
                if audio.shape[0] != 2:
                    audio = audio.repeat(2, 1)[:, :audio.shape[1]]
                audio = audio.to(dtype=torch.float32)
                audio = torchaudio.functional.resample(audio, 32000, PROCESS_SR, lowpass_filter_width=64)
                seg = _export_torch_to_segment(audio, PROCESS_SR, bit_depth_int)
                if seg is None:
                    return None, f"❌ Failed to convert audio for chunk {chunk_idx}", vram_status_text
                seg = ensure_stereo(seg, PROCESS_SR, sample_width)
                seg = seg - 15
                seg = apply_noise_gate(seg, threshold_db=-80, sample_rate=PROCESS_SR)
                seg = balance_stereo(seg, noise_threshold=-40, sample_rate=PROCESS_SR)
                seg = rms_normalize(seg, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
                seg = apply_eq(seg, sample_rate=PROCESS_SR)
                seg = seg[:dur * 1000]
                segments.append(seg)
                del audio
                clean_memory()
                vram_status_text = f"VRAM after chunk {chunk_idx}: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
            except Exception as e:
                logger.error(f"Post-processing failed (chunk {chunk_idx}): {e}")
                logger.error(traceback.format_exc())
                return None, f"❌ Failed to process chunk {chunk_idx}: {e}", vram_status_text

        if not segments:
            return None, "❌ No audio generated.", vram_status_text

        logger.info("Combining chunks...")
        final_seg = segments[0]
        overlap_ms = int(OVERLAP_SEC * 1000)
        for i in range(1, len(segments)):
            final_seg = _crossfade_segments(final_seg, segments[i], overlap_ms, PROCESS_SR, bit_depth_int)

        final_seg = final_seg[:total_duration * 1000]

        final_seg = apply_noise_gate(final_seg, threshold_db=-80, sample_rate=PROCESS_SR)
        final_seg = balance_stereo(final_seg, noise_threshold=-40, sample_rate=PROCESS_SR)
        final_seg = rms_normalize(final_seg, target_rms_db=target_volume, peak_limit_db=-3.0, sample_rate=PROCESS_SR)
        final_seg = apply_eq(final_seg, sample_rate=PROCESS_SR)
        final_seg = apply_fade(final_seg, 500, 800)
        final_seg = final_seg - 10
        final_seg = final_seg.set_frame_rate(output_sr_int)

        mp3_path = MP3_DIR / f"ghostai_music_{int(time.time())}.mp3"
        try:
            clean_memory()
            final_seg.export(str(mp3_path), format="mp3", bitrate=bitrate, tags={"title": "GhostAI Instrumental", "artist": "GhostAI"})
        except Exception as e:
            logger.error(f"MP3 export failed ({bitrate}): {e}")
            fb = MP3_DIR / f"ghostai_music_fallback_{int(time.time())}.mp3"
            try:
                final_seg.export(str(fb), format="mp3", bitrate="128k")
                mp3_path = fb
            except Exception as ee:
                return None, f"❌ Failed to export MP3: {ee}", vram_status_text

        elapsed = time.time() - start_time
        vram_status_text = f"Final VRAM: {torch.cuda.memory_allocated() / 1024**2:.2f} MB"
        logger.info(f"Done in {elapsed:.2f}s -> {mp3_path}")
        return str(mp3_path), "✅ Done! 30s chunks joined seamlessly. Check output loudness/quality.", vram_status_text

    except Exception as e:
        logger.error(f"Generation failed: {e}")
        logger.error(traceback.format_exc())
        return None, f"❌ Generation failed: {e}", vram_status_text
    finally:
        clean_memory()

def clear_inputs():
    s = DEFAULT_SETTINGS.copy()
    return (
        s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
        s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
        s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
        s["bitrate"], s["output_sample_rate"], s["bit_depth"]
    )

# ======================================================================================
# SERVER STATUS (BUSY/IDLE) & RENDER API & STYLE PROMPT API
# ======================================================================================

BUSY_LOCK = threading.Lock()
BUSY_FLAG = False
BUSY_FILE = "/tmp/musicgen_busy.lock"
CURRENT_JOB: Dict[str, Any] = {"id": None, "start": None}

def set_busy(val: bool, job_id: Optional[str] = None):
    global BUSY_FLAG
    with BUSY_LOCK:
        BUSY_FLAG = val
        if val:
            CURRENT_JOB["id"] = job_id or f"job_{int(time.time())}"
            CURRENT_JOB["start"] = time.time()
            try:
                Path(BUSY_FILE).write_text(CURRENT_JOB["id"])
            except Exception:
                pass
        else:
            CURRENT_JOB["id"] = None
            CURRENT_JOB["start"] = None
            try:
                if os.path.exists(BUSY_FILE):
                    os.remove(BUSY_FILE)
            except Exception:
                pass

def is_busy() -> bool:
    with BUSY_LOCK:
        return BUSY_FLAG

def job_elapsed() -> float:
    if CURRENT_JOB["start"] is None:
        return 0.0
    return time.time() - CURRENT_JOB["start"]

class RenderRequest(BaseModel):
    instrumental_prompt: str
    cfg_scale: Optional[float] = None
    top_k: Optional[int] = None
    top_p: Optional[float] = None
    temperature: Optional[float] = None
    total_duration: Optional[int] = None
    bpm: Optional[int] = None
    drum_beat: Optional[str] = None
    synthesizer: Optional[str] = None
    rhythmic_steps: Optional[str] = None
    bass_style: Optional[str] = None
    guitar_style: Optional[str] = None
    target_volume: Optional[float] = None
    preset: Optional[str] = None
    max_steps: Optional[int] = None
    bitrate: Optional[str] = None
    output_sample_rate: Optional[str] = None
    bit_depth: Optional[str] = None

class SettingsUpdate(BaseModel):
    settings: Dict[str, Any]

fastapp = FastAPI(title=f"GhostAI Music Server {RELEASE}", version=RELEASE)
fastapp.add_middleware(
    CORSMiddleware,
    allow_origins=["*"], allow_credentials=True, allow_methods=["*"], allow_headers=["*"],
)

@fastapp.get("/health")
def health():
    return {"ok": True, "ts": int(time.time()), "release": RELEASE}

@fastapp.get("/status")
def status():
    busy = is_busy()
    return {
        "busy": busy,
        "job_id": CURRENT_JOB["id"],
        "since": CURRENT_JOB["start"],
        "elapsed": job_elapsed(),
        "lockfile": os.path.exists(BUSY_FILE),
        "release": RELEASE
    }

@fastapp.get("/config")
def get_config():
    return {"defaults": CURRENT_SETTINGS, "release": RELEASE}

@fastapp.post("/settings")
def set_settings(payload: SettingsUpdate):
    try:
        s = CURRENT_SETTINGS.copy()
        s.update(payload.settings or {})
        save_settings_to_file(s)
        for k, v in s.items():
            CURRENT_SETTINGS[k] = v
        return {"ok": True, "saved": s}
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))

def register_style_endpoints(app: FastAPI, profiles: Dict[str, Dict[str, Any]]):
    for key, prof in profiles.items():
        route = prof.get("api_name") or f"/set_{key}_prompt"
        # The default argument binds the current key, avoiding the late-binding
        # closure pitfall inside the loop.
        async def style_endpoint(style_key=key):
            return {"style": style_key, "prompt": assemble_prompt(profiles, style_key, 120, 1), "release": RELEASE}
        app.add_api_route(route, style_endpoint, methods=["GET"])

@fastapp.get("/styles")
def list_styles():
    return {
        "styles": [
            {"key": k, "label": v["label"], "api_name": v["api_name"]}
            for k, v in PROFILES.items()
        ],
        "release": RELEASE
    }

@fastapp.get("/prompt")
def get_prompt(style: str = Query(...), bpm: int = Query(120), chunk: int = Query(1)):
    if style not in PROFILES:
        raise HTTPException(status_code=404, detail=f"Unknown style '{style}'")
    return {"style": style, "prompt": assemble_prompt(PROFILES, style, bpm, chunk), "release": RELEASE}

@fastapp.post("/render")
def render(req: RenderRequest):
 
    job_id = f"render_{int(time.time())}"
    set_busy(True, job_id)
    try:
        s = CURRENT_SETTINGS.copy()
        for k, v in req.dict().items():
            if v is not None:
                s[k] = v
 
        )
        if not mp3:
            raise HTTPException(status_code=500, detail=msg)
        return {"ok": True, "job_id": job_id, "path": mp3, "status": msg, "vram": vram, "release": RELEASE}
    finally:
        set_busy(False, None)

def _start_fastapi():
    uvicorn.run(fastapp, host="0.0.0.0", port=8555, log_level="info")

# Load profiles from prompts.ini (required) and register endpoints
try:
    PROFILES = load_profiles_from_ini(PROMPTS_FILE)
except Exception as e:
    logger.error(f"Failed to load {PROMPTS_FILE}: {e}")
    sys.exit(1)
register_style_endpoints(fastapp, PROFILES)

api_thread = threading.Thread(target=_start_fastapi, daemon=True)
api_thread.start()
logger.info(f"FastAPI server started on http://0.0.0.0:8555 ({RELEASE})")

# ======================================================================================
# GRADIO UI (TABS + ACCESSIBLE THEME + 4→5-COLUMN GRID FOR BAND BUTTONS)
# ======================================================================================

def read_css_text() -> str:
    try:
        return CSS_FILE.read_text(encoding="utf-8")
    except Exception as e:
        logger.warning(f"styles.css not found or unreadable: {e}")
        return ""  # no fallback CSS hard-coded

def read_example_md() -> str:
    try:
        return EXAMPLE_MD.read_text(encoding="utf-8")
    except Exception as e:
        logger.warning(f"example.md not found or unreadable: {e}")
        return "## Info\nProvide an `example.md` to populate this tab."

def ui_prompt_from_style(style_key, bpm, *_):
    return assemble_prompt(PROFILES, style_key, int(bpm), 1)

def get_latest_log() -> str:
    try:
        return LOG_FILE.read_text(encoding="utf-8") if LOG_FILE.exists() else "No log file yet."
    except Exception as e:
        return f"Error reading log: {e}"

def set_bitrate_128(): return "128k"
def set_bitrate_192(): return "192k"
def set_bitrate_320(): return "320k"
def set_sample_rate_22050(): return "22050"
def set_sample_rate_44100(): return "44100"
def set_sample_rate_48000(): return "48000"
def set_bit_depth_16(): return "16"
def set_bit_depth_24(): return "24"

CSS = read_css_text()
loaded = CURRENT_SETTINGS

logger.info(f"Building Gradio UI {RELEASE} ...")
with gr.Blocks(css=CSS, analytics_enabled=False, title=APP_TITLE, theme=gr.themes.Soft()) as demo:
    with gr.TabItem(f"Generator • {RELEASE}", id="tab-generator"):
        gr.Markdown(f"""
        <div class="header" role="banner" aria-label="{APP_TITLE}">
          <div class="logo" aria-hidden="true">👻</div>
          <h1>{APP_TITLE}</h1>
          <p>30/60/90/120s chunking · seamless joins · API + style endpoints</p>
        </div>
        """)

        with gr.Column(elem_classes="input-container"):
            gr.Markdown("### Prompt")
            instrumental_prompt = gr.Textbox(
                label="Instrumental Prompt",
                placeholder="Type your instrumental prompt or click a style button",
                lines=4,
                value=loaded.get("instrumental_prompt", ""),
            )

            gr.Markdown("#### Band / Style (auto grid: 4 per row, 5 on wide screens)")
            style_buttons = {}
            with gr.Group(elem_id="genre-grid"):
                # Put all buttons as direct children (no rows) so the CSS grid works cleanly
                for key in PROFILES.keys():
                    style_buttons[key] = gr.Button(PROFILES[key]["label"], elem_classes=["style-btn"])

        with gr.Column(elem_classes="settings-container"):
            gr.Markdown("### Settings")
            with gr.Group(elem_classes="group-container"):
                cfg_scale = gr.Slider(1.0, 10.0, step=0.1, value=float(loaded.get("cfg_scale", DEFAULT_SETTINGS["cfg_scale"])), label="CFG Scale")
                top_k = gr.Slider(10, 500, step=10, value=int(loaded.get("top_k", DEFAULT_SETTINGS["top_k"])), label="Top-K")
                top_p = gr.Slider(0.0, 1.0, step=0.01, value=float(loaded.get("top_p", DEFAULT_SETTINGS["top_p"])), label="Top-P")
                temperature = gr.Slider(0.1, 2.0, step=0.01, value=float(loaded.get("temperature", DEFAULT_SETTINGS["temperature"])), label="Temperature")
                total_duration = gr.Dropdown(choices=[30, 60, 90, 120], value=int(loaded.get("total_duration", 60)), label="Song Length (seconds)")
                bpm = gr.Slider(60, 180, step=1, value=int(loaded.get("bpm", 120)), label="Tempo (BPM)")
                drum_beat = gr.Dropdown(choices=["none", "standard rock", "funk groove", "techno kick", "jazz swing", "orchestral percussion", "tympani"], value=str(loaded.get("drum_beat", "none")), label="Drum Beat")
                synthesizer = gr.Dropdown(choices=["none", "analog synth", "digital pad", "arpeggiated synth"], value=str(loaded.get("synthesizer", "none")), label="Synthesizer")
                rhythmic_steps = gr.Dropdown(choices=["none", "syncopated steps", "steady steps", "complex steps", "martial march", "triplet swells", "staccato ostinato"], value=str(loaded.get("rhythmic_steps", "none")), label="Rhythmic Steps")
                bass_style = gr.Dropdown(choices=["none", "slap bass", "deep bass", "melodic bass", "low brass", "cellos", "double basses"], value=str(loaded.get("bass_style", "none")), label="Bass / Low End")
                guitar_style = gr.Dropdown(choices=["none", "distorted", "clean", "jangle", "downpicked", "thrash riffing"], value=str(loaded.get("guitar_style", "none")), label="Guitar Style")
                target_volume = gr.Slider(-30.0, -20.0, step=0.5, value=float(loaded.get("target_volume", -23.0)), label="Target Loudness (dBFS RMS)")
                preset = gr.Dropdown(choices=["default", "rock", "techno", "grunge", "indie", "funk_rock"], value=str(loaded.get("preset", "default")), label="Preset")
                max_steps = gr.Dropdown(choices=[1000, 1200, 1300, 1500], value=int(loaded.get("max_steps", 1500)), label="Max Steps (per chunk hint)")

            bitrate_state = gr.State(value=str(loaded.get("bitrate", "192k")))
            sample_rate_state = gr.State(value=str(loaded.get("output_sample_rate", "48000")))
            bit_depth_state = gr.State(value=str(loaded.get("bit_depth", "16")))

            with gr.Row():
                bitrate_128_btn = gr.Button("Bitrate 128k")
                bitrate_192_btn = gr.Button("Bitrate 192k")
                bitrate_320_btn = gr.Button("Bitrate 320k")
            with gr.Row():
                sample_rate_22050_btn = gr.Button("SR 22.05k")
                sample_rate_44100_btn = gr.Button("SR 44.1k")
                sample_rate_48000_btn = gr.Button("SR 48k")
            with gr.Row():
                bit_depth_16_btn = gr.Button("16-bit")
                bit_depth_24_btn = gr.Button("24-bit")

            with gr.Row():
                gen_btn = gr.Button("Generate Music 🚀")
                clr_btn = gr.Button("Clear 🧹")
                save_btn = gr.Button("Save Settings 💾")
                load_btn = gr.Button("Load Settings 📂")
                reset_btn = gr.Button("Reset Defaults ♻️")

        with gr.Column(elem_classes="output-container"):
            gr.Markdown("### Output")
            out_audio = gr.Audio(label="Generated Track", type="filepath")
            status_box = gr.Textbox(label="Status", interactive=False)
            vram_box = gr.Textbox(label="VRAM Usage", interactive=False, value="")

1067
+ gr.Markdown("### Logs")
1068
+ log_output = gr.Textbox(label="Last Log File", lines=16, interactive=False)
1069
+ log_btn = gr.Button("View Last Log")
1070
+
1071
+ # Wire style buttons -> prompt textbox
1072
+ for key, btn in style_buttons.items():
1073
+ btn.click(
1074
+ ui_prompt_from_style,
1075
+ inputs=[gr.State(key), bpm, drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style],
1076
+ outputs=instrumental_prompt
1077
+ )
1078
 
        # Quick sets
        bitrate_128_btn.click(set_bitrate_128, outputs=bitrate_state)
        bitrate_192_btn.click(set_bitrate_192, outputs=bitrate_state)
        bitrate_320_btn.click(set_bitrate_320, outputs=bitrate_state)
        sample_rate_22050_btn.click(set_sample_rate_22050, outputs=sample_rate_state)
        sample_rate_44100_btn.click(set_sample_rate_44100, outputs=sample_rate_state)
        sample_rate_48000_btn.click(set_sample_rate_48000, outputs=sample_rate_state)
        bit_depth_16_btn.click(set_bit_depth_16, outputs=bit_depth_state)
        bit_depth_24_btn.click(set_bit_depth_24, outputs=bit_depth_state)

        # Generate
        gen_btn.click(
            generate_music,
            inputs=[
                instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
                drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
                preset, max_steps, vram_box, bitrate_state, sample_rate_state, bit_depth_state
            ],
            outputs=[out_audio, status_box, vram_box]
        )
 
1100
+ # Clear
1101
+ clr_btn.click(
1102
+ clear_inputs, outputs=[
1103
+ instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
1104
+ drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
1105
+ preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state
1106
+ ]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1107
  )
 
 
 
 
 
 
 
 
1108
 
1109
+ # Save / Load / Reset
+ def _save_action(
+     instrumental_prompt_v, cfg_v, top_k_v, top_p_v, temp_v, dur_v, bpm_v,
+     drum_v, synth_v, steps_v, bass_v, guitar_v, vol_v, preset_v, maxsteps_v, br_v, sr_v, bd_v
+ ):
+     s = {
+         "instrumental_prompt": instrumental_prompt_v,
+         "cfg_scale": float(cfg_v),
+         "top_k": int(top_k_v),
+         "top_p": float(top_p_v),
+         "temperature": float(temp_v),
+         "total_duration": int(dur_v),
+         "bpm": int(bpm_v),
+         "drum_beat": str(drum_v),
+         "synthesizer": str(synth_v),
+         "rhythmic_steps": str(steps_v),
+         "bass_style": str(bass_v),
+         "guitar_style": str(guitar_v),
+         "target_volume": float(vol_v),
+         "preset": str(preset_v),
+         "max_steps": int(maxsteps_v),
+         "bitrate": str(br_v),
+         "output_sample_rate": str(sr_v),
+         "bit_depth": str(bd_v)
+     }
+     save_settings_to_file(s)
+     for k, v in s.items():
+         CURRENT_SETTINGS[k] = v
+     return "✅ Settings saved."
+
+ def _load_action():
+     s = load_settings_from_file()
+     for k, v in s.items():
+         CURRENT_SETTINGS[k] = v
+     return (
+         s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
+         s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
+         s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
+         s["bitrate"], s["output_sample_rate"], s["bit_depth"],
+         "✅ Settings loaded."
+     )
+
+ def _reset_action():
+     s = DEFAULT_SETTINGS.copy()
+     save_settings_to_file(s)
+     for k, v in s.items():
+         CURRENT_SETTINGS[k] = v
+     return (
+         s["instrumental_prompt"], s["cfg_scale"], s["top_k"], s["top_p"], s["temperature"],
+         s["total_duration"], s["bpm"], s["drum_beat"], s["synthesizer"], s["rhythmic_steps"],
+         s["bass_style"], s["guitar_style"], s["target_volume"], s["preset"], s["max_steps"],
+         s["bitrate"], s["output_sample_rate"], s["bit_depth"],
+         "✅ Defaults restored."
+     )
+
+ save_btn.click(
+     _save_action,
+     inputs=[
+         instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
          drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+         preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state
+     ],
+     outputs=status_box
+ )

+ load_btn.click(
+     _load_action,
+     outputs=[
+         instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+         drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+         preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box
+     ]
  )

+ reset_btn.click(
+     _reset_action,
+     outputs=[
+         instrumental_prompt, cfg_scale, top_k, top_p, temperature, total_duration, bpm,
+         drum_beat, synthesizer, rhythmic_steps, bass_style, guitar_style, target_volume,
+         preset, max_steps, bitrate_state, sample_rate_state, bit_depth_state, status_box
+     ]
  )

+ log_btn.click(get_latest_log, outputs=log_output)

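The save/load/reset actions above delegate to `save_settings_to_file` and `load_settings_from_file`, which are defined elsewhere in the app and not part of this hunk. A minimal sketch of how such helpers could work, assuming a JSON file next to the app; the `SETTINGS_PATH` location, the atomic-replace write, and the merge-with-defaults fallback are assumptions, not the project's actual implementation:

```python
import json
import os

SETTINGS_PATH = "settings.json"  # assumed location, not confirmed by this diff

# Illustrative subset of defaults; the real DEFAULT_SETTINGS has many more keys.
DEFAULT_SETTINGS = {"bpm": 120, "bitrate": "192k"}

def save_settings_to_file(settings, path=SETTINGS_PATH):
    # Write atomically: dump to a temp file, then replace the target,
    # so a crash mid-write never leaves a half-written settings file.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=2)
    os.replace(tmp, path)

def load_settings_from_file(path=SETTINGS_PATH):
    # Start from defaults so missing keys (or a missing file) still resolve.
    merged = DEFAULT_SETTINGS.copy()
    try:
        with open(path, "r", encoding="utf-8") as f:
            merged.update(json.load(f))
    except FileNotFoundError:
        pass
    return merged
```

Merging onto defaults keeps `_load_action`'s `s["..."]` lookups safe even when an older settings file predates newly added keys.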
+ with gr.TabItem("Info", id="tab-info"):
+     gr.Markdown(read_example_md())
+
+ # ======================================================================================
+ # LAUNCH GRADIO
+ # ======================================================================================
+
+ logger.info(f"Launching Gradio UI at http://0.0.0.0:9999 ({RELEASE}) ...")
  try:
+     demo.launch(
+         server_name="0.0.0.0",
+         server_port=9999,
+         share=False,
+         inbrowser=False,
+         show_error=True
+     )
  except Exception as e:
+     logger.error(f"Failed to launch Gradio UI: {e}")
      logger.error(traceback.format_exc())
      sys.exit(1)
public/styles.css CHANGED
@@ -1,58 +1,79 @@
- /* styles.css */
- /* High-contrast, accessible theme (no inline HTML in Python). ADA-focused: focus rings, large targets, readable contrast. */
-
- :root { color-scheme: dark; }
-
- * { box-sizing: border-box; }
-
- body, .gradio-container {
-   background: #0B0B0D !important;
-   color: #FFFFFF !important;
-   font-family: ui-sans-serif, system-ui, -apple-system, Segoe UI, Roboto, Helvetica, Arial, "Apple Color Emoji","Segoe UI Emoji";
-   line-height: 1.4;
- }
-
- h1, h2, h3, h4, h5, h6, label, p, span {
-   color: #FFFFFF !important;
- }
-
- .block, .panel, .wrap, .tabs, .tabitem, .form, .group {
-   background: #0B0B0D !important;
- }
-
- input, textarea, select {
-   background: #15151A !important;
-   color: #FFFFFF !important;
-   border: 1px solid #2B2B33 !important;
-   border-radius: 10px !important;
-   padding: 10px 12px !important;
- }
-
- button {
-   background: #1F6FEB !important;
-   color: #FFFFFF !important;
-   border: 2px solid transparent !important;
-   border-radius: 10px !important;
-   padding: 12px 14px !important;
-   font-weight: 700 !important;
-   min-height: 44px; /* touch target */
- }
-
- button:hover { background: #2D7BFF !important; }
- button:focus { outline: 3px solid #00C853 !important; outline-offset: 2px; }
-
- .group > * + * { margin-top: 8px; }
- .row { gap: 8px; }
-
- .audio-wrap, .audio-display, .output-html {
-   border: 1px solid #2B2B33 !important;
-   border-radius: 10px !important;
- }
-
- .slider > input[type="range"] { accent-color: #FFD600 !important; }
-
- /* Large labels for readability */
- label { font-size: 16px !important; font-weight: 700 !important; }
-
- /* Subtle card border around control groups */
- .group { border: 1px solid #2B2B33; border-radius: 12px; padding: 12px; }
+ /* =========================
+    FILE: styles.css
+    ========================= */
+ :root {
+   color-scheme: dark;
+   --bg:#0B0B0D;
+   --panel:#101114;
+   --elev:#15161B;
+
+   --text:#F3F4F6;
+   --muted:#9CA3AF;
+
+   /* Accents use a professional triad rather than heavy blue */
+   --accent:#6EE7B7;  /* mint (primary) */
+   --accent2:#FDE047; /* warm yellow (secondary) */
+   --accent3:#60A5FA; /* soft blue (tertiary) */
+   --focus:#22D3EE;   /* cyan outline */
+ }
+
+ body, .gradio-container { background: var(--bg) !important; color: var(--text) !important; }
+ * { color: var(--text) !important; }
+ .wrap, .block, .tabs, .panel, .form { background: transparent !important; }
+
+ .header {
+   text-align:center; padding: 14px 16px;
+   border-bottom: 2px solid var(--accent);
+   background: var(--panel);
+ }
+ .header h1 { font-size: 28px; margin: 6px 0 0 0; }
+ .header .logo { font-size: 44px; }
+ .small { font-size: 12px; color: var(--muted) !important; }
+
+ .group {
+   border:1px solid #23242A; border-radius: 12px;
+   padding: 14px; margin-bottom: 14px; background: var(--elev);
+ }
+
+ label, p, span, h2, h3, h4 { color: var(--text) !important; }
+
+ input, textarea, select {
+   background: #0F1115 !important; color: var(--text) !important;
+   border:1px solid #252833 !important; border-radius: 10px !important;
+ }
+
+ button {
+   background: #1F2937 !important; color: var(--text) !important;
+   border: 1px solid #303644 !important; border-radius: 10px !important;
+   padding: 8px 12px !important; font-weight: 700 !important;
+   transition: border-color .15s ease, transform .05s ease;
+ }
+ button:hover { background: #222D3D !important; border-color: var(--accent3) !important; }
+ button:active { transform: translateY(1px); }
+ button:focus { outline: 3px solid var(--focus) !important; }
+ .slider > input { accent-color: var(--accent3) !important; }
+
+ /* Compact grid for band/style buttons only */
+ #genre-grid {
+   display: grid;
+   grid-template-columns: repeat(5, minmax(140px, 1fr));
+   gap: 8px;
+   padding: 8px;
+   border: 1px solid #23242A;
+   border-radius: 12px;
+   background: var(--elev);
+   max-height: 320px;
+   overflow: auto;
+ }
+ @media (max-width: 1200px) {
+   #genre-grid { grid-template-columns: repeat(4, minmax(140px, 1fr)); }
+ }
+ #genre-grid > * { margin: 0 !important; }
+ #genre-grid button {
+   padding: 6px 10px !important;
+   font-size: 0.9rem !important;
+   line-height: 1.15 !important;
+   border-radius: 10px !important;
+   border-color: #2B3140 !important;
+ }
+ #genre-grid button:hover { border-color: var(--accent2) !important; }