# Feature Fusion Network ## Model Architecture - **Type**: Multi-Modal Hybrid (CNN + Transformer) - **Pathway 1 (Spatial)**: ResNet3D (r3d_18) for robust localized feature extraction. - **Pathway 2 (Spatiotemporal)**: TimeSformer (Transformer) block dealing with patches and frames to capture long-range dependencies. - **Fusion**: Late fusion via concatenation of flattened feature vectors (512 features from CNN + 256 features from Transformer). - **Classification Head**: MLP mapping fused features to binary classes. ## Dataset Structure Expects `Dataset` folder in parent directory. ``` Dataset/ ├── violence/ └── no-violence/ ``` ## How to Run 1. Install dependencies: `torch`, `opencv-python`, `scikit-learn`, `numpy`, `torchvision`. 2. Run `python train.py`.