Industrial equipment consists of complex multi-sensor systems that must operate safely and reliably. Because fault labels are difficult to obtain in real-world scenarios, unsupervised anomaly detection is particularly important for monitoring equipment health. However, existing methods are often degraded by signal noise and struggle to adapt to temporal variations. To address these issues, we introduce a data-driven fault detection framework built on novel Spectral-Temporal Fusion Networks (STF-Nets). STF-Nets integrate time-domain and frequency-domain information to produce stable predictions over reconstruction windows. Comparing these outputs against predefined thresholds allows faults within industrial systems to be identified quickly. We validate the proposed framework on the Tennessee-Eastman dataset, where experimental results demonstrate its robustness and state-of-the-art performance.
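The threshold-based detection step described above can be illustrated with a minimal sketch. It does not implement STF-Nets; it only assumes that some model has already produced a reconstruction for each window. The function names, the use of mean squared error as the anomaly score, and the choice of a high quantile of fault-free errors as the threshold are all assumptions made for illustration, not details taken from the framework itself.

```python
import numpy as np


def window_errors(actual, reconstructed):
    """Mean squared reconstruction error per window (assumed anomaly score).

    Both inputs have shape (n_windows, window_len, n_sensors).
    """
    diff = np.asarray(actual, dtype=float) - np.asarray(reconstructed, dtype=float)
    return np.mean(diff ** 2, axis=(1, 2))


def fit_threshold(normal_errors, quantile=0.99):
    """Assumed rule: threshold = high quantile of errors on fault-free data."""
    return float(np.quantile(normal_errors, quantile))


def detect_faults(errors, threshold):
    """Flag every window whose error strictly exceeds the threshold."""
    return np.asarray(errors) > threshold


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 200 fault-free windows, 8 time steps, 3 sensors.
    normal = rng.normal(size=(200, 8, 3))
    # Pretend the model reconstructs fault-free windows up to small noise.
    normal_recon = normal + rng.normal(scale=0.05, size=normal.shape)
    threshold = fit_threshold(window_errors(normal, normal_recon))

    # A test window whose reconstruction is far off should be flagged.
    test = rng.normal(size=(1, 8, 3))
    bad_recon = test + 2.0
    print(detect_faults(window_errors(test, bad_recon), threshold))
```

In this sketch the threshold is fixed once from fault-free data and then applied unchanged at test time, which matches the notion of a predefined threshold; other calibration rules would fit the same interface.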