The actual operating conditions of a gearbox are complex, and faults cannot be diagnosed comprehensively from a single source of information. To address this problem, a gearbox fault diagnosis model based on multi-source information fusion is proposed. The original vibration signal is converted into a one-dimensional frequency-domain signal by the Fast Fourier Transform (FFT) and into a two-dimensional time-frequency diagram by the Continuous Wavelet Transform (CWT). A regularization technique is adopted to extract the operating condition data from the operating condition text. A Transformer, a Vision Transformer (ViT), and a convolutional neural network (CNN) perform feature learning on these three types of information, respectively, and the resulting features are concatenated. The concatenated features are fed into a classification head to identify gearbox faults under different operating conditions. The Southeast University gearbox dataset is used to verify the feasibility of the model, and the average fault identification rate across operating conditions reaches 99.4%. In addition, comparative and ablation experiments demonstrate the noise resistance and superiority of the model. The experimental results show that the proposed model can diagnose gearbox faults under different operating conditions.
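The three input branches described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the Morlet wavelet, the sampling rate, the synthetic test signal, and the condition-text format ("speed 20 Hz, load 0 V") are all assumptions made for illustration. In particular, the text-extraction step is sketched as regular-expression matching, which is only one possible reading of that step. The Transformer, ViT, and CNN encoders are replaced by trivial stand-ins so that only the preprocessing and the feature concatenation are shown.

```python
import re
import numpy as np


def fft_magnitude(signal, fs):
    """Return frequencies and the one-sided magnitude spectrum of a real signal."""
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spectrum


def morlet_cwt(signal, scales, w0=6.0):
    """Naive CWT with a complex Morlet wavelet (an assumed wavelet choice).

    Returns the coefficient magnitudes with shape (len(scales), len(signal)).
    """
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Truncate the wavelet support so it never exceeds the signal length.
        half = int(min(4 * s, (n - 1) // 2))
        t = np.arange(-half, half + 1) / s
        wavelet = np.exp(1j * w0 * t) * np.exp(-0.5 * t**2) / np.sqrt(s)
        # Correlation with the wavelet, written as a convolution.
        kernel = np.conj(wavelet)[::-1]
        out[i] = np.abs(np.convolve(signal, kernel, mode="same"))
    return out


def parse_condition(text):
    """Extract (speed in Hz, load in V) from a condition string.

    The text format is hypothetical, e.g. "speed 20 Hz, load 0 V".
    """
    m = re.search(r"(\d+(?:\.\d+)?)\s*Hz\D+?(\d+(?:\.\d+)?)\s*V", text)
    if m is None:
        raise ValueError(f"no operating condition found in {text!r}")
    return np.array([float(m.group(1)), float(m.group(2))])


# Synthetic stand-in for a vibration record: a 50 Hz tone sampled at 1 kHz.
fs = 1000
t = np.arange(1024) / fs
x = np.sin(2 * np.pi * 50 * t)

freqs, spec = fft_magnitude(x, fs)             # 1-D frequency-domain input
tf_map = morlet_cwt(x, scales=np.arange(1, 33))  # 2-D time-frequency input
cond = parse_condition("speed 20 Hz, load 0 V")  # operating condition input

# Stand-ins for the learned encoders: in the model each branch would be a
# trained network; here each input is reduced to a plain feature vector.
f_freq = spec
f_tf = tf_map.mean(axis=1)
f_cond = cond

# Feature fusion by concatenation; a classification head would consume this.
fused = np.concatenate([f_freq, f_tf, f_cond])
```

Concatenation keeps each branch's features intact and leaves the weighting between sources to the classification head, which is the simplest fusion strategy consistent with the description above.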