Abstract: In this paper, we present the differentiable log-Mel spectrogram (DMEL) for audio classification. DMEL uses a Gaussian window, with a window length that can be jointly optimized with the ...
Abstract: In this paper, we propose a deep learning (DL)-based task-driven spectrum prediction framework, named DeepSPred. DeepSPred comprises a feature encoder and a task predictor, where the ...
This study proposes a novel heterogeneous stacking ensemble learning model for the fusion of phonocardiogram (PCG) spectrogram texture and deep features to detect heart failure with preserved ejection ...
Diffusion Speech is a diffusion-based text-to-speech model. Our speech synthesis pipeline is quite simple. We use a diffusion transformer model (DiT) to predict the duration of each phoneme. Then we ...
The development of machine learning for cardiac care is severely hampered by privacy restrictions on sharing real patient electrocardiogram (ECG) data. Although generative AI offers a promising ...