Voice conversion based on continuous frequency warping and magnitude scaling

Yuhang Ye, Bob Lawlor

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we present a novel spectrum mapping method, Continuous Frequency Warping and Magnitude Scaling (CFWMS), for voice conversion under the Joint Density Gaussian Mixture Model (JDGMM) framework. JDGMM is a mature clustering technique that models the joint probability density of speech signals from paired speakers. Conventional JDGMM-based approaches morph the spectral features via least-squares optimization. However, speech quality is degraded because the converted features are blurred by statistical smoothing and the uncorrelated conversion functions between adjacent frames cause noticeable distortion. To address this, CFWMS uses a twofold frame-level conversion method, Frequency Warping and Magnitude Scaling (FWMS). FWMS operates directly on signals in the frequency domain without statistical smoothing. Moreover, a trajectory limitation strategy is introduced to repair the discontinuities between adjacent frames. Note that the proposed solution does not require global information of sentences, making it feasible for low-latency (e.g. real-time) applications. The experimental results show significant improvements in terms of speech quality and perceptual identity.
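
The abstract does not give the warping function, the scaling rule, or the form of the trajectory limitation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a linear warp with a single per-frame factor `alpha`, fixed per-bin gains, and a simple clamp on the frame-to-frame change of `alpha`; all function and parameter names are hypothetical. Each frame depends only on the previous frame's parameter, which reflects the point that no sentence-level information is needed.

```python
def warp_frequency(magnitudes, warp):
    """Resample a magnitude spectrum along a warped frequency axis.

    magnitudes: spectral magnitudes at bins 0..N-1 of one source frame.
    warp: function mapping a target bin index to a (fractional) source bin.
    """
    n = len(magnitudes)
    out = []
    for k in range(n):
        pos = min(max(warp(k), 0.0), n - 1.0)  # clamp to the valid bin range
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring source bins.
        out.append((1.0 - frac) * magnitudes[lo] + frac * magnitudes[hi])
    return out


def scale_magnitude(magnitudes, gains):
    """Multiply each bin by its gain (assumed here to be fixed per bin)."""
    return [m * g for m, g in zip(magnitudes, gains)]


def limit_trajectory(prev_param, new_param, max_step):
    """Restrict the frame-to-frame change of a parameter to +/- max_step."""
    delta = new_param - prev_param
    delta = max(-max_step, min(max_step, delta))
    return prev_param + delta


def convert_frames(frames, alphas, gains, max_step=0.05):
    """Warp and scale each frame, smoothing the warp factor causally.

    alphas: per-frame target warp factors (assumed linear warp k -> k * alpha).
    Only the previous frame's factor is used, so frames can be processed
    one at a time as they arrive.
    """
    converted = []
    alpha_prev = alphas[0]
    for frame, alpha_target in zip(frames, alphas):
        alpha = limit_trajectory(alpha_prev, alpha_target, max_step)
        warped = warp_frequency(frame, lambda k, a=alpha: k * a)
        converted.append(scale_magnitude(warped, gains))
        alpha_prev = alpha
    return converted
```

In this sketch, a sudden jump in the target warp factor between two frames is reduced to at most `max_step`, which is one simple way to keep the conversion functions of adjacent frames correlated.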

Original language: English
Title of host publication: 2017 28th Irish Signals and Systems Conference, ISSC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538610466
Publication status: Published - 18 Jul 2017
Event: 28th Irish Signals and Systems Conference, ISSC 2017 - Killarney, Ireland
Duration: 20 Jun 2017 to 21 Jun 2017

Publication series

Name: 2017 28th Irish Signals and Systems Conference, ISSC 2017

Conference

Conference: 28th Irish Signals and Systems Conference, ISSC 2017
Country/Territory: Ireland
City: Killarney
Period: 20/06/17 to 21/06/17

Keywords

  • Analysis by Synthesis framework
  • Clustering
  • Frequency Warping
  • Regression
  • Voice Conversion
