Abstract
In this paper, we present a cartoon face animation system for multimedia HCI applications. We animate face cartoons not only from the input speech, but also from emotions derived from the speech signal. Using a corpus of over 700 utterances from different speakers, we have trained support vector machines (SVMs) to recognize four categories of emotion: neutral, happiness, anger and sadness. For each input speech phrase, we identify its emotional content as a mixture of all four emotions, rather than classifying it as a single emotion. Facial expressions are then generated from the recovered emotion for each phrase by morphing between cartoon templates that correspond to the different emotions. To ensure smooth transitions in the animation, we apply low-pass filtering to the recovered (and possibly jumpy) emotion sequence. Moreover, lip-syncing is applied to produce lip movement from speech by recovering a statistical audio-visual mapping. Experimental results demonstrate that the cartoon animation sequences generated by our system are convincing and of good quality.
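The smoothing and morphing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the moving-average filter stands in for the unspecified low-pass filter, the linear blend of control points stands in for the unspecified morphing method, and the names `smooth_emotions`, `morph_templates`, and the `window` parameter are hypothetical. The per-phrase emotion mixtures are assumed to be already available, for example from the SVM outputs.

```python
# Illustrative sketch only; filter type, blend rule, and all names are assumptions.
EMOTIONS = ("neutral", "happiness", "anger", "sadness")


def normalize(weights):
    """Rescale non-negative weights so they sum to 1 (uniform if all are zero)."""
    total = sum(weights)
    if total <= 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def smooth_emotions(sequence, window=3):
    """Moving-average low-pass filter over a sequence of emotion mixtures.

    `sequence` is a list of 4-element weight lists, one per speech phrase.
    The window is truncated at both ends of the sequence.
    """
    half = window // 2
    smoothed = []
    for i in range(len(sequence)):
        lo, hi = max(0, i - half), min(len(sequence), i + half + 1)
        chunk = sequence[lo:hi]
        mean = [sum(frame[k] for frame in chunk) / len(chunk)
                for k in range(len(EMOTIONS))]
        smoothed.append(normalize(mean))
    return smoothed


def morph_templates(templates, weights):
    """Blend template control points as a weighted average of (x, y) pairs.

    `templates` maps each emotion name to a list of (x, y) control points;
    all templates are assumed to have the same number of points.
    """
    n_points = len(templates[EMOTIONS[0]])
    blended = []
    for p in range(n_points):
        x = sum(w * templates[e][p][0] for e, w in zip(EMOTIONS, weights))
        y = sum(w * templates[e][p][1] for e, w in zip(EMOTIONS, weights))
        blended.append((x, y))
    return blended
```

A sequence that jumps between pure neutral and pure happiness is pulled toward intermediate mixtures by the filter, so the blended face changes gradually from phrase to phrase instead of snapping between templates.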
| Original language | English |
|---|---|
| Pages | 365-371 |
| Number of pages | 7 |
| Publication status | Published - 2001 |
| Externally published | Yes |
| Event | -ACM Multimedia 2001 Workshops- 2001 Multimedia Conference - Ottawa, Ont., Canada; Duration: 30 Sept 2001 → 5 Oct 2001 |
Conference
| Conference | -ACM Multimedia 2001 Workshops- 2001 Multimedia Conference |
|---|---|
| Country/Territory | Canada |
| City | Ottawa, Ont. |
| Period | 30/09/01 → 5/10/01 |
Keywords
- Cartoon animation
- Lip-syncing
- Multimedia HCI
- Speech emotion recognition