
Cross-modal approach for karaoke artifacts correction

Wei-Qi Yan, Mohan S. Kankanhalli

pp. 197-218

In this chapter, we combine adaptive sampling with video analogies (VA) to correct the audio stream in the karaoke environment $\kappa = \{\kappa(t) : \kappa(t) = (U(t), K(t)),\ t \in (t_s, t_e)\}$, where $t_s$ and $t_e$ are the start and end times respectively, and $U(t)$ is the user multimedia data. We employ multiple streams from the karaoke data $K(t) = (K_V(t), K_M(t), K_S(t))$, where $K_V(t)$, $K_M(t)$ and $K_S(t)$ are the video, the musical accompaniment and the original singer's rendition respectively, along with the user multimedia data $U(t) = (U_A(t), U_V(t))$, where $U_V(t)$ is the user video captured with a camera and $U_A(t)$ is the user's rendition of the song. We analyze the audio and video stream features $\Psi(\kappa) = \{\Psi(U(t), K(t))\} = \{\Psi(U(t)), \Psi(K(t))\} = \{\Psi_U(t), \Psi_K(t)\}$ to produce the corrected singing output, which is made as close as possible to the original singer's rendition. Note that $\Psi$ represents any kind of feature processing.
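To make the stream decomposition concrete, here is a minimal Python sketch of the data layout implied by the abstract. The names KaraokeSession, KaraokeData, UserData and extract_features are illustrative assumptions, not identifiers from the chapter, and psi stands in for the unspecified feature processing $\Psi$.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Stream = Sequence[float]  # a sampled multimedia stream over (t_s, t_e)

@dataclass
class KaraokeData:
    """K(t) = (K_V(t), K_M(t), K_S(t))."""
    video: Stream            # K_V(t): karaoke video
    accompaniment: Stream    # K_M(t): musical accompaniment
    original_vocals: Stream  # K_S(t): original singer's rendition

@dataclass
class UserData:
    """U(t) = (U_A(t), U_V(t))."""
    audio: Stream  # U_A(t): the user's rendition of the song
    video: Stream  # U_V(t): user video captured with a camera

@dataclass
class KaraokeSession:
    """kappa(t) = (U(t), K(t)) over the interval (t_s, t_e)."""
    user: UserData
    karaoke: KaraokeData
    t_start: float  # t_s
    t_end: float    # t_e

def extract_features(session: KaraokeSession,
                     psi: Callable[[Stream], Stream]) -> Tuple[Stream, Stream]:
    """Apply a feature transform psi to the user and karaoke sides
    independently, yielding (Psi_U(t), Psi_K(t)). Here psi is applied
    to the audio streams only, as a stand-in for the generic Psi."""
    psi_u = psi(session.user.audio)               # Psi_U(t)
    psi_k = psi(session.karaoke.original_vocals)  # Psi_K(t)
    return psi_u, psi_k
```

In the chapter's framework, the correction step would then adjust $U_A(t)$ so that $\Psi_U(t)$ approaches $\Psi_K(t)$; the sketch stops at feature extraction, since the abstract does not specify the correction algorithm itself.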

Publication details

DOI: 10.1007/978-0-387-89024-1_9

Full citation:

Yan, W., Kankanhalli, M. S. (2009). Cross-modal approach for karaoke artifacts correction, in B. Furht (ed.), Handbook of multimedia for digital entertainment and arts, Dordrecht, Springer, pp. 197-218.
