Voice Interactions in the Multimodal World