
More about the book
How do we explain a picture to someone else? We describe its colors, shapes, and the relationships between objects. Conversely, to clarify a verbal statement, we might show a picture that visualizes its content. In everyday communication, people use several channels at once to convey their intentions: they point, make facial expressions, gesture, or refer to their shared environment. The same multimodal approach should apply to human-computer interfaces, leading to a paradigm shift from interfaces that passively react to mouse clicks toward interfaces that act as active communication partners, interpreting auditory and visual cues, drawing inferences, and asking for additional information when needed. Such an interface is termed an artificial communicator. However, merely interpreting the signals of the individual modalities, such as speech, gestures, or visual recognition, is not enough. To build systems that communicate naturally, these modalities must be integrated, because each has its own vocabulary and expressiveness: pointing indicates interest, facial expressions convey emotions, speech provides factual information, and vision interprets shapes.

This thesis explores which formalism is best suited for integrating the outputs of the specialized processing components in a multimodal system. It addresses how these components should be connected and how processing should be organized, offering novel solutions and a practical realization in a specific domain.
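The blurb leaves the integration formalism abstract. Purely as an illustration, and not taken from the book, the following minimal Python sketch shows one common way probabilistic fusion of modality outputs can look: a naive Bayes style combination of per-modality likelihoods over candidate referent objects. The function name fuse_modalities, the object names, and all scores are invented for the example.

```python
# Illustrative sketch (not from the book): naive Bayes style fusion of
# per-modality evidence about which object a user is referring to.
# All names and numbers below are made up.

def fuse_modalities(prior, likelihoods):
    """Combine a prior over candidate objects with per-modality
    likelihoods P(observation | object), assuming the modalities are
    conditionally independent given the object."""
    posterior = {}
    for obj, p in prior.items():
        score = p
        for modality_scores in likelihoods.values():
            # Small floor avoids zeroing out objects a modality did not score.
            score *= modality_scores.get(obj, 1e-9)
        posterior[obj] = score
    total = sum(posterior.values())
    return {obj: s / total for obj, s in posterior.items()}

if __name__ == "__main__":
    prior = {"red cube": 0.4, "blue cylinder": 0.3, "green bar": 0.3}
    likelihoods = {
        "speech":  {"red cube": 0.7, "blue cylinder": 0.1, "green bar": 0.2},  # "the red one"
        "gesture": {"red cube": 0.5, "blue cylinder": 0.4, "green bar": 0.1},  # pointing direction
        "vision":  {"red cube": 0.6, "blue cylinder": 0.2, "green bar": 0.2},  # shape/color detector
    }
    print(fuse_modalities(prior, likelihoods))
```

Each modality alone is ambiguous, but multiplying the likelihoods concentrates the posterior on the object that all channels weakly agree on, which is the intuition behind probabilistic integration of modality outputs.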
Multi-modal scene understanding using probabilistic models, Sven Wachsmuth
Parameters
- Language
- Year of publication: 2003
- Binding: softcover (paperback)