Text, Speech, and Dialogue, Paperback
Text, Speech, and Dialogue
- 28th International Conference, TSD 2025, Erlangen, Germany, August 25-28, 2025, Proceedings, Part I
(subject to availability from the supplier)
- Editors:
- Kamil Ekštein, Miloslav Konopík, Ondřej Pražák, František Pártl
- Publisher:
- Springer, 08/2025
- Binding:
- Paperback
- Language:
- English
- ISBN-13:
- 9783032025470
- Item number:
- 12357957
- Length:
- 432 pages
- Weight:
- 651 g
- Dimensions:
- 235 x 155 mm
- Thickness:
- 24 mm
- Publication date:
- 22 August 2025
- Note:
- Please note: this item is not in German.
Other editions of Text, Speech, and Dialogue | Price |
---|---|
Book, paperback, English | EUR 142.37* |
Blurb
.- Speech.
.- Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR.
.- An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training.
.- Optimizing ASR Models with Semantic Information.
.- Efficient Enhancement of Norwegian ASR Model.
.- Towards Stable and Personalised Profiles for Lexical Alignment in Spoken Human-Agent Dialogue.
.- Audio–Vision Contrastive Learning for Phonological Class Recognition.
.- TOSD-Net: A CNN-Transformer Architecture for Robust Frame-Level Overlapping Speech Detection in Diverse Acoustic Conditions.
.- An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS.
.- Emotion-Aware Speech-Driven Facial Avatar Animation via Joint Blendshape Prediction and Emotion Recognition.
.- Beyond Static Emotions: Leveraging Multitask Learning to Model Dynamics of Dimensional Affect in Speech.
.- Implicit Speaker Group Encoding in Self-supervised Speech Recognition Models.
.- Combining Temporal Visual Dynamics and Audio Representations for Robust Speaker Identification.
.- Sentences vs Phrases in Neural Speech Synthesis: the Phrases Strike Back.
.- Evaluating Phoneme-Level Pretraining in Czech Text-to-Speech Synthesis.
.- Unifying Global and Near-Context Biasing in a Single Trie Pass.
.- Synthesising Cross-Speaker Data for Low-Resource Pathological Speech Recognition with PEFT.
.- Multilingual Stutter Event Detection for English, German, and Mandarin Speech.
.- How Far Can Synthetic Speech Go? Enhancing ASR in Low-Resource Scenarios via Voice Cloning.
.- Enhancing Detection of Parkinson-induced Dysarthria with Cross-lingual Transfer Learning.
.- Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks.
.- Detection of Cognitive Disorders Using ASR-Based Nonsense Words Repetition.
.- Mind the Gap: Entity-Preserved Context-Aware ASR for Structured Transcriptions.
.- Boosting CTC-Based ASR Using LLM-Based Intermediate Loss Regularization.
.- Robust Disfluency Labeling in Spontaneous Speech: Insights from Diverse Hungarian Corpora Including Mentally Ill Speakers.
.- ParCzech4Speech: A New Speech Corpus Derived from Czech Parliamentary Data.
.- Towards an Accurate Domain-Specific ASR: Transcription for Pathology.
.- Automated Speaking Assessment for L2 Learners of Czech.
.- Inclusive ASR for Critical Public Services: Debiasing with Actor-Simulated Speech.
.- RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification.
.- Systematic FAIRness Assessment of Open Voice Biomarker Datasets for Mental Health and Neurodegenerative Diseases.
.- When Silence Speaks: Understanding Open-Ended Responses via LLMs in Therapeutic Voice Interaction.
.- Multilingual Domain Adaptation for Speech Recognition Using LLMs.
.- Using Cross-attention For Conversational ASR Over The Telephone.
