Audio Speech Recognition with GLM-ASR-2512

1

@z_ai/mcp-server | MCP Server | 43/100

via “audio speech recognition with glm-asr-2512”

MCP Server for Z.AI - A Model Context Protocol server that provides AI capabilities

Unique: Provides an MCP interface to the GLM-ASR-2512 speech recognition model, with streaming support for long audio, so MCP-based agents can accept voice input without separate audio-processing infrastructure

vs others: Simpler than managing separate ASR APIs; integrated into the Z.AI MCP server alongside text, vision, and video models

2

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (AudioGPT) | Product | 23/100

via “speech-to-text-understanding-via-asr”

* ⭐ 05/2023: [ImageBind: One Embedding Space To Bind Them All (ImageBind)](https://openaccess.thecvf.com/content/CVPR2023/html/Girdhar_ImageBind_One_Embedding_Space_To_Bind_Them_All_CVPR_2023_paper.html)

Unique: unknown. There is insufficient data on the ASR architecture, model selection, or implementation approach. The paper abstract does not specify whether AudioGPT uses proprietary ASR, open-source models (Whisper, etc.), or custom foundation models.

vs others: unknown. No performance benchmarks, accuracy metrics, or latency comparisons against alternative ASR systems are provided.

3

CS224S: Spoken Language Processing - Stanford University | Product | 21/100

via “speech recognition system architecture and design”

Level: Medium

Unique: Bridges classical statistical ASR (HMMs, GMMs) with modern neural approaches, teaching both the historical context and current best practices. Emphasizes the modular pipeline architecture (acoustic model → language model → decoder) rather than treating end-to-end models as black boxes.

vs others: More comprehensive than industry tutorials focused on using pre-trained models; more practical than purely theoretical courses on speech signal processing
