io.github.rsmdt/multimodal
Multi-provider media generation — images, video, audio, and transcription via a unified interface
Verdict not yet evaluated for this tool. The semantic screen takes adversarial cases first; coverage rolls out as the corpus expands (15/150 labels to graduation). The deterministic conformance probe is built but has not yet run on the public corpus, so a recorded verdict here is REVIEW or UNVERIFIED, never a clearing ALLOW. Until a verdict is recorded, an agent should treat this tool as not-yet-cleared and fall back to its own checks.
OPENAI_API_KEY: OpenAI API key for image, video, and audio generation and transcription
XAI_API_KEY: xAI API key for image and video generation
GEMINI_API_KEY: Google Gemini API key for image, video, and audio generation
ELEVENLABS_API_KEY: ElevenLabs API key for audio generation and transcription
BFL_API_KEY: BFL API key for FLUX image generation and editing
MEDIA_OUTPUT_DIR: Directory for saved media files (defaults to cwd)
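As a rough illustration of how the variables above relate to the media types they unlock, here is a minimal Python sketch. It is not taken from the server's code: the `PROVIDER_KEYS` table, `available_capabilities`, and `output_dir` are hypothetical names, and the capability labels are simply copied from the descriptions in this listing. Whether the server needs every key or only those for the providers you use is not stated here, so the sketch just reports what is set.

```python
import os
from typing import Mapping

# Hypothetical table: capability labels are taken from the listing's
# descriptions, not from the server's source code.
PROVIDER_KEYS = {
    "OPENAI_API_KEY": {"image", "video", "audio", "transcription"},
    "XAI_API_KEY": {"image", "video"},
    "GEMINI_API_KEY": {"image", "video", "audio"},
    "ELEVENLABS_API_KEY": {"audio", "transcription"},
    "BFL_API_KEY": {"image"},
}


def available_capabilities(env: Mapping[str, str]) -> dict[str, list[str]]:
    """Map each capability to the key variables that are set and non-empty."""
    result: dict[str, list[str]] = {}
    for key, caps in PROVIDER_KEYS.items():
        if env.get(key, "").strip():
            for cap in caps:
                result.setdefault(cap, []).append(key)
    # Sort for stable, readable output.
    return {cap: sorted(keys) for cap, keys in sorted(result.items())}


def output_dir(env: Mapping[str, str]) -> str:
    """Return MEDIA_OUTPUT_DIR if set, else the current working directory."""
    return env.get("MEDIA_OUTPUT_DIR") or os.getcwd()


if __name__ == "__main__":
    for cap, keys in available_capabilities(os.environ).items():
        print(f"{cap}: {', '.join(keys)}")
    print("output dir:", output_dir(os.environ))
```

Running the script before launching the server gives a quick view of which media types have at least one configured key and where files would be saved under the stated default.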
Run 150+ AI apps — image, video, audio, LLMs, 3D and more. Browse, execute, stream results.
Provision private AI model endpoints on dedicated GPUs (Llama, Qwen, Mistral). Pay per minute.
Contabo API (v1.0.0) as MCP tools for cloud provisioning and management. Powered by HAPI MCP server.