Model Overview
- High-quality sparse mixture-of-experts (SMoE) model with open weights
- Licensed under Apache 2.0
- Outperforms Llama 2 70B with 6x faster inference
- Offers one of the best cost/performance trade-offs in its class, matching or surpassing GPT-3.5
Capabilities
- Handles a context of 32k tokens
- Supports English, French, Italian, German, and Spanish
- Strong performance in code generation (see the generation sketch after this list)
- Can be fine-tuned for instruction-following tasks, reaching 8.3 on MT-Bench
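To make the code-generation capability concrete, here is a minimal sketch of loading the base model through Hugging Face transformers and completing a code prompt. The checkpoint name, dtype/device settings, and prompt are assumptions rather than part of the release notes, and the full model requires substantial GPU memory (plus the accelerate package for `device_map="auto"`).

```python
# Illustrative sketch: generate a code completion with the base Mixtral
# checkpoint via Hugging Face transformers. Model id and hardware settings
# are assumptions; adapt them to your environment.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-v0.1"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # requires the accelerate package
    torch_dtype="auto",  # pick the dtype stored in the checkpoint
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```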
Sparse Architectures
- Sparse mixture-of-experts network
- Decoder-only model whose feedforward block picks from 8 distinct groups of parameters (experts)
- A router network selects two experts to process each token and combines their outputs (see the routing sketch after this list)
- 46.7B total parameters, but only 12.9B active parameters per token
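To illustrate the routing idea, the following is a simplified sketch of a sparse mixture-of-experts feedforward layer with top-2 routing in PyTorch. The layer sizes, module structure, and plain Python loops are assumptions chosen for readability; this is not Mixtral's actual implementation.

```python
# Illustrative sparse MoE feedforward layer with top-2 routing:
# a router scores all experts per token, the two best experts process
# the token, and their outputs are combined with normalized weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One small feedforward network per expert (simplified; sizes are assumptions).
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # Router that scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):
        # x: (n_tokens, d_model)
        logits = self.router(x)                             # (n_tokens, n_experts)
        weights, chosen = logits.topk(self.top_k, dim=-1)   # top-2 experts per token
        weights = F.softmax(weights, dim=-1)                # normalize the two scores
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e                 # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 512)
print(SparseMoELayer()(tokens).shape)  # torch.Size([4, 512])
```

Because only two of the eight experts run for each token, the number of active parameters per token stays far below the total parameter count, which is the source of the cost/latency advantage described above.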
Performance Comparison
- Outperforms Llama 2 70B and matches or exceeds GPT-3.5 on most benchmarks
- More efficient than the Llama 2 family at a comparable quality level
- Detailed benchmark results are provided in the release announcement
Bias and Language
- Shows less bias than Llama 2 on the BBQ benchmark
- Displays more positive sentiment than Llama 2 on BOLD, with similar variance within each dimension
Instructed Models
- Mixtral 8x7B Instruct, optimized for instruction following, is released alongside the base model (a prompt-formatting sketch follows this list)
- Reaches a score of 8.30 on MT-Bench, comparable to GPT-3.5
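The instruct model expects prompts wrapped in its `[INST] ... [/INST]` chat format. The sketch below produces that format with the tokenizer's chat template; the checkpoint name is an assumption, and the exact template comes from the tokenizer itself.

```python
# Sketch: format a conversation for Mixtral 8x7B Instruct using the
# tokenizer's built-in chat template (checkpoint name is an assumption).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
messages = [{"role": "user", "content": "Summarize what a sparse mixture-of-experts model is."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # e.g. "<s>[INST] Summarize what a sparse mixture-of-experts model is. [/INST]"
```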
Open-Source Deployment
- Changes submitted to the vLLM project enable fully open-source deployment (a local-serving sketch follows this list)
- SkyPilot can be used to deploy vLLM endpoints on any cloud instance
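As a rough illustration of local open-source serving, the following sketch uses vLLM's offline Python API. The model id, tensor-parallel degree, and sampling settings are assumptions and should be adapted to the available hardware.

```python
# Sketch: serve Mixtral locally with the open-source vLLM library.
# Model id, tensor_parallel_size, and sampling settings are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["[INST] Explain mixture-of-experts routing briefly. [/INST]"], params)
print(outputs[0].outputs[0].text)
```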
Platform Usage
- Mixtral 8x7B is served behind the mistral-small endpoint, currently in beta (an API-call sketch follows this list)
- Registration is open for early access to all generative and embedding endpoints
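Calling the hosted mistral-small endpoint looks roughly like the sketch below, which assumes Mistral's OpenAI-compatible chat-completions route and an API key in the MISTRAL_API_KEY environment variable; consult the official platform docs for the authoritative request format.

```python
# Sketch: query the mistral-small endpoint on Mistral's platform.
# URL and payload shape are assumptions based on the OpenAI-compatible
# chat API; MISTRAL_API_KEY must be set in the environment.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small",
        "messages": [{"role": "user", "content": "Say hello in French, Italian, German and Spanish."}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```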
Acknowledgements
- Thanks to CoreWeave and Scaleway teams for technical support during model training.