Supported Models

Flywheel supports a variety of Transformer models. Below we list the model architectures that are currently supported. For each architecture, we name some popular models that use it, give example model identifiers, and indicate whether it is supported on NVIDIA and AMD hardware (✅ means supported, - means not supported).

| Architecture | Models | Example Models | NVIDIA | AMD |
| --- | --- | --- | --- | --- |
| LlamaForCausalLM | CodeLlama, Llama 1, Llama 2, Llama 3, Smaug, Yi, etc. | code-llama/CodeLlama-7B, meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-70B, abacusai/Smaug-34B, 01-ai/Yi-9B, etc. | ✅ | ✅ |
| MistralForCausalLM | Mistral | mistralai/Mistral-7B, etc. | ✅ | ✅ |
| MixtralForCausalLM | Mixtral | mistralai/Mixtral-8x7B, mistralai/Mixtral-8x22B, etc. | ✅ | ✅ |
| GemmaForCausalLM | Gemma 2 | google/gemma2-27B | ✅ | - |
| Qwen2ForCausalLM | Qwen1.5 | Qwen/Qwen1.5-7B, Qwen/Qwen1.5-11B, etc. | ✅ | ✅ |
| MPTForCausalLM | Mosaic, SEA-LION | aisingapore/sea-lion-7B, mosaicml/mpt-7B | ✅ | - |
| DbrxForCausalLM | dbrx | databricks/dbrx-instruct | - | ✅ |
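
To check a model against the support matrix above, one option is to compare the architecture name in the model's configuration with the listed architectures. The sketch below is illustrative only and is not a Flywheel API. It assumes the architecture names correspond to the `architectures` field of a Hugging Face-style `config.json`; the `SUPPORT` dictionary and the `check_support` helper are hypothetical names introduced here, with values transcribed from the list above.

```python
import json

# Support matrix transcribed from the list above: architecture -> (NVIDIA, AMD).
# This dictionary is a hand-made copy for illustration, not something Flywheel exports.
SUPPORT = {
    "LlamaForCausalLM": (True, True),
    "MistralForCausalLM": (True, True),
    "MixtralForCausalLM": (True, True),
    "GemmaForCausalLM": (True, False),
    "Qwen2ForCausalLM": (True, True),
    "MPTForCausalLM": (True, False),
    "DbrxForCausalLM": (False, True),
}


def check_support(config_text: str, vendor: str) -> bool:
    """Return True if any architecture named in the config is listed as supported.

    config_text is the raw JSON of a config.json file (assumed to carry an
    "architectures" list); vendor is "nvidia" or "amd", case-insensitive.
    """
    architectures = json.loads(config_text).get("architectures", [])
    column = {"nvidia": 0, "amd": 1}[vendor.lower()]
    # Unknown architectures are treated as unsupported on both vendors.
    return any(SUPPORT.get(name, (False, False))[column] for name in architectures)


if __name__ == "__main__":
    sample = '{"architectures": ["DbrxForCausalLM"]}'
    print("NVIDIA:", check_support(sample, "nvidia"))
    print("AMD:", check_support(sample, "amd"))
```

Treating unknown architectures as unsupported keeps the check conservative: a model whose architecture is not listed fails the lookup rather than being assumed to work.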