GenAI Safety Leaderboard

Compare safety ratings and risk scores of leading AI models

| Rank | Model | Performance | Performance vs Risk | Risk Score | Source |
|---|---|---|---|---|---|
| 1 | gemini-1.5-pro-exp-0801 | 85.90% | 2.77:1 (Excellent) | 31 | Google |
| 2 | gemini-1.5-pro-latest | 85.90% | 3.07:1 (Excellent) | 28 | Google |
| 3 | gemma-2-27b-it | 75.20% | 2.51:1 (Excellent) | 30 | HuggingFace |
| 4 | Reflection-Llama-3.1-70B | 79.00% | 2.32:1 (Excellent) | 34 | HuggingFace |
| 5 | Llama-2-7B-Chat-GGUF-8bit | 45.80% | 1.35:1 (Good) | 34 | HuggingFace |
| 6 | Llama-2-7B-Chat-GGUF-4bit | 45.80% | 1.35:1 (Good) | 34 | HuggingFace |
| 7 | SmolLM-360M-Instruct | 34.17% | 0.92:1 (Poor) | 37 | HuggingFace |
| 8 | llama-2-7b-chat-hf | 47.33% | 1.28:1 (Good) | 37 | Together.ai |
| 9 | flan-ul2 | 55.56% | 1.39:1 (Good) | 40 | Google |
| 10 | o1-preview | 90.80% | 2.27:1 (Excellent) | 40 | OpenAI |
| 11 | Llama-3-8B-Instruct-RR | 68.40% | 1.40:1 (Good) | 49 | HuggingFace |
| 12 | claude-3-opus-20240229 | 88.20% | 2.00:1 (Excellent) | 44 | Anthropic |
| 13 | gpt-4-0125-preview | 86.40% | 1.73:1 (Good) | 50 | OpenAI |
| 14 | sarvam-2b-v0.5 | N/A | N/A (Poor) | 46 | HuggingFace |
| 15 | Llama-3-8B-Instruct-MopeyMule | 68.40% | 1.49:1 (Good) | 46 | HuggingFace |
| 17 | claude-3-5-sonnet-20240620 | 88.70% | 1.74:1 (Good) | 51 | Anthropic |
| 18 | sea-lion-7b-instruct | 26.87% | 0.50:1 (Poor) | 54 | HuggingFace |
| 19 | claude-instant-1.2 | 73.40% | 1.27:1 (Good) | 58 | Anthropic |
| 20 | PowerLM-3b-EAI-Aligned | 31.40% | 0.51:1 (Poor) | 61 | HuggingFace |
| 21 | gpt-4-turbo-2024-04-09 | 86.67% | 1.49:1 (Good) | 58 | OpenAI |
| 22 | Meta-Llama-3.1-8B-Instruct-Turbo | 69.40% | 1.22:1 (Good) | 57 | Together.ai |
| 23 | RakutenAI-7B-chat | 60.32% | 1.06:1 (Good) | 57 | HuggingFace |
| 24 | gemma-2-2b-it | 42.30% | 0.74:1 (Poor) | 57 | Google |
| 25 | Meta-Llama-3-8B-Instruct | 66.54% | 1.06:1 (Good) | 63 | Meta |
| 26 | o1-mini | 85.20% | 1.37:1 (Good) | 62 | OpenAI |
| 27 | Mistral-7B-v0.1 | 60.10% | 0.99:1 (Poor) | 61 | HuggingFace |
| 28 | QwQ-32B-Preview | N/A | N/A (Poor) | 60 | Together.ai |
| 29 | Llama-2-13b-chat-hf | 54.80% | 0.84:1 (Poor) | 65 | Together.ai |
| 30 | Mistral-7B-Instruct-v0.2-EAI-Aligned | 61.00% | 0.97:1 (Poor) | 63 | HuggingFace |
| 31 | h2o-danube3-500m-chat | 26.33% | 0.42:1 (Poor) | 62 | HuggingFace |
| 32 | granite-3.0-1b-a400m-instruct | 32.00% | 0.52:1 (Poor) | 61 | HuggingFace |
| 34 | mistral.mistral-7b-instruct-v0.2 | 55.40% | 0.76:1 (Poor) | 73 | AWS |
| 35 | Llama-2-70b-chat-hf | 63.90% | 0.93:1 (Poor) | 69 | Together.ai |
| 36 | gemma-2-9b-it | 31.94% | 0.47:1 (Poor) | 68 | Google |
| 37 | internlm2-chat-20b | 66.50% | 1.11:1 (Good) | 60 | HuggingFace |
| 38 | Llama-3.2-1B-instruct | 49.30% | 0.75:1 (Poor) | 66 | HuggingFace |
| 39 | gemma-2-9b | 71.30% | 1.08:1 (Good) | 66 | Google |
| 40 | Llama-3.2-3B-Instruct | 63.40% | 1.01:1 (Good) | 63 | HuggingFace |
| 41 | NexusRaven-V2-13B | 44.88% | 0.67:1 (Poor) | 67 | HuggingFace |
| 42 | Qwen2.5-0.5B-Instruct | 24.10% | 0.37:1 (Poor) | 66 | HuggingFace |
| 43 | Qwen2.5-1.5B-Instruct | 50.70% | 0.78:1 (Poor) | 65 | HuggingFace |
| 44 | komodo-7b-base | N/A | N/A (Poor) | 67 | HuggingFace |
| 45 | gpt-4o | 88.70% | 1.25:1 (Good) | 71 | OpenAI |
| 46 | phi-2 | 58.40% | 0.90:1 (Poor) | 65 | HuggingFace |
| 47 | Llama-3.2-11B-Vision-Instruct-Turbo | 73.00% | 0.99:1 (Poor) | 74 | Together.ai |
| 48 | phi3-medium-128K | 78.20% | 1.13:1 (Good) | 69 | Microsoft |
| 49 | gemma-7b-it | 66.10% | 0.96:1 (Poor) | 69 | Together.ai |
| 50 | claude-3-haiku-20240307 | 76.70% | 1.01:1 (Good) | 76 | Anthropic |
| 51 | Meta-Llama-3.1-65B-Instruct-Turbo | 88.60% | 1.27:1 (Good) | 70 | Together.ai |
| 52 | SmolLM-1.7B-instruct | 39.97% | 0.56:1 (Poor) | 71 | HuggingFace |
| 53 | Mistral-NaMo-Meitron-8B-Instruct | 70.40% | 1.07:1 (Good) | 66 | HuggingFace |
| 54 | gpt-4o-2024-08-06 | 88.70% | 1.20:1 (Good) | 74 | OpenAI |
| 55 | granite-3.0-2b-a800m-instruct | 50.16% | 0.70:1 (Poor) | 72 | HuggingFace |
| 56 | amazon.nova-pro-v1.0 | 85.90% | 1.09:1 (Good) | 79 | AWS |
| 57 | amazon.nova-lite-v1.0 | 80.50% | 0.99:1 (Poor) | 81 | AWS |
| 58 | Meta-Llama-3-70B-Instruct | 82.00% | 1.05:1 (Good) | 78 | Meta |
| 59 | Starling-LM-7B-beta-GGUF-4bit | 63.90% | 0.91:1 (Poor) | 70 | HuggingFace |
| 60 | Smaug-72B-v0.1 | 77.15% | 0.98:1 (Poor) | 79 | HuggingFace |
| 61 | claude-3-5-haiku-20240122 | 79.63% | 0.98:1 (Poor) | 81 | Anthropic |
| 62 | gpt-3.5-turbo | 70.00% | 0.84:1 (Poor) | 83 | OpenAI |
| 63 | granite-3.0-8b-instruct | 65.82% | 0.79:1 (Poor) | 83 | HuggingFace |
| 64 | CodeLlama-7b-instruct-hf | 34.54% | 0.44:1 (Poor) | 79 | HuggingFace |
| 65 | Smaug-Llama-3-70B-Instruct | 79.20% | 1.00:1 (Good) | 79 | HuggingFace |
| 66 | Mistral-8x7B-instruct-v0.1 | 70.33% | 0.91:1 (Poor) | 77 | HuggingFace |
| 67 | jamba-instruct-preview | N/A | N/A (Poor) | 74 | ADHstudio |
| 68 | Liquid-40B | 78.76% | 1.09:1 (Good) | 72 | Microsoft |
| 69 | Mixtral-8x22B-instruct-v0.1 | 77.71% | 1.01:1 (Good) | 77 | Together.ai |
| 70 | SeaLM-7B-v2 | 64.90% | 0.76:1 (Poor) | 83 | HuggingFace |
| 71 | Qwen2-72B-instruct | 82.30% | 1.03:1 (Good) | 80 | HuggingFace |
| 72 | Qwen1.5-14B-Chat | 68.52% | 0.85:1 (Poor) | 81 | Together.ai |
| 73 | Yi-34B-Chat | 75.70% | 0.97:1 (Poor) | 78 | HuggingFace |
| 74 | Smaug-34B-v0.1 | 77.29% | 0.89:1 (Poor) | 87 | HuggingFace |
| 75 | c4ai-command-r-plus | 75.70% | 0.97:1 (Poor) | 78 | HuggingFace |
| 76 | mistral-small-latest | 72.20% | 0.88:1 (Poor) | 82 | Mistral |
| 77 | Qwen2.7B-Instruct | 70.50% | 0.87:1 (Poor) | 81 | HuggingFace |
| 78 | Qwen2.5-7.2B-instruct | 86.80% | 1.10:1 (Good) | 79 | HuggingFace |
| 79 | Mistral-7B-Instruct-v0.2-GGUF-4bit | N/A | N/A (Poor) | 79 | HuggingFace |
| 80 | Meta-Llama-3.1-70B-Instruct-Turbo | 83.36% | 1.04:1 (Good) | 80 | Together.ai |
| 81 | K2-Chat | 63.50% | 0.77:1 (Poor) | 83 | HuggingFace |
| 82 | Phi-3-mini-4k-instruct | 68.80% | 0.83:1 (Poor) | 83 | HuggingFace |
| 83 | Starling-LM-7B-beta | 63.90% | 0.75:1 (Poor) | 85 | HuggingFace |
| 84 | OLMo-6-7B-0924-instruct | N/A | N/A (Poor) | 83 | HuggingFace |
| 85 | Mistral-7B-Instruct-v0.2-GGUF-8bit | N/A | N/A (Poor) | 83 | HuggingFace |
| 86 | h2o-danube3-4b-chat | 54.74% | 0.67:1 (Poor) | 82 | HuggingFace |
| 87 | RakutenAI-7B-Instruct | 60.32% | 0.76:1 (Poor) | 79 | HuggingFace |
| 88 | Mistral-7B-instruct-v0.2 | 60.07% | 0.72:1 (Poor) | 84 | HuggingFace |
| 89 | granite-3.0-2b-instruct | 56.03% | 0.64:1 (Poor) | 87 | HuggingFace |
| 90 | jamba-1.5-mini | 69.70% | 0.80:1 (Poor) | 87 | ADHstudio |
| 91 | aye-23-35B | 58.20% | 0.68:1 (Poor) | 86 | HuggingFace |
| 92 | jamba-1.5-large | 80.00% | 0.93:1 (Poor) | 88 | ADHstudio |
| 93 | Phi-3-small-8k-instruct | 71.10% | 0.81:1 (Poor) | 88 | HuggingFace |
| 94 | Phi-3-small-128k-instruct | 75.30% | 0.87:1 (Poor) | 87 | HuggingFace |
| 95 | Qwen2.5-5B-Instruct | 64.40% | 0.75:1 (Poor) | 86 | HuggingFace |
| 96 | Llama-3.1-Nemotron-70B-instruct-hf | 83.51% | 0.95:1 (Poor) | 88 | Together.ai |
| 97 | zephyr-7b-beta | 61.07% | 0.72:1 (Poor) | 85 | HuggingFace |
| 98 | Qwen2.5-7B-instruct | N/A | N/A (Poor) | 84 | HuggingFace |
| 99 | PowerMoE-3b | 42.80% | 0.48:1 (Poor) | 90 | HuggingFace |
| 100 | LongWriter-gm4-9b | 58.70% | 0.66:1 (Poor) | 89 | HuggingFace |
| 101 | Qwen2.5-32B-Instruct | 83.90% | 0.95:1 (Poor) | 88 | HuggingFace |
| 102 | Mistral-7B-Instruct-v0.1-GGUF-4bit | N/A | N/A (Poor) | 83 | HuggingFace |
| 103 | snowflake-arctic-instruct | 67.30% | 0.75:1 (Poor) | 90 | Snowflake |
| 104 | Qwen2.5-7B-Instruct | 75.40% | 0.84:1 (Poor) | 90 | HuggingFace |
| 105 | palm-2-chat-bison | 78.30% | 0.92:1 (Poor) | 85 | Google |
| 106 | Mistral-7B-instruct-v0.1-GGUF-8bit | N/A | N/A (Poor) | 86 | HuggingFace |
| 107 | glm-4-9b-chat | 56.60% | 0.63:1 (Poor) | 90 | HuggingFace |
| 108 | Phi-3-medium-4k-instruct | 78.00% | 0.87:1 (Poor) | 90 | HuggingFace |
| 109 | aya-23-8B | 48.20% | 0.55:1 (Poor) | 87 | HuggingFace |
| 110 | Llama-3.1-70B-Instruct-Turbo | 86.00% | 0.95:1 (Poor) | 91 | Together.ai |
| 111 | Mistral-7B-instruct-v0.3 | 81.84% | 0.89:1 (Poor) | 90 | Together.ai |
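
The page does not say how the Performance vs Risk column is computed. In most rows, though, the ratio matches the performance percentage divided by the risk score (for example, 85.90 / 31 ≈ 2.77), and a few rows differ by a couple of hundredths. The sketch below reproduces that apparent relationship. The function name `performance_vs_risk` and the label cutoffs (2.0 for Excellent, 1.0 for Good) are assumptions inferred from the listed values, not a published formula; the table only shows that the Excellent cutoff lies somewhere above 1.74 and at or below 2.00.

```python
from typing import Optional


def performance_vs_risk(performance_pct: Optional[float], risk_score: float) -> str:
    """Format a ratio and label the way the leaderboard column appears to.

    Assumption: ratio = performance percentage / risk score, rounded to two
    decimals, with label cutoffs inferred from the listed rows.
    """
    if performance_pct is None:
        # Rows without a performance figure are listed as N/A and labelled Poor.
        return "N/A (Poor)"

    ratio = round(performance_pct / risk_score, 2)

    if ratio >= 2.0:      # assumed cutoff; the true one is in (1.74, 2.00]
        label = "Excellent"
    elif ratio >= 1.0:    # 1.00:1 is listed as Good, 0.99:1 as Poor
        label = "Good"
    else:
        label = "Poor"

    return f"{ratio:.2f}:1 ({label})"


if __name__ == "__main__":
    print(performance_vs_risk(85.90, 31))  # gemini-1.5-pro-exp-0801
    print(performance_vs_risk(79.20, 79))  # Smaug-Llama-3-70B-Instruct
    print(performance_vs_risk(None, 46))   # sarvam-2b-v0.5
```

Because this is reverse-engineered from the displayed numbers, treat it as a reading aid for the column rather than a description of how the leaderboard is actually built.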

What is Risk Score?

The risk score is the average of the risk measured in four categories: jailbreak susceptibility, bias potential, malware presence, and toxicity assessment. The lower the score, the lower the risk.
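
As a rough illustration of that definition, the sketch below averages four category scores. It assumes an unweighted arithmetic mean with every category on the same 0 to 100 scale; the source only says "average", so the weighting is an assumption, and the function name `risk_score` and the sample values are hypothetical.

```python
from statistics import mean


def risk_score(jailbreak: float, bias: float, malware: float, toxicity: float) -> float:
    """Return the overall risk as the plain mean of the four category scores.

    Assumption: equal weights and a shared 0-100 scale for every category.
    """
    return mean([jailbreak, bias, malware, toxicity])


if __name__ == "__main__":
    # Hypothetical category scores, not taken from the leaderboard.
    print(risk_score(jailbreak=18.0, bias=40.0, malware=30.0, toxicity=36.0))
```

Under these assumptions a model that does badly in one category can still post a moderate overall score, so the per-category numbers are worth checking alongside the average.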

How to read the Jailbreak, Bias, Malware, and Toxicity scores?

A Jailbreak score of 18% indicates that 18% of the jailbreak tests successfully breached the LLM.
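
Read this way, a category score is the share of tests that succeeded against the model. The sketch below computes that percentage from a list of per-test outcomes. The function name `breach_rate_pct` and the test counts are hypothetical, and applying the same reading to the bias, malware, and toxicity scores is an assumption, since the source spells out only the jailbreak case.

```python
from typing import Sequence


def breach_rate_pct(outcomes: Sequence[bool]) -> float:
    """Return the percentage of tests that breached the model.

    Each entry is True when the test succeeded against the model
    and False when the model held up.
    """
    if not outcomes:
        raise ValueError("at least one test outcome is required")
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Hypothetical run: 9 of 50 jailbreak attempts succeed.
    outcomes = [True] * 9 + [False] * 41
    print(f"{breach_rate_pct(outcomes):.0f}%")
```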

Data Source: enkryptai.com