GenAI Safety Leaderboard
Compare safety ratings and risk scores of leading AI models
| Rank | Model | Rating | Performance | Performance vs Risk | Risk Score |
| --- | --- | --- | --- | --- | --- |
| 1 | gemini-1.5-pro-exp-0801 | Excellent | 85.90% | 2.77:1 | 31 |
| 2 | gemini-1.5-pro-latest | Excellent | 85.90% | 3.07:1 | 28 |
| 3 | gemma-2-27b-it | Excellent | 75.20% | 2.51:1 | 30 |
| 4 | Reflection-Llama-3.1-70B | Excellent | 79.00% | 2.32:1 | 34 |
| 5 | Llama-2-7B-Chat-GGUF-8bit | Good | 45.80% | 1.35:1 | 34 |
| 6 | Llama-2-7B-Chat-GGUF-4bit | Good | 45.80% | 1.35:1 | 34 |
| 7 | SmolLM-360M-Instruct | Poor | 34.17% | 0.92:1 | 37 |
| 8 | llama-2-7b-chat-hf | Good | 47.33% | 1.28:1 | 37 |
| 9 | flan-ul2 | Good | 55.56% | 1.39:1 | 40 |
| 10 | o1-preview | Excellent | 90.80% | 2.27:1 | 40 |
| 11 | Llama-3-8B-Instruct-RR | Good | 68.40% | 1.40:1 | 49 |
| 12 | claude-3-opus-20240229 | Excellent | 88.20% | 2.00:1 | 44 |
| 13 | gpt-4-0125-preview | Good | 86.40% | 1.73:1 | 50 |
| 14 | sarvam-2b-v0.5 | Poor | N/A | N/A | 46 |
| 15 | Llama-3-8B-Instruct-MopeyMule | Good | 68.40% | 1.49:1 | 46 |
| 17 | claude-3-5-sonnet-20240620 | Good | 88.70% | 1.74:1 | 51 |
| 18 | sea-lion-7b-instruct | Poor | 26.87% | 0.50:1 | 54 |
| 19 | claude-instant-1.2 | Good | 73.40% | 1.27:1 | 58 |
| 20 | PowerLM-3b-EAI-Aligned | Poor | 31.40% | 0.51:1 | 61 |
| 21 | gpt-4-turbo-2024-04-09 | Good | 86.67% | 1.49:1 | 58 |
| 22 | Meta-Llama-3.1-8B-Instruct-Turbo | Good | 69.40% | 1.22:1 | 57 |
| 23 | RakutenAI-7B-chat | Good | 60.32% | 1.06:1 | 57 |
| 24 | gemma-2-2b-it | Poor | 42.30% | 0.74:1 | 57 |
| 25 | Meta-Llama-3-8B-Instruct | Good | 66.54% | 1.06:1 | 63 |
| 26 | o1-mini | Good | 85.20% | 1.37:1 | 62 |
| 27 | Mistral-7B-v0.1 | Poor | 60.10% | 0.99:1 | 61 |
| 28 | QwQ-32B-Preview | Poor | N/A | N/A | 60 |
| 29 | Llama-2-13b-chat-hf | Poor | 54.80% | 0.84:1 | 65 |
| 30 | Mistral-7B-Instruct-v0.2-EAI-Aligned | Poor | 61.00% | 0.97:1 | 63 |
| 31 | h2o-danube3-500m-chat | Poor | 26.33% | 0.42:1 | 62 |
| 32 | granite-3.0-1b-a400m-instruct | Poor | 32.00% | 0.52:1 | 61 |
| 34 | mistral.mistral-7b-instruct-v0.2 | Poor | 55.40% | 0.76:1 | 73 |
| 35 | Llama-2-70b-chat-hf | Poor | 63.90% | 0.93:1 | 69 |
| 36 | gemma-2-9b-it | Poor | 31.94% | 0.47:1 | 68 |
| 37 | internlm2-chat-20b | Good | 66.50% | 1.11:1 | 60 |
| 38 | Llama-3.2-1B-instruct | Poor | 49.30% | 0.75:1 | 66 |
| 39 | gemma-2-9b | Good | 71.30% | 1.08:1 | 66 |
| 40 | Llama-3.2-3B-Instruct | Good | 63.40% | 1.01:1 | 63 |
| 41 | NexusRaven-V2-13B | Poor | 44.88% | 0.67:1 | 67 |
| 42 | Qwen2.5-0.5B-Instruct | Poor | 24.10% | 0.37:1 | 66 |
| 43 | Qwen2.5-1.5B-Instruct | Poor | 50.70% | 0.78:1 | 65 |
| 44 | komodo-7b-base | Poor | N/A | N/A | 67 |
| 45 | gpt-4o | Good | 88.70% | 1.25:1 | 71 |
| 46 | phi-2 | Poor | 58.40% | 0.90:1 | 65 |
| 47 | Llama-3.2-11B-Vision-Instruct-Turbo | Poor | 73.00% | 0.99:1 | 74 |
| 48 | Phi-3-medium-128k-instruct | Good | 78.20% | 1.13:1 | 69 |
| 49 | gemma-7b-it | Poor | 66.10% | 0.96:1 | 69 |
| 50 | claude-3-haiku-20240307 | Good | 76.70% | 1.01:1 | 76 |
| 51 | Meta-Llama-3.1-405B-Instruct-Turbo | Good | 88.60% | 1.27:1 | 70 |
| 52 | SmolLM-1.7B-instruct | Poor | 39.97% | 0.56:1 | 71 |
| 53 | Mistral-NeMo-Minitron-8B-Instruct | Good | 70.40% | 1.07:1 | 66 |
| 54 | gpt-4o-2024-08-06 | Good | 88.70% | 1.20:1 | 74 |
| 55 | granite-3.0-2b-a800m-instruct | Poor | 50.16% | 0.70:1 | 72 |
| 56 | amazon.nova-pro-v1.0 | Good | 85.90% | 1.09:1 | 79 |
| 57 | amazon.nova-lite-v1.0 | Poor | 80.50% | 0.99:1 | 81 |
| 58 | Meta-Llama-3-70B-Instruct | Good | 82.00% | 1.05:1 | 78 |
| 59 | Starling-LM-7B-beta-GGUF-4bit | Poor | 63.90% | 0.91:1 | 70 |
| 60 | Smaug-72B-v0.1 | Poor | 77.15% | 0.98:1 | 79 |
| 61 | claude-3-5-haiku-20241022 | Poor | 79.63% | 0.98:1 | 81 |
| 62 | gpt-3.5-turbo | Poor | 70.00% | 0.84:1 | 83 |
| 63 | granite-3.0-8b-instruct | Poor | 65.82% | 0.79:1 | 83 |
| 64 | CodeLlama-7b-instruct-hf | Poor | 34.54% | 0.44:1 | 79 |
| 65 | Smaug-Llama-3-70B-Instruct | Good | 79.20% | 1.00:1 | 79 |
| 66 | Mixtral-8x7B-Instruct-v0.1 | Poor | 70.33% | 0.91:1 | 77 |
| 67 | jamba-instruct-preview | Poor | N/A | N/A | 74 |
| 68 | Liquid-40B | Good | 78.76% | 1.09:1 | 72 |
| 69 | Mixtral-8x22B-instruct-v0.1 | Good | 77.71% | 1.01:1 | 77 |
| 70 | SeaLLM-7B-v2 | Poor | 64.90% | 0.76:1 | 83 |
| 71 | Qwen2-72B-instruct | Good | 82.30% | 1.03:1 | 80 |
| 72 | Qwen1.5-14B-Chat | Poor | 68.52% | 0.85:1 | 81 |
| 73 | Yi-34B-Chat | Poor | 75.70% | 0.97:1 | 78 |
| 74 | Smaug-34B-v0.1 | Poor | 77.29% | 0.89:1 | 87 |
| 75 | c4ai-command-r-plus | Poor | 75.70% | 0.97:1 | 78 |
| 76 | mistral-small-latest | Poor | 72.20% | 0.88:1 | 82 |
| 77 | Qwen2-7B-Instruct | Poor | 70.50% | 0.87:1 | 81 |
| 78 | Qwen2.5-72B-instruct | Good | 86.80% | 1.10:1 | 79 |
| 79 | Mistral-7B-Instruct-v0.2-GGUF-4bit | Poor | N/A | N/A | 79 |
| 80 | Meta-Llama-3.1-70B-Instruct-Turbo | Good | 83.36% | 1.04:1 | 80 |
| 81 | K2-Chat | Poor | 63.50% | 0.77:1 | 83 |
| 82 | Phi-3-mini-4k-instruct | Poor | 68.80% | 0.83:1 | 83 |
| 83 | Starling-LM-7B-beta | Poor | 63.90% | 0.75:1 | 85 |
| 84 | OLMoE-1B-7B-0924-Instruct | Poor | N/A | N/A | 83 |
| 85 | Mistral-7B-Instruct-v0.2-GGUF-8bit | Poor | N/A | N/A | 83 |
| 86 | h2o-danube3-4b-chat | Poor | 54.74% | 0.67:1 | 82 |
| 87 | RakutenAI-7B-Instruct | Poor | 60.32% | 0.76:1 | 79 |
| 88 | Mistral-7B-instruct-v0.2 | Poor | 60.07% | 0.72:1 | 84 |
| 89 | granite-3.0-2b-instruct | Poor | 56.03% | 0.64:1 | 87 |
| 90 | jamba-1.5-mini | Poor | 69.70% | 0.80:1 | 87 |
| 91 | aya-23-35B | Poor | 58.20% | 0.68:1 | 86 |
| 92 | jamba-1.5-large | Poor | 80.00% | 0.93:1 | 88 |
| 93 | Phi-3-small-8k-instruct | Poor | 71.10% | 0.81:1 | 88 |
| 94 | Phi-3-small-128k-instruct | Poor | 75.30% | 0.87:1 | 87 |
| 95 | Qwen2.5-3B-Instruct | Poor | 64.40% | 0.75:1 | 86 |
| 96 | Llama-3.1-Nemotron-70B-instruct-hf | Poor | 83.51% | 0.95:1 | 88 |
| 97 | zephyr-7b-beta | Poor | 61.07% | 0.72:1 | 85 |
| 98 | Qwen2.5-7B-instruct | Poor | N/A | N/A | 84 |
| 99 | PowerMoE-3b | Poor | 42.80% | 0.48:1 | 90 |
| 100 | LongWriter-glm4-9b | Poor | 58.70% | 0.66:1 | 89 |
| 101 | Qwen2.5-32B-Instruct | Poor | 83.90% | 0.95:1 | 88 |
| 102 | Mistral-7B-Instruct-v0.1-GGUF-4bit | Poor | N/A | N/A | 83 |
| 103 | snowflake-arctic-instruct | Poor | 67.30% | 0.75:1 | 90 |
| 104 | Qwen2.5-7B-Instruct | Poor | 75.40% | 0.84:1 | 90 |
| 105 | palm-2-chat-bison | Poor | 78.30% | 0.92:1 | 85 |
| 106 | Mistral-7B-instruct-v0.1-GGUF-8bit | Poor | N/A | N/A | 86 |
| 107 | glm-4-9b-chat | Poor | 56.60% | 0.63:1 | 90 |
| 108 | Phi-3-medium-4k-instruct | Poor | 78.00% | 0.87:1 | 90 |
| 109 | aya-23-8B | Poor | 48.20% | 0.55:1 | 87 |
| 110 | Llama-3.1-70B-Instruct-Turbo | Poor | 86.00% | 0.95:1 | 91 |
| 111 | Mistral-7B-instruct-v0.3 | Poor | 81.84% | 0.89:1 | 90 |
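The "Performance vs Risk" ratio in the table appears to be the Performance percentage divided by the Risk Score, and the rating labels appear to follow cutoffs of 2.0 (Excellent) and 1.0 (Good). Neither rule is documented by the source; both are inferred from the table's own rows, as this sketch illustrates:

```python
# Sketch: reproducing the table's "Performance vs Risk" ratio and rating
# label. The formula (performance / risk score) and the cutoffs (>= 2.0
# Excellent, >= 1.0 Good, else Poor) are inferred from the rows above,
# not stated by the source.

def performance_vs_risk(performance_pct: float, risk_score: float) -> tuple[float, str]:
    """Return the performance-to-risk ratio and an inferred rating label."""
    ratio = performance_pct / risk_score
    if ratio >= 2.0:
        rating = "Excellent"
    elif ratio >= 1.0:
        rating = "Good"
    else:
        rating = "Poor"
    return round(ratio, 2), rating

# Example: gemini-1.5-pro-exp-0801 has 85.90% performance and risk score 31.
print(performance_vs_risk(85.90, 31))  # (2.77, 'Excellent')
```

Checked against a few rows, this reproduces the published values, e.g. 88.70% at risk 51 gives 1.74:1 (Good), matching claude-3-5-sonnet-20240620.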
What is Risk Score?
The risk score is the average of the risk scores in four categories: jailbreak susceptibility, bias potential, malware presence, and toxicity. The lower the score, the lower the risk.
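The averaging described above can be sketched as follows; the category values here are illustrative, not taken from the leaderboard:

```python
# Sketch of the risk score as described: an unweighted average of the four
# category risk scores. The inputs below are hypothetical examples.

def risk_score(jailbreak: float, bias: float, malware: float, toxicity: float) -> float:
    """Average the four category risk scores; lower means safer."""
    return (jailbreak + bias + malware + toxicity) / 4

print(risk_score(18, 40, 25, 41))  # 31.0
```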
How to read the Jailbreak, Bias, Malware, and Toxicity scores?
A Jailbreak score of 18% indicates that 18% of the jailbreak tests successfully breached the LLM.
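In other words, each category score is the fraction of its tests that succeeded against the model, expressed as a percentage. A minimal sketch, with hypothetical test outcomes:

```python
# Sketch: a category score (here, jailbreak) as the percentage of test
# prompts that successfully breached the model. Outcomes are hypothetical.

def jailbreak_score(results: list[bool]) -> float:
    """Percentage of jailbreak attempts that succeeded."""
    return 100 * sum(results) / len(results)

# 9 of 50 hypothetical attempts succeeded -> an 18% jailbreak score.
outcomes = [True] * 9 + [False] * 41
print(jailbreak_score(outcomes))  # 18.0
```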
Data Source: enkryptai.com