VetLLM Leaderboard

VetLLM evaluates LLMs on veterinary and medical programming tasks.

[Bar chart: Average Quality Score (axis 0 to 5) by model. Mistral-7B-Instruct-v0.2: 3.17; Llama-3.1-8B-Instruct: 3.07; openchat-3.5-1210: 2.97; vetllm-mistral-7b-merged-pmc2: 2.9; gemma-7b: 2.41]

#	Model	Description	Average Quality Score (0 to 5)
1	mistralai/Mistral-7B-Instruct-v0.2	Instruction-tuned Mistral 7B model	3.17
2	meta-llama/Llama-3.1-8B-Instruct	Instruction-tuned LLaMA 3.1 (8B)	3.07
3	openchat/openchat-3.5-1210	Strong 7B conversational model	2.97
4	huang342/vetllm-mistral-7b-merged-pmc2	Fine-tuned Mistral 7B for veterinary tasks	2.9
5	google/gemma-7b	Official 7B model from Google's Gemma family	2.41
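
To make the ranking easier to check, here is a minimal sketch that re-derives the order from the scores in the table and reports each model's gap to the leader. The scores are copied from the table; the `SCORES` dictionary and the `rank_models` helper are illustrative names, not part of any VetLLM tooling.

```python
# Scores copied from the leaderboard table above.
# The dictionary layout and the function name are illustrative assumptions.
SCORES = {
    "mistralai/Mistral-7B-Instruct-v0.2": 3.17,
    "meta-llama/Llama-3.1-8B-Instruct": 3.07,
    "openchat/openchat-3.5-1210": 2.97,
    "huang342/vetllm-mistral-7b-merged-pmc2": 2.9,
    "google/gemma-7b": 2.41,
}


def rank_models(scores):
    """Return (rank, model, score, gap_to_leader) tuples, best score first."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best = ordered[0][1]
    return [
        (rank, model, score, round(best - score, 2))
        for rank, (model, score) in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    # Print one tab-separated row per model, mirroring the table layout.
    for rank, model, score, gap in rank_models(SCORES):
        print(f"{rank}\t{model}\t{score:.2f}\t{gap:.2f}")
```

On these numbers, the veterinary fine-tune (vetllm-mistral-7b-merged-pmc2) sits 0.27 points behind its instruction-tuned Mistral counterpart, and the spread between first and last place is 0.76 points.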