Inference Service Performance Report

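This report summarizes the performance of several inference services, based on 10 iterations per service and model. Lower 'Time (seconds)' values indicate better performance. Individual responses are linked in the 'Raw Data' table below. The summary statistics cover all 10 iterations, including iteration 3 of Baseten - DeepSeek-R1-0528, which ended in an error.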

Summary Statistics

Service                                         Mean (s)  Median (s)     Min (s)     Max (s)     Std (s)
Baseten - DeepSeek-R1-0528                       21.2427     20.0956     10.7770     30.7589      6.7810
Weave - DeepSeek-R1-0528                         78.2714     76.1066     41.3626    110.4436     21.1934
Fireworks - DeepSeek-R1-0528                     23.0581     22.0591     18.5315     33.8149      4.2530
Baseten - Llama-4-Scout-17B-16E-Instruct          8.9906      9.1114      7.3495     10.3948      1.0285
Weave - Llama-4-Scout-17B-16E-Instruct            7.6670      7.6812      6.8212      8.4229      0.4145
Fireworks - Llama-4-Scout                        17.7493     18.9464      9.0944     23.9553      4.1787
Baseten - Llama 3.3 70B Instruct (dedicated)     13.8940     12.9122      8.3801     25.3116      4.9435
Weave - Llama 3.3 70B Instruct                   11.1982     11.4788      7.2039     17.1535      2.9830
Fireworks - Llama 3.3 70B Instruct               12.5023     11.4204      7.9246     23.8761      4.8426
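
The summary row for a service can be recomputed from its raw timings. The following is a minimal sketch, not the script that produced this report: it uses only Python's standard library, the helper name summarize is hypothetical, and it assumes the 'std' column is the sample (n - 1) standard deviation, which appears consistent with the reported values. The timings are those listed for "Weave - Llama 3.3 70B Instruct" in the Raw Data table.

```python
import statistics

# Times in seconds for "Weave - Llama 3.3 70B Instruct",
# copied from the Raw Data table of this report.
times = [
    8.483661, 12.803517, 13.337005, 10.036623, 17.153497,
    7.203936, 7.878391, 10.865139, 12.127775, 12.092538,
]


def summarize(samples):
    """Return the five statistics shown in the summary table (hypothetical helper)."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        # Assumption: the report uses the sample (n - 1) standard deviation.
        "std": statistics.stdev(samples),
    }


summary = summarize(times)
for name, value in summary.items():
    print(f"{name:>6}: {value:.4f}")
```

Swapping statistics.stdev for statistics.pstdev would give the population standard deviation instead, which is noticeably smaller with only 10 samples.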

Raw Data (All Iterations)

Service                                       Iteration  Time (seconds)  Status   Response
Weave - DeepSeek-R1-0528                              1       77.607614  Success  View Response
Weave - DeepSeek-R1-0528                              2       65.789246  Success  View Response
Weave - DeepSeek-R1-0528                              3       74.605491  Success  View Response
Weave - DeepSeek-R1-0528                              4       67.431015  Success  View Response
Weave - DeepSeek-R1-0528                              5       91.453860  Success  View Response
Weave - DeepSeek-R1-0528                              6       94.830546  Success  View Response
Weave - DeepSeek-R1-0528                              7       41.362581  Success  View Response
Weave - DeepSeek-R1-0528                              8      100.883842  Success  View Response
Weave - DeepSeek-R1-0528                              9      110.443568  Success  View Response
Weave - DeepSeek-R1-0528                             10       58.306146  Success  View Response
Fireworks - DeepSeek-R1-0528                          1       25.528948  Success  View Response
Fireworks - DeepSeek-R1-0528                          2       19.957602  Success  View Response
Fireworks - DeepSeek-R1-0528                          3       21.371792  Success  View Response
Fireworks - DeepSeek-R1-0528                          4       20.826692  Success  View Response
Fireworks - DeepSeek-R1-0528                          5       21.503663  Success  View Response
Fireworks - DeepSeek-R1-0528                          6       23.655512  Success  View Response
Fireworks - DeepSeek-R1-0528                          7       33.814893  Success  View Response
Fireworks - DeepSeek-R1-0528                          8       22.614523  Success  View Response
Fireworks - DeepSeek-R1-0528                          9       18.531499  Success  View Response
Fireworks - DeepSeek-R1-0528                         10       22.776215  Success  View Response
Baseten - DeepSeek-R1-0528                            1       27.204021  Success  View Response
Baseten - DeepSeek-R1-0528                            2       18.402210  Success  View Response
Baseten - DeepSeek-R1-0528                            3       30.758941  Error: peer closed connection without sending complete message body (incomplete chunked read)  View Response
Baseten - DeepSeek-R1-0528                            4       19.241940  Success  View Response
Baseten - DeepSeek-R1-0528                            5       20.949310  Success  View Response
Baseten - DeepSeek-R1-0528                            6       26.965511  Success  View Response
Baseten - DeepSeek-R1-0528                            7       12.004754  Success  View Response
Baseten - DeepSeek-R1-0528                            8       18.515195  Success  View Response
Baseten - DeepSeek-R1-0528                            9       27.608304  Success  View Response
Baseten - DeepSeek-R1-0528                           10       10.776964  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                1        7.670444  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                2        7.680871  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                3        7.419958  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                4        7.974426  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                5        7.409096  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                6        7.843427  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                7        7.681495  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                8        8.422859  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                9        6.821188  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct               10        7.746435  Success  View Response
Fireworks - Llama-4-Scout                             1       15.914570  Success  View Response
Fireworks - Llama-4-Scout                             2       23.955314  Success  View Response
Fireworks - Llama-4-Scout                             3       20.343727  Success  View Response
Fireworks - Llama-4-Scout                             4       14.856477  Success  View Response
Fireworks - Llama-4-Scout                             5       20.549456  Success  View Response
Fireworks - Llama-4-Scout                             6       18.225799  Success  View Response
Fireworks - Llama-4-Scout                             7       19.666962  Success  View Response
Fireworks - Llama-4-Scout                             8       20.024921  Success  View Response
Fireworks - Llama-4-Scout                             9       14.860958  Success  View Response
Fireworks - Llama-4-Scout                            10        9.094438  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              1        7.349474  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              2        9.107785  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              3        9.913900  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              4        7.739148  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              5       10.394764  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              6        7.885246  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              7        9.589278  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              8        9.114995  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              9        9.897226  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct             10        8.914047  Success  View Response
Weave - Llama 3.3 70B Instruct                        1        8.483661  Success  View Response
Weave - Llama 3.3 70B Instruct                        2       12.803517  Success  View Response
Weave - Llama 3.3 70B Instruct                        3       13.337005  Success  View Response
Weave - Llama 3.3 70B Instruct                        4       10.036623  Success  View Response
Weave - Llama 3.3 70B Instruct                        5       17.153497  Success  View Response
Weave - Llama 3.3 70B Instruct                        6        7.203936  Success  View Response
Weave - Llama 3.3 70B Instruct                        7        7.878391  Success  View Response
Weave - Llama 3.3 70B Instruct                        8       10.865139  Success  View Response
Weave - Llama 3.3 70B Instruct                        9       12.127775  Success  View Response
Weave - Llama 3.3 70B Instruct                       10       12.092538  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    1        7.942493  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    2        7.924613  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    3        9.195469  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    4       11.269665  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    5       23.876138  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    6       17.388098  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    7       12.553384  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    8       11.571047  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    9       12.457366  Success  View Response
Fireworks - Llama 3.3 70B Instruct                   10       10.844386  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          1       14.991033  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          2       25.311611  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          3       16.725443  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          4       13.081828  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          5       12.742473  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          6        8.380108  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          7        8.982810  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          8        9.934564  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          9       12.436285  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)         10       16.353618  Success  View Response
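
Because the Status column records one failed iteration, it may be useful to see how much that run affects the averages. The sketch below is illustrative only and is not part of the original benchmark: it takes the "Baseten - DeepSeek-R1-0528" rows from the table above, shortens the long error text to "Error", and compares the mean over all iterations with the mean over successful iterations. The variable names are hypothetical.

```python
import statistics

# (iteration, time in seconds, status) for "Baseten - DeepSeek-R1-0528",
# copied from the Raw Data table. The full error message is shortened to "Error".
rows = [
    (1, 27.204021, "Success"),
    (2, 18.402210, "Success"),
    (3, 30.758941, "Error"),
    (4, 19.241940, "Success"),
    (5, 20.949310, "Success"),
    (6, 26.965511, "Success"),
    (7, 12.004754, "Success"),
    (8, 18.515195, "Success"),
    (9, 27.608304, "Success"),
    (10, 10.776964, "Success"),
]

# Timings for every iteration, and for successful iterations only.
all_times = [seconds for _, seconds, _ in rows]
ok_times = [seconds for _, seconds, status in rows if status == "Success"]

success_rate = len(ok_times) / len(rows)
mean_all = statistics.mean(all_times)
mean_ok = statistics.mean(ok_times)

print(f"success rate:        {success_rate:.0%}")
print(f"mean (all runs):     {mean_all:.4f} s")
print(f"mean (success only): {mean_ok:.4f} s")
```

Whether a failed run should count toward latency statistics is a reporting choice. The summary table in this report includes it, so the mean over all runs is the figure that matches the table.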