Inference Service Performance Report

This report summarizes the response times of three inference services (Baseten, Weave, and Fireworks) across three models, based on 10 iterations per service and model. Lower 'Time (seconds)' values indicate better performance. Individual responses are linked in the 'Raw Data' table below.
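The report does not say how each iteration was timed. As a rough sketch only, a loop like the following would produce rows with the same columns as the raw data (iteration, time in seconds, status, response). The names `time_service` and `fake_service` are hypothetical, and `fake_service` is a stand-in for a real inference request, not any provider's API.

```python
import time
from typing import Callable, Dict, List


def time_service(call: Callable[[], str], iterations: int = 10) -> List[Dict[str, object]]:
    """Time `call` repeatedly and record one row per iteration."""
    rows = []
    for i in range(1, iterations + 1):
        start = time.perf_counter()
        try:
            response = call()
            status = "Success"
        except Exception as exc:
            # Record the failure instead of aborting the whole run.
            response = str(exc)
            status = "Error"
        elapsed = time.perf_counter() - start
        rows.append(
            {"iteration": i, "seconds": elapsed, "status": status, "response": response}
        )
    return rows


def fake_service() -> str:
    # Hypothetical stand-in for a real inference request.
    time.sleep(0.01)
    return "ok"


rows = time_service(fake_service)
for row in rows:
    print(f"{row['iteration']:>2} {row['seconds']:.6f} {row['status']}")
```

Wall-clock timing of this kind includes network latency and queueing as well as model compute, which may explain some of the spread between iterations.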

Summary Statistics

Service | Mean (s) | Median (s) | Min (s) | Max (s) | Std (s)
Baseten - DeepSeek-R1-0528 | 2.7249 | 2.7107 | 2.3285 | 3.8161 | 0.4150
Weave - DeepSeek-R1-0528 | 9.5287 | 7.3571 | 4.0287 | 32.1302 | 8.1997
Fireworks - DeepSeek-R1-0528 | 7.5948 | 5.8526 | 3.6374 | 22.4016 | 5.5851
Baseten - Llama-4-Scout-17B-16E-Instruct | 2.0517 | 1.8564 | 1.3358 | 3.2320 | 0.6922
Weave - Llama-4-Scout-17B-16E-Instruct | 1.6158 | 1.5887 | 1.2805 | 1.9836 | 0.2446
Fireworks - Llama-4-Scout | 5.8559 | 4.7883 | 3.2109 | 11.9440 | 2.9255
Baseten - Llama 3.3 70B Instruct (dedicated) | 4.3471 | 3.5689 | 2.8424 | 8.5987 | 2.1333
Weave - Llama 3.3 70B Instruct | 3.9393 | 3.8101 | 2.6556 | 6.5975 | 1.1653
Fireworks - Llama 3.3 70B Instruct | 3.4509 | 3.2293 | 2.9148 | 4.2511 | 0.5377
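
Each summary row can be recomputed from the ten raw timings for that service. The sketch below does this for Baseten - DeepSeek-R1-0528 using Python's standard `statistics` module. It assumes the std column is the sample standard deviation (n - 1 in the denominator); the report does not state this, but that choice appears to reproduce the listed value.

```python
import statistics

# Baseten - DeepSeek-R1-0528 timings in seconds, copied from the raw data table.
times = [
    2.603456, 2.328475, 2.730770, 2.450491, 3.816073,
    2.393692, 2.697574, 2.723776, 2.759117, 2.745591,
]

summary = {
    "mean": statistics.mean(times),
    "median": statistics.median(times),
    "min": min(times),
    "max": max(times),
    # Sample standard deviation (n - 1 in the denominator); an assumption
    # about how the report's std column was computed.
    "std": statistics.stdev(times),
}

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With only 10 iterations, a single slow request moves the mean and std noticeably, so the median is likely the more robust column for comparing services.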

Raw Data (All Iterations)

Service | Iteration | Time (seconds) | Status | Response
Weave - DeepSeek-R1-0528 | 1 | 7.965689 | Success | View Response
Weave - DeepSeek-R1-0528 | 2 | 6.748478 | Success | View Response
Weave - DeepSeek-R1-0528 | 3 | 5.937550 | Success | View Response
Weave - DeepSeek-R1-0528 | 4 | 8.037296 | Success | View Response
Weave - DeepSeek-R1-0528 | 5 | 9.158671 | Success | View Response
Weave - DeepSeek-R1-0528 | 6 | 32.130209 | Success | View Response
Weave - DeepSeek-R1-0528 | 7 | 4.028681 | Success | View Response
Weave - DeepSeek-R1-0528 | 8 | 4.831923 | Success | View Response
Weave - DeepSeek-R1-0528 | 9 | 10.788374 | Success | View Response
Weave - DeepSeek-R1-0528 | 10 | 5.660377 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 1 | 5.604734 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 2 | 6.560949 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 3 | 10.809269 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 4 | 4.336607 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 5 | 3.637422 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 6 | 4.295240 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 7 | 6.100389 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 8 | 5.033307 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 9 | 7.168240 | Success | View Response
Fireworks - DeepSeek-R1-0528 | 10 | 22.401623 | Success | View Response
Baseten - DeepSeek-R1-0528 | 1 | 2.603456 | Success | View Response
Baseten - DeepSeek-R1-0528 | 2 | 2.328475 | Success | View Response
Baseten - DeepSeek-R1-0528 | 3 | 2.730770 | Success | View Response
Baseten - DeepSeek-R1-0528 | 4 | 2.450491 | Success | View Response
Baseten - DeepSeek-R1-0528 | 5 | 3.816073 | Success | View Response
Baseten - DeepSeek-R1-0528 | 6 | 2.393692 | Success | View Response
Baseten - DeepSeek-R1-0528 | 7 | 2.697574 | Success | View Response
Baseten - DeepSeek-R1-0528 | 8 | 2.723776 | Success | View Response
Baseten - DeepSeek-R1-0528 | 9 | 2.759117 | Success | View Response
Baseten - DeepSeek-R1-0528 | 10 | 2.745591 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 1 | 1.774189 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 2 | 1.552761 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 3 | 1.930607 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 4 | 1.753550 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 5 | 1.554644 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 6 | 1.326332 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 7 | 1.622758 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 8 | 1.378787 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 9 | 1.983553 | Success | View Response
Weave - Llama-4-Scout-17B-16E-Instruct | 10 | 1.280508 | Success | View Response
Fireworks - Llama-4-Scout | 1 | 4.220811 | Success | View Response
Fireworks - Llama-4-Scout | 2 | 4.762557 | Success | View Response
Fireworks - Llama-4-Scout | 3 | 5.826673 | Success | View Response
Fireworks - Llama-4-Scout | 4 | 6.214291 | Success | View Response
Fireworks - Llama-4-Scout | 5 | 11.943955 | Success | View Response
Fireworks - Llama-4-Scout | 6 | 4.814047 | Success | View Response
Fireworks - Llama-4-Scout | 7 | 3.210945 | Success | View Response
Fireworks - Llama-4-Scout | 8 | 10.160783 | Success | View Response
Fireworks - Llama-4-Scout | 9 | 3.502886 | Success | View Response
Fireworks - Llama-4-Scout | 10 | 3.902426 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 1 | 3.232032 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 2 | 1.695624 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 3 | 1.364571 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 4 | 2.028640 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 5 | 1.600781 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 6 | 3.111151 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 7 | 1.575692 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 8 | 2.555748 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 9 | 1.335782 | Success | View Response
Baseten - Llama-4-Scout-17B-16E-Instruct | 10 | 2.017201 | Success | View Response
Weave - Llama 3.3 70B Instruct | 1 | 3.967659 | Success | View Response
Weave - Llama 3.3 70B Instruct | 2 | 4.741421 | Success | View Response
Weave - Llama 3.3 70B Instruct | 3 | 3.652572 | Success | View Response
Weave - Llama 3.3 70B Instruct | 4 | 6.597540 | Success | View Response
Weave - Llama 3.3 70B Instruct | 5 | 4.024214 | Success | View Response
Weave - Llama 3.3 70B Instruct | 6 | 2.655629 | Success | View Response
Weave - Llama 3.3 70B Instruct | 7 | 4.547329 | Success | View Response
Weave - Llama 3.3 70B Instruct | 8 | 2.948005 | Success | View Response
Weave - Llama 3.3 70B Instruct | 9 | 3.364616 | Success | View Response
Weave - Llama 3.3 70B Instruct | 10 | 2.894486 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 1 | 4.251075 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 2 | 3.031643 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 3 | 3.659689 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 4 | 4.024755 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 5 | 2.932533 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 6 | 3.214019 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 7 | 4.210537 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 8 | 3.244632 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 9 | 2.914787 | Success | View Response
Fireworks - Llama 3.3 70B Instruct | 10 | 3.024999 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 1 | 8.598735 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 2 | 8.042381 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 3 | 3.970380 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 4 | 2.842442 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 5 | 3.065672 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 6 | 3.796519 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 7 | 2.944816 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 8 | 3.375656 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 9 | 3.762066 | Success | View Response
Baseten - Llama 3.3 70B Instruct (dedicated) | 10 | 3.072570 | Success | View Response