Inference Service Performance Report

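This report summarizes the performance of different inference services, based on 10 iterations per service. Lower 'Time (seconds)' values indicate better performance. The summary statistics cover all 10 iterations of each service, including iterations that ended in an error. Individual responses are linked in the 'Raw Data' table below.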

Summary Statistics

Service                                         Mean (s)  Median (s)     Min (s)     Max (s)     Std (s)
Baseten - DeepSeek-R1-0528                       23.9749     27.6875     15.4525     30.7126      7.0924
Weave - DeepSeek-R1-0528                        372.4414    159.9337     48.6131   2501.7432    750.1057
Fireworks - DeepSeek-R1-0528                     20.1104     19.4037     17.1911     25.4899      2.5379
Baseten - Llama-4-Scout-17B-16E-Instruct         19.9917     13.5645      8.3556     51.0681     16.2172
Weave - Llama-4-Scout-17B-16E-Instruct           19.2151     14.6419      5.4005     40.5950     13.3889
Fireworks - Llama-4-Scout                        21.6216     18.4975     13.5578     36.0470      8.6606
Baseten - Llama 3.3 70B Instruct (dedicated)      7.4706      7.4180      5.6038      9.9364      1.2497
Weave - Llama 3.3 70B Instruct                   10.5572     10.6751      7.1099     12.7937      1.7103
Fireworks - Llama 3.3 70B Instruct                9.7568      8.5045      6.2911     16.9184      3.1645
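The summary columns can be reproduced from the raw timings. The following is a minimal sketch in Python, assuming the std column is the sample standard deviation (n - 1 denominator), which is consistent with the figures reported for the Fireworks DeepSeek-R1-0528 service. The function name `summarize` is hypothetical and is not taken from the tooling that produced this report.

```python
import statistics


def summarize(times):
    """Return mean, median, min, max, and sample standard deviation."""
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "min": min(times),
        "max": max(times),
        # statistics.stdev uses the n - 1 (sample) denominator.
        "std": statistics.stdev(times),
    }


# Raw times in seconds for the Fireworks DeepSeek-R1-0528 service,
# iterations 1 through 10, copied from the Raw Data table.
fireworks_deepseek = [
    19.041908, 18.074202, 18.899383, 18.690873, 20.037047,
    17.191129, 25.489910, 19.765477, 20.408438, 23.505495,
]

if __name__ == "__main__":
    for name, value in summarize(fireworks_deepseek).items():
        print(f"{name:>6}: {value:.4f}")
```

Swapping in another service's ten timings from the Raw Data table should give that service's row, under the same assumption about the standard deviation.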

Raw Data (All Iterations)

Service                                       Iteration  Time (seconds)  Status   Response
Weave - DeepSeek-R1-0528                              1      163.871312  Success  View Response
Weave - DeepSeek-R1-0528                              2      168.336908  Success  View Response
Weave - DeepSeek-R1-0528                              3      195.738950  Success  View Response
Weave - DeepSeek-R1-0528                              4     2501.743246  Success  View Response
Weave - DeepSeek-R1-0528                              5      204.398561  Success  View Response
Weave - DeepSeek-R1-0528                              6      155.996049  Success  View Response
Weave - DeepSeek-R1-0528                              7       51.224192  Success  View Response
Weave - DeepSeek-R1-0528                              8      106.402511  Success  View Response
Weave - DeepSeek-R1-0528                              9      128.089077  Success  View Response
Weave - DeepSeek-R1-0528                             10       48.613137  Success  View Response
Fireworks - DeepSeek-R1-0528                          1       19.041908  Success  View Response
Fireworks - DeepSeek-R1-0528                          2       18.074202  Success  View Response
Fireworks - DeepSeek-R1-0528                          3       18.899383  Success  View Response
Fireworks - DeepSeek-R1-0528                          4       18.690873  Success  View Response
Fireworks - DeepSeek-R1-0528                          5       20.037047  Success  View Response
Fireworks - DeepSeek-R1-0528                          6       17.191129  Success  View Response
Fireworks - DeepSeek-R1-0528                          7       25.489910  Success  View Response
Fireworks - DeepSeek-R1-0528                          8       19.765477  Success  View Response
Fireworks - DeepSeek-R1-0528                          9       20.408438  Success  View Response
Fireworks - DeepSeek-R1-0528                         10       23.505495  Success  View Response
Baseten - DeepSeek-R1-0528                            1       29.608570  Success  View Response
Baseten - DeepSeek-R1-0528                            2       15.452457  Success  View Response
Baseten - DeepSeek-R1-0528                            3       26.857682  Success  View Response
Baseten - DeepSeek-R1-0528                            4       15.991604  Success  View Response
Baseten - DeepSeek-R1-0528                            5       28.517257  Success  View Response
Baseten - DeepSeek-R1-0528                            6       16.443043  Success  View Response
Baseten - DeepSeek-R1-0528                            7       30.712609  Error: peer closed connection without sending complete message body (incomplete chunked read)  View Response
Baseten - DeepSeek-R1-0528                            8       30.551272  Error: peer closed connection without sending complete message body (incomplete chunked read)  View Response
Baseten - DeepSeek-R1-0528                            9       30.141255  Success  View Response
Baseten - DeepSeek-R1-0528                           10       15.473447  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                1        7.518848  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                2       35.711967  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                3       13.770505  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                4        9.554744  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                5        5.400502  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                6       15.513268  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                7       10.529130  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                8       16.056367  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct                9       37.501047  Success  View Response
Weave - Llama-4-Scout-17B-16E-Instruct               10       40.594955  Success  View Response
Fireworks - Llama-4-Scout                             1       14.125268  Success  View Response
Fireworks - Llama-4-Scout                             2       36.047022  Success  View Response
Fireworks - Llama-4-Scout                             3       14.027516  Success  View Response
Fireworks - Llama-4-Scout                             4       19.287712  Success  View Response
Fireworks - Llama-4-Scout                             5       13.557789  Success  View Response
Fireworks - Llama-4-Scout                             6       17.707228  Success  View Response
Fireworks - Llama-4-Scout                             7       34.481665  Success  View Response
Fireworks - Llama-4-Scout                             8       14.579795  Success  View Response
Fireworks - Llama-4-Scout                             9       28.479873  Success  View Response
Fireworks - Llama-4-Scout                            10       23.921763  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              1       17.627247  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              2        9.254239  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              3       12.364650  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              4       10.906053  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              5        8.380624  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              6       14.764319  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              7        8.355639  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              8       51.068084  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct              9       48.957368  Success  View Response
Baseten - Llama-4-Scout-17B-16E-Instruct             10       18.238565  Success  View Response
Weave - Llama 3.3 70B Instruct                        1       12.793663  Success  View Response
Weave - Llama 3.3 70B Instruct                        2        9.240220  Success  View Response
Weave - Llama 3.3 70B Instruct                        3       12.158395  Success  View Response
Weave - Llama 3.3 70B Instruct                        4       11.058180  Success  View Response
Weave - Llama 3.3 70B Instruct                        5        9.190049  Success  View Response
Weave - Llama 3.3 70B Instruct                        6        7.109860  Success  View Response
Weave - Llama 3.3 70B Instruct                        7       12.253823  Success  View Response
Weave - Llama 3.3 70B Instruct                        8       10.457484  Success  View Response
Weave - Llama 3.3 70B Instruct                        9       10.892732  Success  View Response
Weave - Llama 3.3 70B Instruct                       10       10.417541  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    1        8.616396  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    2       11.755364  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    3       16.918421  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    4        7.386170  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    5        8.276100  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    6        7.271913  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    7        8.392622  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    8       10.725021  Success  View Response
Fireworks - Llama 3.3 70B Instruct                    9        6.291070  Success  View Response
Fireworks - Llama 3.3 70B Instruct                   10       11.935140  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          1        9.936445  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          2        6.969886  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          3        6.795108  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          4        5.603817  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          5        8.350239  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          6        7.862653  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          7        8.285197  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          8        7.747568  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)          9        6.066811  Success  View Response
Baseten - Llama 3.3 70B Instruct (dedicated)         10        7.088359  Success  View Response
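
Two of the Baseten DeepSeek-R1-0528 iterations ended with a connection error, and their times are still counted in that service's summary row. As a hedged illustration of how successful and failed runs might be separated, the sketch below parses raw rows with a regular expression and keeps only the successful timings. The pattern assumes that fields are separated by whitespace and that every row ends with the literal link text "View Response"; `parse_row` and `successful_times` are hypothetical helper names, not part of the original tooling.

```python
import re
import statistics

# Service, iteration number, time in seconds, status text, trailing link text.
ROW = re.compile(
    r"^(?P<service>.+?)\s+(?P<iteration>\d+)\s+(?P<seconds>\d+\.\d+)\s+"
    r"(?P<status>.+?)\s+View Response$"
)


def parse_row(line):
    """Split one Raw Data row into typed fields."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    return {
        "service": match["service"],
        "iteration": int(match["iteration"]),
        "seconds": float(match["seconds"]),
        "ok": match["status"] == "Success",
    }


def successful_times(lines):
    """Return the timings of rows whose status is exactly 'Success'."""
    return [row["seconds"] for row in map(parse_row, lines) if row["ok"]]


ERROR = ("Error: peer closed connection without sending complete "
         "message body (incomplete chunked read)")

# The ten Baseten DeepSeek-R1-0528 rows from the Raw Data table.
rows = [
    "Baseten - DeepSeek-R1-0528 1 29.608570 Success View Response",
    "Baseten - DeepSeek-R1-0528 2 15.452457 Success View Response",
    "Baseten - DeepSeek-R1-0528 3 26.857682 Success View Response",
    "Baseten - DeepSeek-R1-0528 4 15.991604 Success View Response",
    "Baseten - DeepSeek-R1-0528 5 28.517257 Success View Response",
    "Baseten - DeepSeek-R1-0528 6 16.443043 Success View Response",
    f"Baseten - DeepSeek-R1-0528 7 30.712609 {ERROR} View Response",
    f"Baseten - DeepSeek-R1-0528 8 30.551272 {ERROR} View Response",
    "Baseten - DeepSeek-R1-0528 9 30.141255 Success View Response",
    "Baseten - DeepSeek-R1-0528 10 15.473447 Success View Response",
]

if __name__ == "__main__":
    times = successful_times(rows)
    print(f"successful runs: {len(times)} of {len(rows)}")
    print(f"mean of successful runs:   {statistics.mean(times):.4f} s")
    print(f"median of successful runs: {statistics.median(times):.4f} s")
```

Whether failed iterations belong in a latency summary is a judgment call. Reporting the success count next to the timing statistics keeps both views available.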