Discussion about this post

Are

This was a very informative post!

walpurgisnacht

Hey Alex, thanks for the post! One question:

Do you know the best practices for scaling and deploying ONNXRuntime? I recently tried scaling it by using Ray Serve to automatically create replicas of my service, but despite allocating a fixed number of CPU cores to each replica, they all suffer from contention. Is this the wrong approach, and should I instead run one deployment per machine? On CPU, it seems to use every core it can access, even when limited or pinned to specific CPU cores.
