diff --git a/serverless/load-balancing/overview.mdx b/serverless/load-balancing/overview.mdx
index 746b819c..97fe5157 100644
--- a/serverless/load-balancing/overview.mdx
+++ b/serverless/load-balancing/overview.mdx
@@ -20,6 +20,17 @@ When you're ready to get started, follow this tutorial to learn how to [build an
 Or, if you're ready for a more advanced use case, you can jump straight into [building a vLLM load balancer](/serverless/load-balancing/vllm-worker).
 
+You can also watch this video for a brief overview of the concepts explained on this page:
+
+
+
 ## Key features
 
 - **Direct HTTP access**: Connect directly to worker HTTP servers, bypassing queue infrastructure for lower latency.