Introduction
Kilo Code is an open-source AI coding agent for VS Code (and JetBrains) that can create files, edit code, run terminal commands, and work across your entire codebase through natural language. It ships with specialized modes - Architect, Coder, Debugger - and supports MCP servers for extended tool capabilities. Kilo Code pairs well with SaladCloud because:- Custom header support - Kilo Code has a dedicated Custom Headers field, so you can pass
Salad-Api-Keyfor authenticated SaladCloud deployments without any workarounds - OpenAI-compatible API - built-in “OpenAI Compatible” provider option works directly with any SaladCloud endpoint
- Per-hour pricing - no per-token cost anxiety during extended agentic coding sessions
Prerequisites
Before getting started, make sure you have:- A SaladCloud account
- VS Code installed on your machine
Step-by-Step Setup
Step 1: Deploy an LLM Recipe on SaladCloud
First, deploy an OpenAI-compatible LLM server on SaladCloud.- Go to the SaladCloud portal and create an account if you do not already have one.
- Create an organization or choose an existing one, then click “Deploy a container group”.
- Select an LLM recipe. The best fit for agentic coding is the Qwen3.5-35B-A3B (llama.cpp) recipe. On the recipe page, provide a name and deploy - the rest is preconfigured with recommended settings.
- Once deployed, your endpoint will be live and serving an OpenAI-compatible API.
- qwen3.5-35B-A3B - A powerful Mixture of Experts model optimized for instruction-following tasks, ideal for agentic use cases.
- qwen3.5-9b-llama-cpp - Optimized for Qwen3.5 9B model.
- llama.cpp - Supports GGUF models
- sglang - High-performance inference
- vllm - Popular LLM serving framework
- ollama - Simple model management
- tgi - Hugging Face Text Generation Inference server
- API endpoint URL (e.g.,
https://your-endpoint.salad.cloud) - API key (from your SaladCloud organization settings, if you enabled authentication)
Step 2: Install Kilo Code in VS Code
- Open VS Code and go to the Extensions view (
Ctrl+Shift+XorCmd+Shift+Xon macOS) - Search for Kilo Code
- Click Install on the Kilo Code extension by Kilo-Org
- Once installed, click the Kilo Code icon in the sidebar to open the panel
Step 3: Configure Kilo Code to Use Your SaladCloud Endpoint
- In the Kilo Code panel, click the gear icon to open provider settings
- Click Provider and under
Custom Providerclick Connect - Set Provider ID and
Display Nameto something likesaladcloud - Set Base URL to your SaladCloud endpoint:
https://your-endpoint.salad.cloud/v1 - Set API Key to any non-empty string (e.g.,
dummy) - authentication is handled via a custom header below - Set Model ID to the model name from your deployment (e.g.,
qwen3.5-35b-a3b) - If your deployment has authentication enabled, expand Custom Headers and add:
- Header name:
Salad-Api-Key - Header value: your SaladCloud API key
- Header name:
Step 4: Test the Connection
Start with a simple task to verify everything is working:“Create a hello world Python script that prints ‘Hello from SaladCloud!’”If Kilo Code successfully creates and runs the file, your setup is complete.
Tips for Best Results
Choose between plan/debug/code/ask modes based on your task:
- Plan: Best for high-level design and architecture tasks
- Debug: Ideal for fixing bugs and improving existing code
- Code: Great for generating new code and files
- Ask: Use for quick questions and explanations about your codebase
Use Incremental Prompting for Complex Tasks
SaladCloud has a 100-second request timeout. For large multi-file tasks, break them into smaller steps and instruct Kilo Code to save after each one:“Important: work incrementally. Create and save each file one at a time before moving on to the next.”
Model Recommendations
- Qwen 3.5-35B-A3B: Best balance of capability and cost for agentic coding tasks
- Qwen 3.5-9B: Suitable for simpler tasks; may struggle with complex multi-file edits