Skip to main content
Last Updated: April 1, 2026

Introduction

Kilo Code is an open-source AI coding agent for VS Code (and JetBrains) that can create files, edit code, run terminal commands, and work across your entire codebase through natural language. It ships with specialized modes - Architect, Coder, Debugger - and supports MCP servers for extended tool capabilities. Kilo Code pairs well with SaladCloud because:
  • Custom header support - Kilo Code has a dedicated Custom Headers field, so you can pass Salad-Api-Key for authenticated SaladCloud deployments without any workarounds
  • OpenAI-compatible API - built-in “OpenAI Compatible” provider option works directly with any SaladCloud endpoint
  • Per-hour pricing - no per-token cost anxiety during extended agentic coding sessions

Prerequisites

Before getting started, make sure you have:

Step-by-Step Setup

Step 1: Deploy an LLM Recipe on SaladCloud

First, deploy an OpenAI-compatible LLM server on SaladCloud.
  • Go to the SaladCloud portal and create an account if you do not already have one.
  • Create an organization or choose an existing one, then click “Deploy a container group”.
  • Select an LLM recipe. The best fit for agentic coding is the Qwen3.5-35B-A3B (llama.cpp) recipe. On the recipe page, provide a name and deploy - the rest is preconfigured with recommended settings.
  • Once deployed, your endpoint will be live and serving an OpenAI-compatible API.
Available recipes: Ready-to-deploy recipes (best for less technical users):
  • qwen3.5-35B-A3B - A powerful Mixture of Experts model optimized for instruction-following tasks, ideal for agentic use cases.
  • qwen3.5-9b-llama-cpp - Optimized for Qwen3.5 9B model.
Recipes for custom deployments (best for advanced users):
  • llama.cpp - Supports GGUF models
  • sglang - High-performance inference
  • vllm - Popular LLM serving framework
  • ollama - Simple model management
  • tgi - Hugging Face Text Generation Inference server
After deployment, note your:
  • API endpoint URL (e.g., https://your-endpoint.salad.cloud)
  • API key (from your SaladCloud organization settings, if you enabled authentication)

Step 2: Install Kilo Code in VS Code

  1. Open VS Code and go to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X on macOS)
  2. Search for Kilo Code
  3. Click Install on the Kilo Code extension by Kilo-Org
  4. Once installed, click the Kilo Code icon in the sidebar to open the panel

Step 3: Configure Kilo Code to Use Your SaladCloud Endpoint

  1. In the Kilo Code panel, click the gear icon to open provider settings
  2. Click Provider and under Custom Provider click Connect
  3. Set Provider ID and Display Name to something like saladcloud
  4. Set Base URL to your SaladCloud endpoint: https://your-endpoint.salad.cloud/v1
  5. Set API Key to any non-empty string (e.g., dummy) - authentication is handled via a custom header below
  6. Set Model ID to the model name from your deployment (e.g., qwen3.5-35b-a3b)
  7. If your deployment has authentication enabled, expand Custom Headers and add:
    • Header name: Salad-Api-Key
    • Header value: your SaladCloud API key

Step 4: Test the Connection

Start with a simple task to verify everything is working:
“Create a hello world Python script that prints ‘Hello from SaladCloud!’”
If Kilo Code successfully creates and runs the file, your setup is complete.

Tips for Best Results

Choose between plan/debug/code/ask modes based on your task:

  • Plan: Best for high-level design and architecture tasks
  • Debug: Ideal for fixing bugs and improving existing code
  • Code: Great for generating new code and files
  • Ask: Use for quick questions and explanations about your codebase

Use Incremental Prompting for Complex Tasks

SaladCloud has a 100-second request timeout. For large multi-file tasks, break them into smaller steps and instruct Kilo Code to save after each one:
“Important: work incrementally. Create and save each file one at a time before moving on to the next.”

Model Recommendations

  • Qwen 3.5-35B-A3B: Best balance of capability and cost for agentic coding tasks
  • Qwen 3.5-9B: Suitable for simpler tasks; may struggle with complex multi-file edits