Introduction
Cline is an open-source AI coding agent for VS Code that can create files, edit code, run terminal commands, and more. It acts as an autonomous pair programmer that helps you build applications through natural-language conversations. Cline pairs exceptionally well with SaladCloud for several reasons:

- Per-hour pricing means no token-cost anxiety during extended agentic coding sessions
- OpenAI-compatible API works out of the box with any compatible LLM deployment
- Self-hosted models give you full control over your AI infrastructure
Prerequisites
Before getting started, make sure you have:

- A SaladCloud account with an active LLM deployment
- VS Code installed on your machine
- Basic familiarity with VS Code extensions
Step-by-Step Setup
Step 1: Deploy an LLM Recipe on SaladCloud
First, you need to deploy an OpenAI-compatible LLM server on SaladCloud.

- Go to the SaladCloud portal and create an account if you do not already have one.
- Create an organization or choose an existing one, then click “Deploy a container group”.
- Select an LLM recipe. The best fit for this use case is the Qwen3.5-35B-A3B (llama.cpp) recipe. On the recipe page, provide a name for your container group and deploy; everything else is preconfigured with recommended settings. If needed, you can still open Advanced Settings and adjust parameters or hardware.
- Once deployed, your endpoint will be live and serving an OpenAI-compatible API.
Recipe options on SaladCloud include:

- qwen3.5-35B-A3B — A powerful Mixture-of-Experts model optimized for instruction-following tasks, ideal for agentic use cases
- qwen3.5-9b-llama-cpp — Optimized for the Qwen3.5 9B model
- llama.cpp — Supports GGUF models
- sglang — High-performance inference
- vllm — Popular LLM serving framework
- ollama — Simple model management
- tgi — Hugging Face Text Generation Inference server
After deployment, note these details for Step 3:

- API endpoint URL (e.g., https://your-endpoint.salad.cloud)
- API key (from your SaladCloud account, if you enabled authentication)
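Before moving on, you can sanity-check the new endpoint directly. Below is a minimal sketch using only the Python standard library; the URL and key are placeholders, and it assumes the deployment exposes the standard OpenAI-compatible `/v1/models` route:

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the OpenAI-compatible model-listing URL from a base endpoint."""
    return base_url.rstrip("/") + "/v1/models"


def list_models(base_url: str, api_key: str = "") -> list:
    """Return the model IDs served by an OpenAI-compatible endpoint."""
    req = urllib.request.Request(models_url(base_url))
    if api_key:  # only needed if you enabled authentication on the deployment
        req.add_header("Salad-Api-Key", api_key)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return [m["id"] for m in json.load(resp).get("data", [])]


# Usage (placeholder endpoint -- substitute your real one):
# print(list_models("https://your-endpoint.salad.cloud"))
```

If the call returns your model ID, the server is live and Cline will be able to reach it.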
Step 2: Install Cline in VS Code
- Open VS Code and go to the Extensions view (Ctrl+Shift+X, or Cmd+Shift+X on macOS)
- Search for “Cline” or use the extension ID: saoudrizwan.claude-dev
- Click Install on the Cline extension
- Once installed, click the Cline icon in the sidebar to open the Cline panel
Step 3: Configure Cline to Use Your SaladCloud Endpoint
- In the Cline sidebar, click the gear icon (⚙️) to open settings
- Set API Provider to OpenAI Compatible
- Configure the following settings:
  - Base URL: Your SaladCloud endpoint (e.g., https://your-endpoint.salad.cloud/v1)
  - API Key: Required, but can be a dummy value if authentication is not enabled
  - Model ID: The model name from your deployment (e.g., qwen3.5-35b-a3b)
- If you enabled authentication on your SaladCloud deployment, add a custom header:
  - Header name: Salad-Api-Key
  - Header value: your Salad API key
For Cline’s Plan and Act modes, you have two options:

- Use a frontier model (such as Claude or GPT) for planning and your SaladCloud model for execution
- Use the same SaladCloud model for both modes to keep costs low. We found that even with a 35B model, performance was strong enough to handle both planning and acting effectively.
Step 4: Test the Connection
Start with a simple task to verify everything is working:

“Create a hello world Python script that prints ‘Hello from SaladCloud!’”

If Cline successfully creates and runs the file, your setup is complete!
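If Cline fails, it helps to rule out the endpoint itself by sending the same kind of chat-completions request Cline makes. A hedged sketch using the standard library, assuming the placeholder URL and model ID from Step 3:

```python
import json
import urllib.request

BASE_URL = "https://your-endpoint.salad.cloud"  # placeholder from Step 3
MODEL_ID = "qwen3.5-35b-a3b"                    # model ID from your deployment


def chat_payload(prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(prompt: str) -> str:
    """POST the prompt to the endpoint and return the assistant's reply."""
    req = urllib.request.Request(
        BASE_URL.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=100) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


# Usage (requires a live deployment):
# print(ask("Write a Python script that prints 'Hello from SaladCloud!'"))
```

If this call succeeds but Cline does not, the problem is in the extension settings rather than the deployment.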
Tips for Best Results
Use Incremental Prompting for Complex Tasks
When working with self-hosted models that may hit request timeouts (the current request timeout on SaladCloud is 100 seconds), tell Cline to work in small steps and save after each one. This approach:

- Reduces the chance of interrupted generations
- Makes it easier to recover from errors
- Helps maintain context across multiple interactions
- Simplifies debugging after each step
Set a Generous Context Window
Configure your LLM server with a context window of at least 16384 tokens, ideally 32768 or higher. A larger context window helps Cline:

- Remember previous interactions
- Understand larger codebases
- Maintain coherence across complex tasks
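How the context window is set depends on the serving framework. As a sketch for two of the recipes above (the model path and name are placeholders):

```shell
# llama.cpp: set the context window with --ctx-size (alias -c)
llama-server -m /models/your-model.gguf --ctx-size 32768

# vLLM equivalent: cap the model's context length
# vllm serve your-org/your-model --max-model-len 32768
```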
Model Recommendations
- Qwen3.5-35B-A3B: Excellent balance of capability and cost for agentic coding
This documentation page was created using Cline powered by an LLM running on SaladCloud.