Use Continue with SaladCloud

Last Updated: May 18, 2026

Introduction

Continue is an open-source AI coding assistant for VS Code and JetBrains. Unlike pure agentic tools, Continue focuses on the full coding workflow: autocomplete, inline chat, codebase-aware Q&A, and multi-file edits. Supports any OpenAI-compatible endpoint with custom headers. Continue works with SaladCloud in two ways:

Salad AI Gateway - no infrastructure to deploy or manage. Sign up for access, point Continue at a single shared endpoint, and use your Salad API key directly. Currently in closed beta with monthly flat-rate access.
Self-hosted model - deploy your own SaladCloud container group, billed per hour, for full control over the model, hardware, and configuration. Still very easy to set up and use.

Prerequisites

Before getting started, make sure you have:

A SaladCloud account
VS Code or JetBrains IDE installed

Step-by-Step Setup

Step 1: Choose Your Backend

Salad AI Gateway
Self-Hosted on SaladCloud

Salad AI Gateway is the fastest way to get started - no container groups to deploy, no cold starts to wait for.

Sign up for early access at salad.com/ai-gateway.
Once approved, find your Salad API key in the portal.

Available models:

Model	Description
`qwen3.6-35b-a3b`	Qwen 3.6 35B-A3B - best for agentic tasks, coding, and complex reasoning
`qwen3.6-27b`	Qwen 3.6 27B - strong balance of capability and speed
`qwen3.5-9b`	Qwen 3.5 9B - fastest response times, suited for lighter tasks

First, deploy an OpenAI-compatible LLM server on SaladCloud.

Go to the SaladCloud portal and create an account if you do not already have one.
Create an organization or choose an existing one, then click “Deploy a container group”.
Select an LLM recipe. The Qwen3.6-35B-A3B (llama.cpp) recipe works well for coding assistance. On the recipe page, provide a name and deploy - the rest is preconfigured with recommended settings.
Once deployed, your endpoint will be live and serving an OpenAI-compatible API.

Available recipes:Ready-to-deploy recipes (best for less technical users):

qwen3.6-35B-A3B - A powerful Mixture of Experts model optimized for instruction-following tasks.
qwen3.5-9b-llama-cpp - Optimized for Qwen3.5 9B model.

Recipes for custom deployments (best for advanced users):

llama.cpp - Supports GGUF models
sglang - High-performance inference
vllm - Popular LLM serving framework
ollama - Simple model management
tgi - Hugging Face Text Generation Inference server

After deployment, note your:

API endpoint URL (e.g., https://your-endpoint.salad.cloud)
API key (from your SaladCloud organization settings, if you enabled authentication)

Step 2: Install Continue

VS Code:

Open the Extensions view (Ctrl+Shift+X or Cmd+Shift+X)
Search for Continue
Click Install on the Continue extension by Continue Dev
Once installed, click the Continue icon in the sidebar

JetBrains: Install the Continue plugin from the JetBrains Marketplace via Settings → Plugins → Marketplace.

Step 3: Configure Continue to Use Your SaladCloud Endpoint

Continue is configured via ~/.continue/config.yaml. Open it from the Continue panel by clicking the gear icon, or edit it directly. The configuration UI does not currently support all necessary fields, so you will need to add the model config manually in the YAML file.

Salad AI Gateway
Self-Hosted on SaladCloud

Add your AI Gateway model:

models:
  - name: Salad AI Gateway
    provider: openai
    model: qwen3.6-35b-a3b
    apiBase: https://ai.salad.cloud/v1
    apiKey: your-salad-api-key

No custom headers are needed - your Salad API key in apiKey is all that’s required.

Add your SaladCloud deployment as a model:

models:
  - name: SaladCloud Qwen 3.6
    provider: openai
    model: qwen3.6-35b-a3b
    apiBase: https://your-endpoint.salad.cloud/v1
    apiKey: dummy
    requestOptions:
      headers:
        Salad-Api-Key: your-salad-api-key

Replace https://your-endpoint.salad.cloud/v1 with your actual endpoint URL.If your deployment does not require authentication, omit the requestOptions section entirely and set apiKey to any non-empty string.To also use your SaladCloud model for autocomplete, add :

roles:
  - autocomplete

Step 4: Test the Connection

Reload VS Code (Ctrl+Shift+P → Reload Window), then open the Continue chat panel and ask:

“Summarize what this project does in one sentence.”

If Continue responds, your setup is complete.

Tips for Best Results

Set Context Window

Add contextLength to your model config to help Continue manage prompt sizes correctly:

models:
  - name: SaladCloud Qwen 3.6
    provider: openai
    model: qwen3.6-35b-a3b
    apiBase: https://your-endpoint.salad.cloud/v1
    apiKey: dummy
    roles:
      - chat
      - autocomplete
    contextLength: 262144
    requestOptions:
      headers:
        Salad-Api-Key: your-salad-api-key

The Qwen 3.6-35B-A3B recipe supports up to 262,144 tokens.

Handle Autocomplete Timeouts

SaladCloud has a 100-second request timeout. Autocomplete requests are short and typically complete well within this limit, but for large context windows, consider setting a shorter maxTokens for autocomplete to reduce latency:

tabAutocompleteModel:
  name: SaladCloud Qwen 3.6
  provider: openai
  model: qwen3.6-35b-a3b
  apiBase: https://your-endpoint.salad.cloud/v1
  apiKey: dummy
  defaultCompletionOptions:
    maxTokens: 512
  requestOptions:
    headers:
      Salad-Api-Key: your-salad-api-key

Model Recommendations

Qwen 3.6-35B-A3B: Best for chat, codebase Q&A, and multi-file edits
Qwen 3.5-9B: Fast enough for autocomplete; good balance of latency and quality

Documentation Index

​Introduction

​Prerequisites

​Step-by-Step Setup

​Step 1: Choose Your Backend

​Step 2: Install Continue

​Step 3: Configure Continue to Use Your SaladCloud Endpoint

​Step 4: Test the Connection

​Tips for Best Results

​Set Context Window

​Handle Autocomplete Timeouts

​Model Recommendations