Documentation

TukiCode Docs

The local-first AI terminal assistant for Windows developers. Learn how to install, configure, and master TukiCode.

Getting Started

TukiCode runs on Windows, macOS, and Linux. Install in seconds with a single command.

Installation

Windows — PowerShell 5.1+

PowerShell
PS > iwr https://tukicode.site/api/install.ps1 | iex

macOS & Linux — bash / zsh

Terminal
$ curl -fsSL https://tukicode.site/api/install.sh | bash

Installs tuki to ~/.local/bin and updates your shell profile. On macOS, if the binary is blocked on first run: xattr -d com.apple.quarantine ~/.local/bin/tuki

Configuration

After installation, you need to configure which AI model TukiCode will use. Run the following command to start the setup wizard:

tuki — setup
$ tuki config --setup

This will launch an interactive configuration wizard where you can:

  1. Select your AI provider: Choose between Ollama, OpenRouter, Anthropic, or Google (Gemini).
  2. Enter the model name: Type the specific model identifier (e.g., tencent/hy3-preview:free).
  3. Provide your API key: If you selected a cloud provider, paste your API key securely. TukiCode encrypts and stores this key locally.
  4. Configure Advanced Settings: (Optional) Adjust the context window size and max tokens limit depending on your chosen model's capabilities.

For OpenRouter, you can use a single API key for all supported models, including free ones.

TukiCode is designed to work with models that support native tool-calling. This is required for proper agent behavior, including file operations, command execution, and multi-step reasoning.

Recommended free models (with tool-calling support):

  • tencent/hy3-preview:free (recommended)
  • moonshotai/kimi-k2.5
  • zhipu/glm-4.6
  • deepseek/deepseek-chat-v3.2

We strongly recommend using: tencent/hy3-preview:free for the best balance between performance, reliability, and tool execution in free-tier environments.

Quick Start

After installation and configuration, launch TukiCode from any terminal:

bash
$ tuki chat

Working Modes

TukiCode adapts to your workflow through three distinct operating modes. Each one routes your input through a different internal pipeline, giving you full control over autonomy and speed.

Mode Comparison

Mode Pipeline Confirmation Best For
Chat AgentLoop directly Per tool (based on autonomy) Questions, quick fixes, explanations
Plan Planner → show steps → confirm → Executor Always asks before executing Multi-step tasks you want to review first
Build Planner → show steps → Executor No confirmation — runs immediately Full project scaffolding, autonomous builds

Chat Mode (Default)

The standard conversational interface. Your message goes directly into the ReAct loop: the agent thinks, calls tools, observes results, and repeats until it has a complete answer. No structured plan is created — the model decides each action on the fly based on what it discovers.

Best for: asking questions, getting code explanations, or doing quick isolated tasks.

  • Entry point: AgentLoop.run_turn()
  • Planning: None (model reasons internally via <thinking> tags)
  • Confirmation: Controlled by your /autonomy level

Plan Mode

When you have a complex task, Plan Mode uses a dedicated Planner module to break it into a precise, numbered list of atomic steps before any code is written. You review and approve the plan, and only then does the Executor run each step.

Example: "Refactor the authentication module to use JWT."

  1. Your message is sent to Planner.generate_plan(), which calls the LLM and receives a structured JSON array of steps.
  2. The plan is displayed in full in the chat.
  3. The agent pauses and asks: "Do you want to execute this plan? (y/n)"
  4. If confirmed, Executor.execute_plan() runs each step sequentially via the AgentLoop, with automatic retries and model fallback on failure.

(Demonstration of Plan Mode creating and executing steps)

Build Mode

Build Mode is designed for autonomous, end-to-end project generation. It follows the same Planner pipeline as Plan Mode, but skips the confirmation step — the agent shows you the plan and begins executing immediately.

It also supports plan resumption: if a previous build was interrupted, Build Mode picks up from the last pending step automatically.

  1. If there are pending steps in planner_state.json (from a previous session), execution resumes from where it left off.
  2. If there is no existing plan, Planner.generate_plan() is called automatically. The generated plan is shown in the chat before execution starts.
  3. Executor.execute_plan() runs immediately — no confirmation is asked.

Build vs Plan: Use Plan when you want to review and approve the strategy before anything runs. Use Build when you trust the agent and want to go as fast as possible — you still see the plan, but execution starts right away.

Core Features

Understand the key capabilities that make TukiCode unique.

Three-Tier Autonomy System

TukiCode gives you total control over the agent's decision-making process. Adjust the autonomy level with /autonomy:

Low — Maximum Safety

The agent is "on a leash." It pauses and asks for your explicit confirmation for every single action—reading files, searching, or applying fixes.

Medium — Balanced

The agent can explore your files and list directories autonomously to gather context. It stops and waits for approval before writing code, deleting files, or executing shell commands.

High — Unstoppable Speed

Say "Yes" once per turn. After your first approval, TukiCode executes all necessary steps in its reasoning loop autonomously until it reaches the final solution.

Interactive Project Explorer

TukiCode includes a live directory tree in the left panel of the terminal interface:

  • Visual Context: Always see exactly where you are in your project structure.
  • Smart Selection: Clicking any file in the tree instantly tells the agent to read and analyze it—the fastest way to feed context without typing paths.

Non-Touch Git Policy

Your project's history belongs to you. TukiCode is hard-coded to ignore any .git related commands. It will never initialize, commit, or push changes. This ensures your version control remains clean and strictly manual.

AI Models & Hardware Requirements

A detailed, honest guide to choosing the right AI models and understanding the hardware required to run TukiCode.

Hardware Requirements

Using TukiCode with Ollama (local)

Recommended Minimum:

  • CPU: 8 modern cores (Intel i7 10th gen+ / AMD Ryzen 5 5000+)
  • RAM: 16GB
  • GPU: NVIDIA RTX 3060 (12GB VRAM) or equivalent
  • Storage: 20GB free space for models

Optimal:

  • CPU: Intel i9 / AMD Ryzen 9 or Apple Silicon
  • RAM: 32GB+
  • GPU: NVIDIA RTX 3080+ / RTX 4070+ or Apple M1/M2/M3
  • Storage: NVMe SSD with 50GB free space

Apple Silicon (M1/M2/M3/M4):

Special mention — Apple Silicon's unified memory is ideal for Ollama. An M1 with 16GB can run qwen2.5-coder:14b at ~30-40 tokens/second. An M3 with 16GB is even better. It is probably the best value option to use TukiCode with Ollama.

Verified Compatible GPUs:

  • NVIDIA RTX 3060 (12GB) [YES] — fluid with qwen2.5-coder:14b
  • NVIDIA RTX 2060 (8GB) [YES] — fluid with qwen2.5-coder:7b
  • NVIDIA RTX 4070 (12GB) [YES] — fluid with qwen2.5-coder:32b
  • AMD RX 6700 XT (12GB) [YES] — compatible with ROCm
  • Intel Arc A770 (16GB) [WARNING] — experimental support
  • NVIDIA GTX 1080 (8GB) [WARNING] — works but slow
  • GPUs with less than 6GB VRAM [NO] — not recommended for Ollama

My PC doesn't have a dedicated GPU?

If your machine does not have a dedicated GPU or has less than 6GB VRAM, Ollama will run on CPU and will be very slow (2-5 tokens/second). In that case, OpenRouter with a paid model like claude-haiku-3-5 is the most practical option — fast, reliable, and almost zero cost.

Using TukiCode with OpenRouter (cloud)

  • Any computer with internet connection
  • No GPU required
  • Minimum RAM: 4GB (to run TukiCode itself)
  • Python 3.10+

Which model should I use?

Situation Recommendation
I have a GPU with 6GB+ VRAM Ollama with qwen2.5-coder:7b or 14b
I have a Mac with Apple Silicon Ollama with qwen2.5-coder:14b
I don't have a GPU but want free options OpenRouter free models (with limitations)
I want the best possible experience OpenRouter claude-haiku-3-5 (< $0.20/session)
I work in an enterprise / need strict privacy Local Ollama is mandatory
I'm a student with a basic PC OpenRouter free + patience during peak hours

Commands in chat

TukiCode uses a slash-command system to manage your AI experience.

/model — Intelligence Switching

The most powerful command in TukiCode. Switch between AI models:

  • Without arguments: Opens a semi-transparent modal to select from Ollama (local), Gemini, Claude, or OpenRouter models.
  • Smart Detection: Type /model gemini-1.5-pro and the agent automatically configures itself.
  • Auto-Config: If a cloud model requires an API key you haven't provided, TukiCode pops up a secure input box to save it permanently.
  • History: TukiCode remembers every model you use, keeping your favorites at the top of the list.
tuki — /model
$ /model gemini-1.5-pro [api-key]

/autonomy — Control the Agent

Toggle how much TukiCode asks for permission. Use /autonomy high for quick iterations or /autonomy low when you want to review every single line the agent reads or writes.

/risk — Sensitivity Management

Adjust how the agent classifies the "danger" of its tools:

  • Low Risk: Standard operations.
  • Medium/High Risk: System-altering commands. TukiCode uses this setting to determine when to trigger security warnings based on your current autonomy level.

/copy — Export Your Code

Found the perfect solution? Type /copy to instantly copy the last code block generated by the AI to your clipboard. Paste directly into VS Code, IntelliJ, or any editor.

/history — Project Continuity

Shows a list of your most recent sessions. Helps you keep track of what you've worked on and maintain context across different parts of your development cycle.

/clear & /exit

  • /clear: Wipes the current terminal log for a distraction-free environment.
  • /exit: Saves your current settings and safely shuts down the agent.

Architecture

Understand how TukiCode thinks and processes your requests.

Asynchronous Architecture

TukiCode operates on a native asynchronous engine (httpx and asyncio). This ensures the terminal remains incredibly responsive and non-blocking, even while performing heavy background LLM generation or executing large shell processes concurrently.

ReAct Logic: How TukiCode "Thinks"

Unlike simple chatbots, TukiCode follows a technical ReAct logic:

1. Thinking

The agent analyzes your request and writes down its internal reasoning.

2. Planning

It creates a step-by-step roadmap of which files to read and which tools to use.

3. Execution

It acts upon your files locally, showing real-time progress in the terminal.

4. Final Response

Once the task is done, it provides a concise summary and the final result.

MVC Architecture

TukiCode operates on a native asynchronous engine (httpx and asyncio). This ensures the terminal remains incredibly responsive and non-blocking, even while performing heavy background LLM generation or executing large shell processes concurrently. More details TukiCode GitHub.