TukiCode is an open-source autonomous AI coding agent that lives in your terminal (CLI). It is a software development tool, not an educational platform. Developers use TukiCode to write code, execute shell commands, refactor projects, and complete programming tasks autonomously using AI models like Claude, Gemini, GPT, or local Ollama models.

Is TukiCode an educational platform?

No. TukiCode (tukicode.site) is not an educational platform. It is an autonomous AI software agent designed for professional developers and programmers. It runs locally via Ollama or in the cloud via OpenRouter, Anthropic, or Google Gemini.

How do I install TukiCode?

On macOS or Linux, run: curl -fsSL https://tukicode.site/api/install.sh | bash. On Windows, run in PowerShell: irm tukicode.site/api/install.ps1 | iex. After installation, launch the agent with: tuki chat

What AI models does TukiCode support?

TukiCode supports Anthropic Claude, Google Gemini, OpenAI GPT models via OpenRouter, and fully local AI models via Ollama. It features a ReAct reasoning loop, parallel tool execution, and prompt caching for maximum efficiency.

Documentation

TukiCode Docs

The local-first AI terminal assistant for Windows developers. Learn how to install, configure, and master TukiCode.

Getting Started

TukiCode runs on Windows, macOS, and Linux. Install in seconds with a single command.

Installation

Windows — PowerShell 5.1+

PowerShell

PS > iwr https://tukicode.site/api/install.ps1 | iex

macOS & Linux — bash / zsh

Terminal

$ curl -fsSL https://tukicode.site/api/install.sh | bash

Installs tuki to ~/.local/bin and updates your shell profile. On macOS, if the binary is blocked on first run: xattr -d com.apple.quarantine ~/.local/bin/tuki

Configuration

After installation, you need to configure which AI model TukiCode will use. Run the following command to start the setup wizard:

tuki — setup

$ tuki config --setup

This will launch an interactive configuration wizard where you can:

Select your AI provider: Choose between Ollama, OpenRouter, Anthropic, or Google (Gemini).
Enter the model name: Type the specific model identifier (e.g., tencent/hy3-preview:free).
Provide your API key: If you selected a cloud provider, paste your API key securely. TukiCode encrypts and stores this key locally.
Configure Advanced Settings: (Optional) Adjust the context window size and max tokens limit depending on your chosen model's capabilities.

For OpenRouter, you can use a single API key for all supported models, including free ones.

TukiCode is designed to work with models that support native tool-calling. This is required for proper agent behavior, including file operations, command execution, and multi-step reasoning.

Recommended free models (with tool-calling support):

tencent/hy3-preview:free (recommended)
moonshotai/kimi-k2.5
zhipu/glm-4.6
deepseek/deepseek-chat-v3.2

We strongly recommend using: tencent/hy3-preview:free for the best balance between performance, reliability, and tool execution in free-tier environments.

Quick Start

After installation and configuration, launch TukiCode from any terminal:

bash

$ tuki chat

Working Modes

TukiCode adapts to your workflow through three distinct operating modes. Each one routes your input through a different internal pipeline, giving you full control over autonomy and speed.

Mode Comparison

Mode	Pipeline	Confirmation	Best For
Chat	AgentLoop directly	Per tool (based on autonomy)	Questions, quick fixes, explanations
Plan	Planner → show steps → confirm → Executor	Always asks before executing	Multi-step tasks you want to review first
Build	Planner → show steps → Executor	No confirmation — runs immediately	Full project scaffolding, autonomous builds

Chat Mode (Default)

The standard conversational interface. Your message goes directly into the ReAct loop: the agent thinks, calls tools, observes results, and repeats until it has a complete answer. No structured plan is created — the model decides each action on the fly based on what it discovers.

Best for: asking questions, getting code explanations, or doing quick isolated tasks.

Entry point: AgentLoop.run_turn()
Planning: None (model reasons internally via <thinking> tags)
Confirmation: Controlled by your /autonomy level

Plan Mode

When you have a complex task, Plan Mode uses a dedicated Planner module to break it into a precise, numbered list of atomic steps before any code is written. You review and approve the plan, and only then does the Executor run each step.

Example: "Refactor the authentication module to use JWT."

Your message is sent to Planner.generate_plan(), which calls the LLM and receives a structured JSON array of steps.
The plan is displayed in full in the chat.
The agent pauses and asks: "Do you want to execute this plan? (y/n)"
If confirmed, Executor.execute_plan() runs each step sequentially via the AgentLoop, with automatic retries and model fallback on failure.

(Demonstration of Plan Mode creating and executing steps)

Build Mode

Build Mode is designed for autonomous, end-to-end project generation. It follows the same Planner pipeline as Plan Mode, but skips the confirmation step — the agent shows you the plan and begins executing immediately.

It also supports plan resumption: if a previous build was interrupted, Build Mode picks up from the last pending step automatically.

If there are pending steps in planner_state.json (from a previous session), execution resumes from where it left off.
If there is no existing plan, Planner.generate_plan() is called automatically. The generated plan is shown in the chat before execution starts.
Executor.execute_plan() runs immediately — no confirmation is asked.

Build vs Plan: Use Plan when you want to review and approve the strategy before anything runs. Use Build when you trust the agent and want to go as fast as possible — you still see the plan, but execution starts right away.

Core Features

Understand the key capabilities that make TukiCode unique.

Three-Tier Autonomy System

TukiCode gives you total control over the agent's decision-making process. Adjust the autonomy level with /autonomy:

Low — Maximum Safety

The agent is "on a leash." It pauses and asks for your explicit confirmation for every single action—reading files, searching, or applying fixes.

Medium — Balanced

The agent can explore your files and list directories autonomously to gather context. It stops and waits for approval before writing code, deleting files, or executing shell commands.

High — Unstoppable Speed

Say "Yes" once per turn. After your first approval, TukiCode executes all necessary steps in its reasoning loop autonomously until it reaches the final solution.

Interactive Project Explorer

TukiCode includes a live directory tree in the left panel of the terminal interface:

Visual Context: Always see exactly where you are in your project structure.
Smart Selection: Clicking any file in the tree instantly tells the agent to read and analyze it—the fastest way to feed context without typing paths.

Non-Touch Git Policy

Your project's history belongs to you. TukiCode is hard-coded to ignore any .git related commands. It will never initialize, commit, or push changes. This ensures your version control remains clean and strictly manual.

AI Models & Hardware Requirements

A detailed, honest guide to choosing the right AI models and understanding the hardware required to run TukiCode.

Recommended Models

Ollama (Local) — No limits, no costs, offline

Ollama is the ideal choice for TukiCode because it has no rate limits, works entirely offline, guarantees total privacy, and has zero cost per token.

Model	Required RAM	Required VRAM	Estimated Speed
`qwen2.5-coder:7b`	8GB RAM	6GB VRAM	~50-70 tok/s on RTX 3060
`qwen2.5-coder:14b`	16GB RAM	10GB VRAM	~40-60 tok/s on RTX 3060
`qwen2.5-coder:32b`	32GB RAM	20GB VRAM	~20-30 tok/s on RTX 3060

Installation commands:

ollama pull qwen2.5-coder:7b
ollama pull qwen2.5-coder:14b
ollama pull qwen2.5-coder:32b

OpenRouter (Cloud) — Requires internet & API key

OpenRouter allows you to use cloud models, including free and paid options. The free models have important limitations you should understand before using them.

Recommended free models for TukiCode:

tencent/hy3-preview:free — Best for agents, supports native tool calling
qwen/qwen3-coder-480b:free — Most powerful available for free, 262K context
deepseek/deepseek-r1:free — Excellent reasoning for complex tasks
meta-llama/llama-3.3-70b-instruct:free — Stable and reliable for simple tasks

Important limitations of OpenRouter free models:

Rate limits: 20 requests per minute and 200 requests per day max.
Frequent timeouts: During peak hours (US time), models can take over 60 seconds to respond, causing timeout errors.
Variable availability: Free models can be saturated or go offline without notice.
Inconsistent quality: During high demand, the quality of responses may degrade.
Not recommended for: Long tasks like full project migrations, generating multiple large files, or intensive work sessions.

TukiCode features an automatic fallback system between free models, but this does not completely eliminate interruptions.

Recommended paid models via OpenRouter (guaranteed high quality):

anthropic/claude-sonnet-4-5 — $3/1M input tokens, $15/1M output tokens
anthropic/claude-haiku-3-5 — $0.80/1M input tokens, $4/1M output tokens (best value)
openai/gpt-4o — $2.50/1M input tokens, $10/1M output tokens

For reference: an intensive 30-minute session with TukiCode consumes approximately 10,000-40,000 tokens depending on the task. With claude-haiku-3-5, that equals less than $0.20 USD.

Hardware Requirements

Using TukiCode with Ollama (local)

Recommended Minimum:

CPU: 8 modern cores (Intel i7 10th gen+ / AMD Ryzen 5 5000+)
RAM: 16GB
GPU: NVIDIA RTX 3060 (12GB VRAM) or equivalent
Storage: 20GB free space for models

Optimal:

CPU: Intel i9 / AMD Ryzen 9 or Apple Silicon
RAM: 32GB+
GPU: NVIDIA RTX 3080+ / RTX 4070+ or Apple M1/M2/M3
Storage: NVMe SSD with 50GB free space

Apple Silicon (M1/M2/M3/M4):

Special mention — Apple Silicon's unified memory is ideal for Ollama. An M1 with 16GB can run qwen2.5-coder:14b at ~30-40 tokens/second. An M3 with 16GB is even better. It is probably the best value option to use TukiCode with Ollama.

Verified Compatible GPUs:

NVIDIA RTX 3060 (12GB) [YES] — fluid with qwen2.5-coder:14b
NVIDIA RTX 2060 (8GB) [YES] — fluid with qwen2.5-coder:7b
NVIDIA RTX 4070 (12GB) [YES] — fluid with qwen2.5-coder:32b
AMD RX 6700 XT (12GB) [YES] — compatible with ROCm
Intel Arc A770 (16GB) [WARNING] — experimental support
NVIDIA GTX 1080 (8GB) [WARNING] — works but slow
GPUs with less than 6GB VRAM [NO] — not recommended for Ollama

My PC doesn't have a dedicated GPU?

If your machine does not have a dedicated GPU or has less than 6GB VRAM, Ollama will run on CPU and will be very slow (2-5 tokens/second). In that case, OpenRouter with a paid model like claude-haiku-3-5 is the most practical option — fast, reliable, and almost zero cost.

Using TukiCode with OpenRouter (cloud)

Any computer with internet connection
No GPU required
Minimum RAM: 4GB (to run TukiCode itself)
Python 3.10+

Which model should I use?

Situation	Recommendation
I have a GPU with 6GB+ VRAM	Ollama with `qwen2.5-coder:7b` or `14b`
I have a Mac with Apple Silicon	Ollama with `qwen2.5-coder:14b`
I don't have a GPU but want free options	OpenRouter free models (with limitations)
I want the best possible experience	OpenRouter `claude-haiku-3-5` (< $0.20/session)
I work in an enterprise / need strict privacy	Local Ollama is mandatory
I'm a student with a basic PC	OpenRouter free + patience during peak hours

Commands in chat

TukiCode uses a slash-command system to manage your AI experience.

`/model` — Intelligence Switching

The most powerful command in TukiCode. Switch between AI models:

Without arguments: Opens a semi-transparent modal to select from Ollama (local), Gemini, Claude, or OpenRouter models.
Smart Detection: Type /model gemini-1.5-pro and the agent automatically configures itself.
Auto-Config: If a cloud model requires an API key you haven't provided, TukiCode pops up a secure input box to save it permanently.
History: TukiCode remembers every model you use, keeping your favorites at the top of the list.

tuki — /model

$ /model gemini-1.5-pro [api-key]

`/autonomy` — Control the Agent

Toggle how much TukiCode asks for permission. Use /autonomy high for quick iterations or /autonomy low when you want to review every single line the agent reads or writes.

`/risk` — Sensitivity Management

Adjust how the agent classifies the "danger" of its tools:

Low Risk: Standard operations.
Medium/High Risk: System-altering commands. TukiCode uses this setting to determine when to trigger security warnings based on your current autonomy level.

`/copy` — Export Your Code

Found the perfect solution? Type /copy to instantly copy the last code block generated by the AI to your clipboard. Paste directly into VS Code, IntelliJ, or any editor.

`/history` — Project Continuity

Shows a list of your most recent sessions. Helps you keep track of what you've worked on and maintain context across different parts of your development cycle.

`/clear` & `/exit`

/clear: Wipes the current terminal log for a distraction-free environment.
/exit: Saves your current settings and safely shuts down the agent.

Architecture

Understand how TukiCode thinks and processes your requests.

Asynchronous Architecture

TukiCode operates on a native asynchronous engine (httpx and asyncio). This ensures the terminal remains incredibly responsive and non-blocking, even while performing heavy background LLM generation or executing large shell processes concurrently.

ReAct Logic: How TukiCode "Thinks"

Unlike simple chatbots, TukiCode follows a technical ReAct logic:

1. Thinking

The agent analyzes your request and writes down its internal reasoning.

2. Planning

It creates a step-by-step roadmap of which files to read and which tools to use.

TukiCode Docs

Getting Started

Installation

Configuration

Quick Start

Working Modes

Mode Comparison

Chat Mode (Default)

Plan Mode

Build Mode

Core Features

Three-Tier Autonomy System

Low — Maximum Safety

Medium — Balanced

High — Unstoppable Speed

Interactive Project Explorer

Non-Touch Git Policy

AI Models & Hardware Requirements

Recommended Models

Ollama (Local) — No limits, no costs, offline

OpenRouter (Cloud) — Requires internet & API key

Hardware Requirements

Using TukiCode with Ollama (local)

Using TukiCode with OpenRouter (cloud)

Which model should I use?

Commands in chat

`/model` — Intelligence Switching

`/autonomy` — Control the Agent

`/risk` — Sensitivity Management

`/copy` — Export Your Code

`/history` — Project Continuity

`/clear` & `/exit`

Architecture

Asynchronous Architecture

ReAct Logic: How TukiCode "Thinks"

1. Thinking

2. Planning

3. Execution

4. Final Response

MVC Architecture

Getting Started

Installation

Configuration

Quick Start

Working Modes

Mode Comparison

Chat Mode (Default)

Plan Mode

Build Mode

Core Features

Three-Tier Autonomy System

Low — Maximum Safety

Medium — Balanced

High — Unstoppable Speed

Interactive Project Explorer

Non-Touch Git Policy

AI Models & Hardware Requirements

Recommended Models

Ollama (Local) — No limits, no costs, offline

OpenRouter (Cloud) — Requires internet & API key

Hardware Requirements

Using TukiCode with Ollama (local)

Using TukiCode with OpenRouter (cloud)

Which model should I use?

Commands in chat

/model — Intelligence Switching

/autonomy — Control the Agent

/risk — Sensitivity Management

/copy — Export Your Code

/history — Project Continuity

/clear & /exit

Architecture

Asynchronous Architecture

ReAct Logic: How TukiCode "Thinks"

1. Thinking

2. Planning

3. Execution

4. Final Response

MVC Architecture

`/model` — Intelligence Switching

`/autonomy` — Control the Agent

`/risk` — Sensitivity Management

`/copy` — Export Your Code

`/history` — Project Continuity

`/clear` & `/exit`