Skills catalog

colibri-skills is Colibri's read-only runtime consumer for Clawdie-AI skill
artifacts. Clawdie-AI authors and reviews the skillpacks; Colibri indexes
them, validates checksums, chunks searchable text, and exposes typed structs to
the daemon, CLI, and TUI. This crate does not author skills.

→ crates/colibri-skills/src/lib.rs

→ docs/COLIBRI-SKILLS-PLAN.md

Decisions

Source of truth stays in Clawdie-AI

Skill artifacts live in the clawdie-ai repository, not in colibri. They are
committed, reviewed directories containing prose, screenshots, transcripts,
scripts, a manifest, and a checksum file. colibri-skills imports these
artifacts into Colibri's SQLite store at runtime.

This split preserves review discipline: a skill changes through a PR in its
home repo, then Colibri re-indexes the checkout.

Read-only, not authoring

The crate deliberately lacks "create skill" or "edit skill" operations. Those
belong in Clawdie-AI, where human review and media pipelines run. Putting
authoring here would duplicate state and split review authority.

The import path is the target for Phase 1: scan the configured Clawdie-AI
checkout, parse manifests, verify checksums, and upsert into SQLite. The type
scaffold exists today; the importer, chunker, and FTS5 index are planned.

→ docs/COLIBRI-SKILLS-PLAN.md (Phases 1-7)
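
The first import step, scanning the checkout, could look roughly like the sketch below. It is not the planned importer: the function name `find_skill_dirs`, the manifest file name `run_manifest.json`, and the choice to skip `.git` are all assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Assumed manifest file name; the real name is defined by Clawdie-AI.
const MANIFEST_FILE: &str = "run_manifest.json";

/// Hypothetical helper: walk the checkout and return every directory that
/// directly contains a run manifest. Nothing in the checkout is modified.
pub fn find_skill_dirs(checkout: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![checkout.to_path_buf()];

    while let Some(dir) = pending.pop() {
        if dir.join(MANIFEST_FILE).is_file() {
            // A skill directory: record it and do not descend further.
            found.push(dir);
            continue;
        }
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let is_git = entry.file_name().to_str() == Some(".git");
            // `file_type` does not follow symlinks, so symlinked
            // directories are not traversed.
            if entry.file_type()?.is_dir() && !is_git {
                pending.push(entry.path());
            }
        }
    }

    // Sort so repeated scans visit skills in a stable order.
    found.sort();
    Ok(found)
}
```

Each returned directory would then go through manifest parsing, checksum verification, and the SQLite upsert.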

Manifest-driven identity

Each skill directory contains a run manifest file. From it the importer derives:

skill_id
display_name
source_path within the Clawdie-AI checkout
pipeline stages and models used
source media metadata

Any file not listed in the manifest can still be classified and indexed as an
artifact, but the manifest is the canonical identity document.

Artifact classification by extension and filename

ArtifactType::from_path classifies files without relying on a sidecar:

Python or shell files → Script
paths containing contact_sheet → ContactSheet
paths containing run_manifest and ending in .json → Manifest
paths containing sha256 or checksum → Checksum
paths containing report and ending in .json → Report
.md → Document
.jpg / .png / .webp → Image
.txt transcript files → Transcript
anything else → Other

This heuristic keeps classification local and fast. Misclassified files can be
fixed by renaming within Clawdie-AI.

→ crates/colibri-skills/src/lib.rs (ArtifactType::from_path)
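
The rules above can be approximated as follows. This is a sketch, not a copy of the crate's code, and the real ArtifactType::from_path may differ in detail. It assumes that the rules apply in the order listed, that "Python or shell" means the `.py` and `.sh` extensions, that matching is case-insensitive, and that a transcript is a `.txt` file whose path contains `transcript`.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactType {
    Script,
    ContactSheet,
    Manifest,
    Checksum,
    Report,
    Document,
    Image,
    Transcript,
    Other,
}

impl ArtifactType {
    /// Classify a file from its path alone; the file is never opened.
    pub fn from_path(path: &Path) -> Self {
        // Lowercase once so every rule below is case-insensitive (assumption).
        let lower = path.to_string_lossy().to_lowercase();
        let ext = path
            .extension()
            .and_then(|e| e.to_str())
            .map(|e| e.to_lowercase())
            .unwrap_or_default();

        // Name-based rules run before the generic extension rules, so
        // `contact_sheet.jpg` is a ContactSheet rather than an Image.
        if ext == "py" || ext == "sh" {
            return Self::Script;
        }
        if lower.contains("contact_sheet") {
            return Self::ContactSheet;
        }
        if lower.contains("run_manifest") && ext == "json" {
            return Self::Manifest;
        }
        if lower.contains("sha256") || lower.contains("checksum") {
            return Self::Checksum;
        }
        if lower.contains("report") && ext == "json" {
            return Self::Report;
        }
        match ext.as_str() {
            "md" => Self::Document,
            "jpg" | "png" | "webp" => Self::Image,
            "txt" if lower.contains("transcript") => Self::Transcript,
            _ => Self::Other,
        }
    }
}
```

Because the substring rules look at the whole path, a directory name such as `reports/` also triggers them in this sketch. That is one way a file can end up misclassified and need renaming.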

Checksums are validated, then stored

The run manifest is accompanied by a checksum file. At import time the runtime
computes the SHA-256 of each artifact and compares it to the committed checksum.
Mismatches are recorded in ImportSummary::checksum_failures, and any recorded
failure makes success() return false.

Only the hash is stored in SQLite; image and media blobs stay on disk. The
catalog stores relative paths and hashes, not the binary content.
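
The comparison step might be sketched as below. Several things are assumptions: the checksum file is taken to use the `sha256sum` layout (`<hex digest>  <relative path>` per line), `ImportSummary` is given a hypothetical `artifacts_verified` counter alongside the `checksum_failures` field named above, and the actual SHA-256 computation is left to the caller (for example via a hashing crate) so the example stays within the standard library.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape: only `checksum_failures` and `success()` are named
/// in the text; `artifacts_verified` is added for illustration.
#[derive(Debug, Default)]
pub struct ImportSummary {
    pub artifacts_verified: usize,
    pub checksum_failures: Vec<String>,
}

impl ImportSummary {
    /// An import succeeds only if no checksum failure was recorded.
    pub fn success(&self) -> bool {
        self.checksum_failures.is_empty()
    }
}

/// Parse `<hex digest>  <relative path>` lines into a path -> digest map.
/// The file layout is an assumption (sha256sum style).
pub fn parse_checksum_file(text: &str) -> BTreeMap<String, String> {
    text.lines()
        .filter_map(|line| {
            let (hash, path) = line.trim().split_once(char::is_whitespace)?;
            // sha256sum marks binary-mode entries with a leading '*'.
            let path = path.trim_start().trim_start_matches('*');
            if hash.is_empty() || path.is_empty() {
                return None;
            }
            Some((path.to_string(), hash.to_lowercase()))
        })
        .collect()
}

/// Compare committed digests with digests computed at import time.
/// A missing artifact counts as a failure, just like a mismatch.
pub fn verify(
    expected: &BTreeMap<String, String>,
    computed: &BTreeMap<String, String>,
) -> ImportSummary {
    let mut summary = ImportSummary::default();
    for (path, want) in expected {
        match computed.get(path) {
            Some(got) if got.eq_ignore_ascii_case(want) => summary.artifacts_verified += 1,
            _ => summary.checksum_failures.push(path.clone()),
        }
    }
    summary
}
```

Only paths and hex digests pass through this code, which matches the decision to keep binary content out of the catalog.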

Content is chunked into searchable units

The planned chunker turns skill content into SkillChunk rows:

Markdown sections by heading
Command blocks
Code blocks
Tables
Transcript segments

Chunks are the unit of search and the unit shown in TUI or CLI results.
SkillChunk carries line_start/line_end so a hit can point back to the
source artifact.

→ crates/colibri-skills/src/lib.rs (SkillChunk, ChunkType)
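
Since the chunker is still planned, the following is only one possible shape for the "Markdown sections by heading" case. The field names follow the entity shape in this document, but the field types, the single `Section` variant, the 1-based line numbers, and the four-characters-per-token estimate are assumptions.

```rust
/// Hypothetical variant; the crate's ChunkType covers more kinds.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChunkType {
    Section,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkillChunk {
    pub chunk_type: ChunkType,
    pub heading: Option<String>,
    pub content: String,
    pub line_start: usize, // 1-based, inclusive (assumption)
    pub line_end: usize,   // 1-based, inclusive (assumption)
    pub tokens_estimate: usize,
}

fn push_chunk(
    chunks: &mut Vec<SkillChunk>,
    heading: Option<String>,
    lines: &[&str],
    line_start: usize,
) {
    let content = lines.join("\n").trim().to_string();
    if content.is_empty() {
        return; // nothing searchable, e.g. blank lines before the first heading
    }
    // Rough heuristic: about four characters per token, rounded up.
    let tokens_estimate = (content.chars().count() + 3) / 4;
    chunks.push(SkillChunk {
        chunk_type: ChunkType::Section,
        heading,
        content,
        line_start,
        line_end: line_start + lines.len() - 1,
        tokens_estimate,
    });
}

/// Split Markdown into one chunk per heading. Lines inside fenced code
/// blocks are never treated as headings.
pub fn chunk_markdown_by_heading(text: &str) -> Vec<SkillChunk> {
    let mut chunks = Vec::new();
    let mut heading: Option<String> = None;
    let mut current: Vec<&str> = Vec::new();
    let mut start = 1;
    let mut in_fence = false;

    for (idx, line) in text.lines().enumerate() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("```") {
            in_fence = !in_fence;
        }
        let is_heading = !in_fence
            && trimmed.starts_with('#')
            && trimmed.trim_start_matches('#').starts_with(' ');
        if is_heading {
            // Close the previous section and start a new one at this line.
            push_chunk(&mut chunks, heading.take(), &current, start);
            current.clear();
            start = idx + 1;
            heading = Some(trimmed.trim_start_matches('#').trim().to_string());
        }
        current.push(line);
    }
    push_chunk(&mut chunks, heading, &current, start);
    chunks
}
```

Each chunk keeps its heading line in `content`, so `line_start` points at the heading itself when a search hit is traced back to the source artifact.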

SQLite + FTS5 as the runtime search backend

The target schema keeps three tables:

system_skills: one row per skill
system_skill_artifacts: one row per file
system_skill_chunks: one row per searchable chunk, plus a virtual FTS5 table for ranked text search

This matches the store's pragmatic relational model. If skill volumes grow
beyond tens of thousands of chunks, search can move to PostgreSQL with
pgvector; until then, SQLite keeps the control plane self-contained.

→ store-schema

→ docs/COLIBRI-SKILLS-PLAN.md (SQLite schema target)
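
For orientation, the three tables and the FTS5 index might be declared along these lines. The authoritative schema is the one in docs/COLIBRI-SKILLS-PLAN.md; here the column names are borrowed from the entity shape in this document, while the column types, constraints, the integer `id` keys, and the `system_skill_chunks_fts` table name are assumptions. The DDL is held in Rust constants so that it could be passed to whichever SQLite binding the store uses.

```rust
/// Hypothetical DDL sketch; not the schema from the plan document.
pub const SCHEMA_SKETCH: &str = r#"
CREATE TABLE IF NOT EXISTS system_skills (
    skill_id      TEXT PRIMARY KEY,
    display_name  TEXT NOT NULL,
    source_path   TEXT NOT NULL,
    status        TEXT NOT NULL DEFAULT 'active'
                  CHECK (status IN ('active', 'archived', 'superseded')),
    verification  TEXT
);

CREATE TABLE IF NOT EXISTS system_skill_artifacts (
    id            INTEGER PRIMARY KEY,
    skill_id      TEXT NOT NULL REFERENCES system_skills(skill_id),
    artifact_type TEXT NOT NULL,
    relative_path TEXT NOT NULL,
    file_name     TEXT NOT NULL,
    mime_type     TEXT,
    size_bytes    INTEGER NOT NULL,
    sha256_hash   TEXT NOT NULL,
    UNIQUE (skill_id, relative_path)
);

CREATE TABLE IF NOT EXISTS system_skill_chunks (
    id              INTEGER PRIMARY KEY,
    artifact_id     INTEGER NOT NULL REFERENCES system_skill_artifacts(id),
    chunk_type      TEXT NOT NULL,
    heading         TEXT,
    content         TEXT NOT NULL,
    line_start      INTEGER NOT NULL,
    line_end        INTEGER NOT NULL,
    tokens_estimate INTEGER NOT NULL
);

-- External-content FTS5 index over the chunk text.
CREATE VIRTUAL TABLE IF NOT EXISTS system_skill_chunks_fts USING fts5(
    heading, content,
    content='system_skill_chunks', content_rowid='id'
);
"#;

/// Ranked search: ?1 is the FTS5 query string, ?2 the result limit.
pub const SEARCH_SKETCH: &str = r#"
SELECT c.id, c.heading, c.line_start, c.line_end
FROM system_skill_chunks_fts
JOIN system_skill_chunks AS c ON c.id = system_skill_chunks_fts.rowid
WHERE system_skill_chunks_fts MATCH ?1
ORDER BY rank
LIMIT ?2;
"#;
```

With an external-content FTS5 table, the index has to be kept in sync with `system_skill_chunks`, either by triggers or by rebuilding it after each re-index. This sketch leaves that choice open.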

Status is a lifecycle marker, not a state machine

SkillStatus is active, archived, or superseded. There is no pending review
state because review happens in Clawdie-AI before import. Colibri simply
stops returning archived skills in default searches but keeps them in the store
for audit and explicit lookups.
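
A marker like this reduces to a plain enum with no transition logic. In the sketch below the method names are hypothetical, and the text only specifies that archived skills are hidden from default searches; leaving superseded skills visible is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillStatus {
    Active,
    Archived,
    Superseded,
}

impl SkillStatus {
    /// Text form, e.g. for a status column in the store.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Active => "active",
            Self::Archived => "archived",
            Self::Superseded => "superseded",
        }
    }

    /// Parse the text form. Unknown values such as "pending" are rejected
    /// because no review state exists on the Colibri side.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "active" => Some(Self::Active),
            "archived" => Some(Self::Archived),
            "superseded" => Some(Self::Superseded),
            _ => None,
        }
    }

    /// Whether default searches return skills with this status.
    /// Hiding only archived skills is an assumption of this sketch.
    pub fn in_default_search(self) -> bool {
        !matches!(self, Self::Archived)
    }
}
```

There is deliberately no `transition` method: a status is set by the import and read by queries.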

Natural-language verification question

Each skill can carry a verification field like "can the user create and run
an Astro project?". This is not an executable test; it is the acceptance
criterion used during skill review and later during agent self-verification.

Runtime commands are read-only

The CLI surface is planned as:

colibri list-skills
colibri show-skill
colibri search-skills
colibri index-skills
colibri verify-skill

index-skills refreshes the catalog from disk. The remaining commands query the
runtime store. None mutate the Clawdie-AI checkout.

→ operator-cli
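
The read-only contract can be made explicit in the command type itself. The following is a sketch only: the CLI is still planned, so the enum name, its methods, and the absence of arguments are assumptions. Only the five subcommand names come from the list above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillCommand {
    ListSkills,
    ShowSkill,
    SearchSkills,
    IndexSkills,
    VerifySkill,
}

impl SkillCommand {
    pub const ALL: [SkillCommand; 5] = [
        SkillCommand::ListSkills,
        SkillCommand::ShowSkill,
        SkillCommand::SearchSkills,
        SkillCommand::IndexSkills,
        SkillCommand::VerifySkill,
    ];

    /// Subcommand name as typed after `colibri`.
    pub fn name(self) -> &'static str {
        match self {
            Self::ListSkills => "list-skills",
            Self::ShowSkill => "show-skill",
            Self::SearchSkills => "search-skills",
            Self::IndexSkills => "index-skills",
            Self::VerifySkill => "verify-skill",
        }
    }

    /// Look up a subcommand; authoring commands do not exist.
    pub fn from_name(name: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|cmd| cmd.name() == name)
    }

    /// Only index-skills refreshes (writes) the runtime store.
    pub fn writes_runtime_store(self) -> bool {
        matches!(self, Self::IndexSkills)
    }

    /// No command ever writes to the Clawdie-AI checkout.
    pub fn mutates_checkout(self) -> bool {
        false
    }
}
```

A dispatcher built on this could open the SQLite store read-only unless `writes_runtime_store()` is true.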

Entity shape

Skill
├─ skill_id, display_name, source_path, status, verification
├─ SkillManifest
│  ├─ run_id, created, notes
│  ├─ ManifestSource
│  ├─ [PipelineStage]
│  └─ [ModelUsage]
└─ [SkillArtifact]
   ├─ artifact_type, relative_path, file_name, mime_type, size_bytes, sha256_hash
   └─ [SkillChunk]
      └─ chunk_type, heading, content, line_start, line_end, tokens_estimate
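
Read as Rust types, the tree might translate roughly as follows. The scaffold in crates/colibri-skills/src/lib.rs is the authority; this sketch guesses every field type, keeps `status`, `artifact_type`, and `chunk_type` as plain strings, leaves ManifestSource, PipelineStage, and ModelUsage empty because their fields are not listed here, and invents the container field names (`manifest`, `source`, `stages`, `models`, `artifacts`, `chunks`) and the `chunk_count` helper.

```rust
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ManifestSource; // fields not listed in this document

#[derive(Debug, Clone, Default, PartialEq)]
pub struct PipelineStage; // fields not listed in this document

#[derive(Debug, Clone, Default, PartialEq)]
pub struct ModelUsage; // fields not listed in this document

#[derive(Debug, Clone, Default, PartialEq)]
pub struct SkillManifest {
    pub run_id: String,
    pub created: String,
    pub notes: Option<String>,
    pub source: ManifestSource,
    pub stages: Vec<PipelineStage>,
    pub models: Vec<ModelUsage>,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct SkillChunk {
    pub chunk_type: String,
    pub heading: Option<String>,
    pub content: String,
    pub line_start: usize,
    pub line_end: usize,
    pub tokens_estimate: usize,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct SkillArtifact {
    pub artifact_type: String,
    pub relative_path: String,
    pub file_name: String,
    pub mime_type: Option<String>,
    pub size_bytes: u64,
    pub sha256_hash: String,
    pub chunks: Vec<SkillChunk>,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct Skill {
    pub skill_id: String,
    pub display_name: String,
    pub source_path: String,
    pub status: String,
    pub verification: Option<String>,
    pub manifest: SkillManifest,
    pub artifacts: Vec<SkillArtifact>,
}

impl Skill {
    /// Total number of searchable chunks across all artifacts.
    pub fn chunk_count(&self) -> usize {
        self.artifacts.iter().map(|a| a.chunks.len()).sum()
    }
}
```

The nesting mirrors ownership: a skill owns its manifest and artifacts, and each artifact owns the chunks extracted from it.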
