# LLM Skill (Integrated, RAG‑First)
## Goal

Run a local LLM directly in chat, with no external servers, and ground its answers via RAG over the current House's dataset snippets.
## Backends
- transformers — set `K3D_LLM_MODEL=/path/to/model`; runs on CPU or GPU depending on availability. Suggested small chat models: TinyLlama‑1.1B‑Chat.
- llama_cpp — loads `.gguf` files via llama-cpp-python. Good for EXAONE‑Deep 2.4B/7.8B Q4_K_M.
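A minimal sketch of how the two backends might be loaded; the helper names and the TinyLlama fallback repo id are illustrative assumptions, not the skill's actual internals.

```python
import os

def load_transformers_backend(model: str):
    """HF model path or repo id; picks the GPU when available, else CPU."""
    import torch
    from transformers import pipeline
    device = 0 if torch.cuda.is_available() else -1  # CPU/GPU by availability
    return pipeline("text-generation", model=model, device=device)

def load_llama_cpp_backend(gguf_path: str, n_gpu_layers: int = 0, n_ctx: int = 2048):
    """Load a GGUF file in-process via llama-cpp-python (no servers)."""
    from llama_cpp import Llama
    return Llama(model_path=gguf_path, n_gpu_layers=n_gpu_layers, n_ctx=n_ctx)

# K3D_LLM_MODEL selects the transformers model (the fallback is an assumption).
default_model = os.environ.get("K3D_LLM_MODEL", "TinyLlama/TinyLlama-1.1B-Chat-v1.0")
```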
## Commands (in chat)

- `/llm backend transformers <model>` — set a local HF model path or repo id.
- `/llm backend llama_cpp <model.gguf> [n_gpu_layers] [n_ctx]` — load a GGUF directly (no servers). Example: `/llm backend llama_cpp /models/EXAONE-Deep-7.8B.Q4_K_M.gguf 30 2048`
- `/llm ask <text>` — direct LLM generation.
- `/llm rag <text> [k]` — RAG: retrieves the top‑k labels via TF‑IDF from the current House and injects them as context for the LLM.
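A sketch of how these commands could be parsed and routed; `dispatch_llm` and the handler callables are hypothetical stand-ins for the skill's real command handling.

```python
from typing import Callable, Dict

def dispatch_llm(line: str, handlers: Dict[str, Callable[..., str]]) -> str:
    """Route a '/llm ...' chat line to a backend/ask/rag handler."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "/llm":
        return "usage: /llm <backend|ask|rag> ..."
    cmd, args = parts[1], parts[2:]
    if cmd == "backend" and len(args) >= 2:
        # Optional trailing ints are [n_gpu_layers] [n_ctx] for llama_cpp.
        extra = [int(a) for a in args[2:]]
        return handlers["backend"](args[0], args[1], *extra)
    if cmd == "ask":
        return handlers["ask"](" ".join(args))  # direct generation, no retrieval
    if cmd == "rag":
        # A trailing integer, if present, is the optional k (default assumed 5).
        k = int(args[-1]) if args[-1].isdigit() else 5
        text = " ".join(args[:-1] if args[-1].isdigit() else args)
        return handlers["rag"](text, k)
    return f"unknown /llm command: {cmd}"
```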
## RAG Source

`dataset_snippets` (label + text). Future: richer multimodal grounding with CLIP/CLAP.
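The retrieval step behind `/llm rag` might look like the following sketch, treating `dataset_snippets` as a list of `(label, text)` pairs as described above; scikit-learn stands in here for whatever TF‑IDF code the skill actually uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_top_k(query: str, k: int, dataset_snippets: list[tuple[str, str]]) -> str:
    """Return the top-k snippets by TF-IDF cosine similarity, as LLM context."""
    labels = [label for label, _ in dataset_snippets]
    texts = [text for _, text in dataset_snippets]
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(texts)   # index the House's snippets
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    top = scores.argsort()[::-1][:k]               # indices of the k best matches
    return "\n".join(f"{labels[i]}: {texts[i]}" for i in top)
```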
## Notes

`pip install llama-cpp-python` (CPU). For CUDA, follow the upstream build instructions. On an RTX 3060 12 GB, 7.8B Q4_K_M works with partial GPU offload (e.g., n_gpu_layers≈30) and ctx≈2048.
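Putting the numbers from this note together, a load-and-generate call with llama-cpp-python might look like this; the path and offload values are taken from the examples above, and the prompt is just a placeholder.

```python
from llama_cpp import Llama

# ~30 layers offloaded to the GPU, the rest on CPU; 2048-token context window.
llm = Llama(
    model_path="/models/EXAONE-Deep-7.8B.Q4_K_M.gguf",
    n_gpu_layers=30,
    n_ctx=2048,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a House is."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```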