Introduction

Infrastructure teams waste hours answering repetitive questions like “How do I restart a Kubernetes pod?” or “Why is my database connection failing?” They also spend time manually creating JIRA tickets, copying details, and assigning them to the right team.

This post shows how to build an AI-powered Slack bot that solves both problems: it answers questions instantly by searching a knowledge base with RAG (Retrieval-Augmented Generation), and when no answer is found, it automatically creates a JIRA ticket with smart team routing. Everything runs locally with Ollama, so there are no cloud LLM costs.

How It Works

The bot follows a simple decision flow: search knowledge base first, create ticket if needed.

Workflow Diagram

Components:

  • Slack Bot - Captures questions via /infra-inquiry command
  • Ollama - Local LLM (llama3.1:8b) for AI reasoning
  • ChromaDB - Vector database storing knowledge base
  • PostgreSQL - Tracks all inquiries and metrics
  • JIRA - Auto-creates tickets with team assignment

Using the Slack Command

When a user needs help, they type /infra-inquiry followed by their question. A modal appears asking for environment (PROD/STG/DEV) and deadline.

Output:

Slack Command Modal
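
Under the hood, the slash command can be handled with Slack's Bolt for Python framework. The sketch below is illustrative rather than the repo's exact code; the callback_id, block ids, and handler name are assumptions:

import os
from slack_bolt import App

app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.command("/infra-inquiry")
def open_inquiry_modal(ack, body, client):
    ack()  # Slack requires an acknowledgement within 3 seconds
    client.views_open(
        trigger_id=body["trigger_id"],
        view={
            "type": "modal",
            "callback_id": "infra_inquiry_modal",  # hypothetical id
            "title": {"type": "plain_text", "text": "Infra Inquiry"},
            "submit": {"type": "plain_text", "text": "Submit"},
            "blocks": [
                {
                    "type": "input",
                    "block_id": "environment",
                    "label": {"type": "plain_text", "text": "Environment"},
                    "element": {
                        "type": "static_select",
                        "action_id": "env_select",
                        "options": [
                            {"text": {"type": "plain_text", "text": env},
                             "value": env}
                            for env in ("PROD", "STG", "DEV")
                        ],
                    },
                },
                {
                    "type": "input",
                    "block_id": "deadline",
                    "label": {"type": "plain_text", "text": "Deadline"},
                    "element": {"type": "plain_text_input",
                                "action_id": "deadline_input"},
                },
            ],
        },
    )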

The modal captures context that helps the bot classify urgency and route tickets appropriately.

Answering with RAG

RAG (Retrieval-Augmented Generation) pairs the LLM with a searchable knowledge base, so answers are grounded in retrieved documents rather than the model's guesses, which keeps hallucinations to a minimum.

The bot converts questions into vector embeddings (numerical representations) and searches ChromaDB for semantically similar Q&As. If confidence is high (>60%), it returns the answer immediately.

How it searches:

def search_knowledge(question):
    # Convert the question to a vector embedding
    embedding = create_embedding(question)

    # Query the ChromaDB collection (opened at startup) for the 3 closest Q&As
    results = collection.query(query_embeddings=[embedding], n_results=3)

    # Lower distance means a closer match; below 0.4 counts as high confidence
    if results["distances"][0][0] < 0.4:
        return synthesize_answer(question, results["documents"][0])

    return None  # No confident match, create a ticket instead
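
The two helpers, create_embedding and synthesize_answer, are where Ollama comes in. A minimal sketch assuming the official ollama Python package and the two models used in this project (the prompt wording is illustrative):

import ollama

def create_embedding(text):
    # nomic-embed-text returns a 768-dimensional vector
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def synthesize_answer(question, documents):
    # Have the local LLM answer strictly from the retrieved Q&As
    context = "\n\n".join(documents)
    response = ollama.chat(
        model="llama3.1:8b",
        messages=[{
            "role": "user",
            "content": f"Using only this context:\n\n{context}\n\nAnswer: {question}",
        }],
    )
    return response["message"]["content"]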

What happens: The bot searches its knowledge base first. If it finds a relevant answer with high confidence, it responds instantly. Otherwise, it creates a ticket.

Output:

Real Inquiry Response

This reduces ticket volume significantly—common questions get answered in seconds without human intervention.

Smart Team Routing

When the bot can’t answer from the knowledge base, it creates a JIRA ticket and routes it to the appropriate team. It uses a two-tier approach: fast keyword matching, then AI classification.

Routing logic:

def route_to_team(question):
    q = question.lower()

    # Tier 1: fast keyword matching for the obvious cases
    if 'kubernetes' in q or 'pod' in q:
        return 'platform'
    if 'database' in q or 'postgres' in q:
        return 'database'

    # Tier 2: fall back to LLM classification (see the sketch below)
    return llm.classify(
        question,
        teams=['platform', 'devops', 'database', 'security', 'network']
    )

What happens: The router checks for obvious keywords first (fast). If no match, it asks the LLM to classify the question based on team responsibilities.

Output:

JIRA Ticket Created

The ticket includes an auto-generated summary, environment context, urgency classification, and team labels. The assigned team gets notified immediately.

JIRA Ticket Details:

JIRA Ticket Details

The ticket appears in JIRA with all fields properly populated: question, environment, deadline, urgency, category, and assigned team.
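
For reference, ticket creation along these lines can be done with the jira Python package; the project key, issue type, and priority mapping below are placeholders for whatever your JIRA instance uses:

import os
from jira import JIRA

jira = JIRA(
    server=os.environ["JIRA_URL"],
    basic_auth=(os.environ["JIRA_EMAIL"], os.environ["JIRA_API_TOKEN"]),
)

def create_ticket(summary, description, urgency, team):
    # Map urgency to a JIRA priority; names depend on your JIRA setup
    priority = {"critical": "Highest", "high": "High",
                "medium": "Medium", "low": "Low"}[urgency]
    issue = jira.create_issue(fields={
        "project": {"key": "INFRA"},       # placeholder project key
        "summary": summary,
        "description": description,
        "issuetype": {"name": "Task"},
        "priority": {"name": priority},
        "labels": [f"team-{team}", "auto-created"],
    })
    return issue.key  # e.g. "INFRA-123"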

Architecture

The system uses three AI agents built with LangChain, each handling a specific task; a sketch of how they fit together follows the list:

1. Supervisor Agent

  • Orchestrates the entire workflow
  • Classifies urgency (low/medium/high/critical)
  • Decides: answer from KB or create ticket

2. Knowledge Agent

  • Performs RAG search in vector database
  • Filters results by a strict relevance threshold (distance < 0.4 for a high-confidence match)
  • Synthesizes answer from retrieved documents

3. Router Agent

  • Assigns tickets to correct team
  • Uses keyword matching + LLM fallback
  • Routes to: platform, devops, database, security, or network teams
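
Putting the three agents together, the top-level flow might look like the sketch below. It reuses the functions sketched earlier; classify_urgency is a hypothetical helper standing in for the Supervisor's urgency step:

def handle_inquiry(question, environment, deadline):
    # Supervisor: classify urgency, then try the KB before ticketing
    urgency = classify_urgency(question, environment, deadline)

    # Knowledge Agent: RAG search returns None below the confidence threshold
    answer = search_knowledge(question)
    if answer:
        return {"resolved_by": "kb", "urgency": urgency, "answer": answer}

    # Router Agent: keywords first, LLM fallback second
    team = route_to_team(question)
    ticket_key = create_ticket(
        summary=question[:100],
        description=f"Environment: {environment}\nDeadline: {deadline}",
        urgency=urgency,
        team=team,
    )
    return {"resolved_by": "jira", "urgency": urgency, "ticket": ticket_key}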

Technology stack:

  • Ollama - Local LLM deployment (llama3.1:8b for reasoning, nomic-embed-text for embeddings)
  • ChromaDB - Vector database for semantic search
  • PostgreSQL - Inquiry tracking and metrics
  • Redis - Caches search results to reduce latency (see the sketch below)
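
A sketch of that caching layer with redis-py, keying entries on a hash of the question; the key format and one-hour TTL are illustrative:

import hashlib
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def cached_search(question, ttl=3600):
    # Key on a hash of the normalized question text
    key = "kb:" + hashlib.sha256(question.lower().encode()).hexdigest()

    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    answer = search_knowledge(question)
    if answer is not None:
        r.setex(key, ttl, json.dumps(answer))  # expire after an hour
    return answer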

Metrics Dashboard

All inquiries are tracked in PostgreSQL with full metadata: question, environment, urgency, team assignment, KB resolution status, and JIRA ticket ID.

Output:

PostgreSQL Metrics Dashboard

Key metrics:

  • KB hit rate - % of questions answered without human intervention
  • Team distribution - Workload balance across teams
  • Category breakdown - Most common inquiry types (kubernetes, database, network, etc.)

Access them via the /infra-metrics command in Slack, or run python metrics.py for detailed reports.
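
As an example of what such a report computes, the KB hit rate is a single SQL aggregate. A sketch with psycopg2, where the inquiries table, resolved_by_kb column, and DATABASE_URL variable are assumed names:

import os
import psycopg2

conn = psycopg2.connect(os.environ["DATABASE_URL"])
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT count(*) FILTER (WHERE resolved_by_kb) * 100.0
               / NULLIF(count(*), 0) AS kb_hit_rate,
               count(*) AS total
        FROM inquiries
    """)
    kb_hit_rate, total = cur.fetchone()

print(f"KB hit rate: {kb_hit_rate or 0:.1f}% across {total} inquiries")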

Quick Setup

1. Install Ollama and models

ollama pull llama3.1:8b
ollama pull nomic-embed-text

2. Start Docker services

docker-compose up -d

This starts PostgreSQL, Redis, and ChromaDB containers.

3. Configure environment

Create .env file with Slack tokens, JIRA credentials, and database settings.
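
Something like the following, where the two Slack token names follow the usual Bolt convention and the rest are illustrative:

# Slack (Socket Mode needs a bot token and an app-level token)
SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...

# JIRA
JIRA_URL=https://yourcompany.atlassian.net
JIRA_EMAIL=bot@yourcompany.com
JIRA_API_TOKEN=...

# Storage
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/infra_bot
REDIS_URL=redis://localhost:6379/0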

4. Run the bot

python src/main.py

The bot connects to Slack over Socket Mode and starts listening for /infra-inquiry commands.
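
The entry point boils down to starting Bolt's Socket Mode handler, roughly like this (reusing the app object and token name from the earlier sketches):

import os
from slack_bolt.adapter.socket_mode import SocketModeHandler

if __name__ == "__main__":
    # Socket Mode opens an outbound websocket, so no public HTTP endpoint is needed
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()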

Expected output:

Bot Startup Logs

You’ll see all components initialize: Ollama connection, Redis cache, PostgreSQL database, ChromaDB vector store, AI agents (Supervisor, Knowledge, Router), and Slack bot in Socket Mode.

Conclusion

This bot reduces manual ticket creation and provides instant answers from a searchable knowledge base. All inquiries are tracked in PostgreSQL for continuous improvement of the knowledge base and team workload analysis.

