Profile

Backend Systems &
Memory-Centric AI Developer

Building real-world systems that prioritize correctness, stability, and long-term maintainability. Debian VPS infrastructure, local LLMs, and inference-time reasoning.


System Overview

About, architecture, and metrics in one place

I'm a backend-focused developer and community college student specializing in memory-centric AI systems, scalable backend architecture, and high-performance data processing. My work centers on building real-world systems that prioritize correctness, stability, and long-term maintainability over demos or hype.

I manage my own Debian-based VPS infrastructure, host local LLMs and APIs, and design systems that learn through persistent memory and inference-time reasoning, not retraining.

I also have experience working with IBM i systems (CL, RPG, RDi) and long-running production services.

Backend Systems
AI Systems
Infrastructure
Debian VPS

Featured Projects

Production systems and long-running services

Trinix AI

TRXV3 — Local-First AI Assistant

A reasoning-first AI assistant that combines memory, tools, planning, and vision behind a single chat interface. Use it from the web UI or CLI; run it fully local on Ollama, or add cloud models via API keys.

  • Vector memory: remember what you tell it
  • Web search with optional approve/deny
  • Tools: GPU/CPU/disk, files, scripts
  • Image analysis (upload or reference)
  • Generate and run code with fix-and-retry
  • Transparent reasoning: live “thought process” in the UI
  • Uncertainty-aware: confidence scores trigger optional planning
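One way to sketch the uncertainty-aware behavior above: when the model's self-reported confidence falls below a threshold, flag the response for an optional planning pass. The threshold value and function names here are illustrative assumptions, not TRXV3's actual code.

```python
# Illustrative sketch of confidence-gated planning; the 0.6 threshold
# and the response shape are assumptions for demonstration only.
CONFIDENCE_THRESHOLD = 0.6

def answer_with_optional_planning(confidence: float, draft: str) -> dict:
    """Attach a planning flag when the draft answer looks uncertain."""
    needs_plan = confidence < CONFIDENCE_THRESHOLD
    return {
        "answer": draft,
        "confidence": confidence,
        "plan_requested": needs_plan,  # UI can then ask the user to approve/deny
    }
```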

Core stack:

  • API — Flask (/api/chat, JSON or multipart, SSE)
  • Engine — trinix.assistant (routing; memory, vision, research, tool, create_file, or the full chat pipeline)
  • Router — regex intent classification: chat, memory_store, memory_recall, vision, research, create_file, tool
  • Memory & KB — embeddings (e.g. qwen3-embedding:8b) + SQLite
  • Models — Ollama (local + optional cloud); per-task model router
  • Persistence — single SQLite database: messages, memories, embeddings, knowledge base, logs
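A regex intent router of the kind described in the core stack might look like the following minimal sketch. The patterns, their order, and the fallback are assumptions for illustration, not the actual TRXV3 routing rules.

```python
import re

# Hypothetical first-match-wins intent router; pattern wording is assumed.
INTENT_PATTERNS = [
    ("memory_store",  re.compile(r"\b(remember|note) that\b", re.I)),
    ("memory_recall", re.compile(r"\bwhat did i (say|tell you)\b", re.I)),
    ("research",      re.compile(r"\b(search|look up|research)\b", re.I)),
    ("vision",        re.compile(r"\b(image|picture|screenshot)\b", re.I)),
    ("create_file",   re.compile(r"\b(create|write) (a )?file\b", re.I)),
    ("tool",          re.compile(r"\b(gpu|cpu|disk) (usage|stats)\b", re.I)),
]

def route_intent(message: str) -> str:
    """Return the first matching intent, falling back to plain chat."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(message):
            return intent
    return "chat"
```

Regex routing keeps intent dispatch fast and fully local; anything that matches no pattern falls through to the general chat pipeline.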

One line: TRXV3 is a reasoning-first AI assistant that remembers, uses tools, does web research and image analysis, and shows its thought process; it can run fully local or with cloud models and gives you approve/deny over research and planning.

Flask · Ollama · SQLite · Python
Private

Memory-Centric Inference-Time Learning System

A shared, teachable AI system designed to learn new domains without retraining. Learning occurs through persistent memory ingestion, selective retrieval, and inference-time reasoning.

  • Inference-Time Learning (no fine-tuning)
  • Global Persistent Memory with attribution
  • Multi-person conversations, topic switching
  • Low hallucination via filtered retrieval

Features short-term/long-term memory separation, memory scoring and decay, and belief updating without context overwrite. A large local LLM serves as the reasoning engine, running on a self-managed Debian VPS.
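Memory scoring with decay, as described above, can be sketched roughly as follows. The exponential half-life, the relevance weighting, and the function names are illustrative assumptions, not the system's real values.

```python
import time

# Hedged sketch of age-decayed memory scoring; 30-day half-life is assumed.
HALF_LIFE_DAYS = 30.0

def memory_score(relevance, created_at, now=None):
    """Combine retrieval relevance (0..1) with age-based exponential decay."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - created_at) / 86400.0)
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return relevance * decay

def top_memories(memories, query_scores, k=5, now=None):
    """Rank stored memories by decayed score and keep the best k."""
    scored = [
        (memory_score(s, m["created_at"], now), m)
        for m, s in zip(memories, query_scores)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]
```

Decay like this lets a fresh, moderately relevant memory outrank a stale but once-perfect match, which supports belief updating without overwriting context.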

Python · LLM · Debian · VPS
Private

High-Performance Search Backend

Backend search system for professionals requiring instant access to extremely large datasets. Built for investigative and legal workflows where speed and accuracy are critical.

  • Search across 3.9+ billion records
  • Sub-second query performance
  • Advanced PostgreSQL indexing
  • Smart caching layer

PostgreSQL, FastAPI/Flask, Docker, systemd, Debian Linux.
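The "smart caching layer" mentioned above could be sketched as a small TTL-bounded LRU cache over hot query results. The entry limit, TTL, and class name are assumptions for illustration, not the production system's values.

```python
import time
from collections import OrderedDict

# Illustrative TTL + LRU cache for hot query results; sizes are assumed.
class QueryCache:
    def __init__(self, max_entries=1024, ttl_seconds=60.0):
        self._data = OrderedDict()   # key -> (expiry_time, value)
        self.max_entries = max_entries
        self.ttl = ttl_seconds

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() > expires:
            del self._data[key]          # expired: evict lazily on read
            return None
        self._data.move_to_end(key)      # mark as recently used
        return value

    def put(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
```

Caching repeated lookups in front of PostgreSQL keeps the hot path off the database entirely, which matters most when the same records are queried in bursts.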

PostgreSQL · FastAPI · Docker · systemd
Discord

Trinix: Moderation & Automation

A long-running (4+ years) modular moderation and automation system for Discord. It uses locally hosted LLMs for real-time analysis, with server-specific customization and full auditability.

  • LLM-powered moderation (S1–S13 classification)
  • Warn, mute, delete, kick, blacklist logic
  • 30+ event listeners, custom economy (stock market)
  • 89+ commands, guild settings, logging

Built with Python, Pycord, Flask, SQLite, aiohttp, and Ollama; async database access and a modular Cog-based architecture.
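Mapping an LLM severity label (S1–S13) to a moderation action, as listed above, might be sketched like this. The severity bands, action names, and fallback are illustrative assumptions, not the bot's actual policy table.

```python
# Hedged sketch of label-to-action mapping; thresholds are assumed.
SEVERITY_ACTIONS = {
    range(1, 4):   "warn",             # S1-S3: minor
    range(4, 8):   "mute",             # S4-S7: moderate
    range(8, 12):  "delete_and_kick",  # S8-S11: severe
    range(12, 14): "blacklist",        # S12-S13: critical
}

def action_for(label: str) -> str:
    """Translate a classification label like 'S5' into a moderation action."""
    level = int(label.lstrip("Ss"))
    for band, action in SEVERITY_ACTIONS.items():
        if level in band:
            return action
    return "log_only"  # unknown labels are logged for human review
```

Keeping the policy table declarative makes it auditable per guild: moderators can review exactly which severity band triggers which action.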

Python · Pycord · Ollama · SQLite

Skills & Technologies

Backend & Systems

  • PostgreSQL, SQLite
  • Flask, FastAPI
  • Docker, systemd
  • Debian-based VPS hosting
  • Performance tuning & monitoring

AI & Learning Systems

  • Memory-centric AI architectures
  • Inference-time learning
  • LLM orchestration (Ollama, LM Studio)
  • Hallucination reduction strategies
  • Persistent memory design

Languages & Platforms

  • Python
  • JavaScript, HTML, CSS
  • C#
  • IBM i: RPG, CL, RDi

Infrastructure

  • Linux server administration (Debian 12)
  • VPN routing
  • Self-hosted APIs and services
  • Model hosting and orchestration

Infrastructure & Stack

Technologies used in production

Contact

Open to collaboration and interesting backend & AI systems work. Reach out via GitHub or email.

Some projects include proprietary components. Contact for licensing or demo access.