Profile

Backend Systems &
Memory-Centric AI Developer

Building real-world systems that prioritize correctness, stability, and long-term maintainability. Debian VPS infrastructure, local LLMs, and inference-time reasoning.


System Overview

About, architecture, and metrics in one place

I'm a backend-focused developer and community college student specializing in memory-centric AI systems, scalable backend architecture, and high-performance data processing. My work centers on building real-world systems that prioritize correctness, stability, and long-term maintainability over demos or hype.

I manage my own Debian-based VPS infrastructure, host local LLMs and APIs, and design systems that learn through persistent memory and inference-time reasoning, not retraining.

I also have experience working with IBM i systems (CL, RPG, RDi) and long-running production services.

Backend Systems
AI Systems
Infrastructure
Debian VPS

Featured Projects

Production systems and long-running services

Trinix AI

TRXV3 — Local-First AI Assistant

A reasoning-first AI assistant that combines memory, tools, planning, and vision behind a single chat interface. Use it from the web UI or CLI; run it fully local on Ollama, or add cloud models via API keys.

  • Vector memory: remember what you tell it
  • Web search with optional approve/deny
  • Tools: GPU/CPU/disk, files, scripts
  • Image analysis (upload or reference)
  • Generate and run code with fix-and-retry
  • Transparent reasoning: live “thought process” in the UI
  • Uncertainty-aware: confidence scores trigger optional planning
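One way to sketch the uncertainty-aware behavior above: when the model's self-reported confidence falls below a threshold, flag the response for an optional planning pass. The threshold value and function names here are illustrative assumptions, not TRXV3's actual code.

```python
# Illustrative sketch of confidence-gated planning; the 0.6 threshold
# and the response shape are assumptions for demonstration only.
CONFIDENCE_THRESHOLD = 0.6

def answer_with_optional_planning(confidence: float, draft: str) -> dict:
    """Attach a planning flag when the draft answer looks uncertain."""
    needs_plan = confidence < CONFIDENCE_THRESHOLD
    return {
        "answer": draft,
        "confidence": confidence,
        "plan_requested": needs_plan,  # UI can then ask the user to approve/deny
    }
```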

Core stack:

  • API — Flask (/api/chat, JSON or multipart, SSE)
  • Engine — trinix.assistant (routing; memory, vision, research, tool, create_file, or the full chat pipeline)
  • Router — regex intent classification: chat, memory_store, memory_recall, vision, research, create_file, tool
  • Memory & KB — embeddings (e.g. qwen3-embedding:8b) + SQLite
  • Models — Ollama (local + optional cloud); per-task model router
  • Persistence — single SQLite database: messages, memories, embeddings, knowledge base, logs
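A regex intent router of the kind described in the core stack might look like the following minimal sketch. The patterns, their order, and the fallback are assumptions for illustration, not the actual TRXV3 routing rules.

```python
import re

# Hypothetical first-match-wins intent router; pattern wording is assumed.
INTENT_PATTERNS = [
    ("memory_store",  re.compile(r"\b(remember|note) that\b", re.I)),
    ("memory_recall", re.compile(r"\bwhat did i (say|tell you)\b", re.I)),
    ("research",      re.compile(r"\b(search|look up|research)\b", re.I)),
    ("vision",        re.compile(r"\b(image|picture|screenshot)\b", re.I)),
    ("create_file",   re.compile(r"\b(create|write) (a )?file\b", re.I)),
    ("tool",          re.compile(r"\b(gpu|cpu|disk) (usage|stats)\b", re.I)),
]

def route_intent(message: str) -> str:
    """Return the first matching intent, falling back to plain chat."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(message):
            return intent
    return "chat"
```

Regex routing keeps intent dispatch fast and fully local; anything that matches no pattern falls through to the general chat pipeline.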

One line: TRXV3 is a reasoning-first AI assistant that remembers, uses tools, does web research and image analysis, and shows its thought process; it can run fully local or with cloud models and gives you approve/deny over research and planning.

Flask · Ollama · SQLite · Python
Private

Memory-Centric Inference-Time Learning System

A shared, teachable AI system designed to learn new domains without retraining. Learning occurs through persistent memory ingestion, selective retrieval, and inference-time reasoning.

  • Inference-Time Learning (no fine-tuning)
  • Global Persistent Memory with attribution
  • Multi-person conversations, topic switching
  • Low hallucination via filtered retrieval

Features short-term/long-term memory separation, memory scoring and decay, and belief updating without context overwrite. A large local LLM serves as the reasoning engine, running on a self-managed Debian VPS.
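Memory scoring with decay, as described above, can be sketched roughly as follows. The exponential half-life, the relevance weighting, and the function names are illustrative assumptions, not the system's real values.

```python
import time

# Hedged sketch of age-decayed memory scoring; 30-day half-life is assumed.
HALF_LIFE_DAYS = 30.0

def memory_score(relevance, created_at, now=None):
    """Combine retrieval relevance (0..1) with age-based exponential decay."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - created_at) / 86400.0)
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return relevance * decay

def top_memories(memories, query_scores, k=5, now=None):
    """Rank stored memories by decayed score and keep the best k."""
    scored = [
        (memory_score(s, m["created_at"], now), m)
        for m, s in zip(memories, query_scores)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]
```

Decay like this lets a fresh, moderately relevant memory outrank a stale but once-perfect match, which supports belief updating without overwriting context.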

Python · LLM · Debian · VPS
Private

High-Performance Search Backend

Backend search system for professionals requiring instant access to extremely large datasets. Built for investigative and legal workflows where speed and accuracy are critical.

  • Search across 3.9+ billion records
  • Sub-second query performance
  • Advanced PostgreSQL indexing
  • Smart caching layer

PostgreSQL, FastAPI/Flask, Docker, systemd, Debian Linux.
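The "smart caching layer" mentioned above could be sketched as a small TTL-bounded LRU cache over hot query results. The entry limit, TTL, and class name are assumptions for illustration, not the production system's values.

```python
import time
from collections import OrderedDict

# Illustrative TTL + LRU cache for hot query results; sizes are assumed.
class QueryCache:
    def __init__(self, max_entries=1024, ttl_seconds=60.0):
        self._data = OrderedDict()   # key -> (expiry_time, value)
        self.max_entries = max_entries
        self.ttl = ttl_seconds

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() > expires:
            del self._data[key]          # expired: evict lazily on read
            return None
        self._data.move_to_end(key)      # mark as recently used
        return value

    def put(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
```

Caching repeated lookups in front of PostgreSQL keeps the hot path off the database entirely, which matters most when the same records are queried in bursts.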

PostgreSQL · FastAPI · Docker · systemd
Discord

Trinix: Moderation & Automation

A long-running (4+ years) modular moderation and automation system for Discord. It uses locally hosted LLMs for real-time analysis, with server-specific customization and full auditability.

  • LLM-powered moderation (S1–S13 classification)
  • Warn, mute, delete, kick, blacklist logic
  • 30+ event listeners, custom economy (stock market)
  • 89+ commands, guild settings, logging

Built with Python, Pycord, Flask, SQLite, aiohttp, and Ollama; async database access and a modular Cog-based architecture.
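Mapping an LLM severity label (S1–S13) to a moderation action, as listed above, might be sketched like this. The severity bands, action names, and fallback are illustrative assumptions, not the bot's actual policy table.

```python
# Hedged sketch of label-to-action mapping; thresholds are assumed.
SEVERITY_ACTIONS = {
    range(1, 4):   "warn",             # S1-S3: minor
    range(4, 8):   "mute",             # S4-S7: moderate
    range(8, 12):  "delete_and_kick",  # S8-S11: severe
    range(12, 14): "blacklist",        # S12-S13: critical
}

def action_for(label: str) -> str:
    """Translate a classification label like 'S5' into a moderation action."""
    level = int(label.lstrip("Ss"))
    for band, action in SEVERITY_ACTIONS.items():
        if level in band:
            return action
    return "log_only"  # unknown labels are logged for human review
```

Keeping the policy table declarative makes it auditable per guild: moderators can review exactly which severity band triggers which action.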

Python · Pycord · Ollama · SQLite

Skills & Technologies

Backend & Systems

  • PostgreSQL, SQLite
  • Flask, FastAPI
  • Docker, systemd
  • Debian-based VPS hosting
  • Performance tuning & monitoring

AI & Learning Systems

  • Memory-centric AI architectures
  • Inference-time learning
  • LLM orchestration (Ollama, LM Studio)
  • Hallucination reduction strategies
  • Persistent memory design

Languages & Platforms

  • Python
  • JavaScript, HTML, CSS
  • C#
  • IBM i: RPG, CL, RDi

Infrastructure

  • Linux server administration (Debian 12)
  • VPN routing
  • Self-hosted APIs and services
  • Model hosting and orchestration

Infrastructure & Stack

Technologies used in production

Contact

Open to collaboration and interesting backend & AI systems work. Reach out via GitHub or email.

Some projects include proprietary components. Contact for licensing or demo access.