Technical Documentation
System Architecture
Deep dive into Allma Studio's technical implementation, from the React frontend to the FastAPI backend and local LLM integration via Ollama.
System Layers
Layered Architecture
A clean separation of concerns with five distinct layers
Frontend Layer
React 18 SPA, Vite Build, TailwindCSS, Axios HTTP
API Gateway
FastAPI, Uvicorn ASGI, CORS Protection, Rate Limiting
Orchestration
Request Router, RAG Service, Conversation Service, Document Service
Data Layer
ChromaDB Vectors, SQLite Sessions, Nomic Embeddings, Document Store
LLM Layer
Ollama Runtime, DeepSeek R1, Gemma 2, Qwen 2.5 Coder
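The orchestration layer's job of routing a request to the right service can be sketched in a few lines. The page names a Request Router and three services but does not show their interfaces, so the `Request` shape, the `RequestRouter` class, the `kind` values, and the stand-in service functions below are all hypothetical and only illustrate the dispatch idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    """Hypothetical request shape; the real schema is not shown in the source."""
    kind: str      # e.g. "rag", "conversation", or "document" (assumed labels)
    payload: str


class RequestRouter:
    """Maps a request kind to the service callable that handles it."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[str], str]] = {}

    def register(self, kind: str, handler: Callable[[str], str]) -> None:
        self._handlers[kind] = handler

    def dispatch(self, request: Request) -> str:
        handler = self._handlers.get(request.kind)
        if handler is None:
            raise ValueError(f"no service registered for kind {request.kind!r}")
        return handler(request.payload)


# Stand-in services (assumptions): each one only labels its input.
def rag_service(query: str) -> str:
    return f"rag:{query}"


def conversation_service(message: str) -> str:
    return f"conversation:{message}"


def document_service(name: str) -> str:
    return f"document:{name}"


def build_router() -> RequestRouter:
    """Wire the three services named in the Orchestration layer into one router."""
    router = RequestRouter()
    router.register("rag", rag_service)
    router.register("conversation", conversation_service)
    router.register("document", document_service)
    return router
```

Keeping the router as a plain lookup table means a new service can be added by registering one more handler, without touching the gateway or the other services.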
Request Lifecycle
Data Flow Pipeline
User Query
Natural language input
API Gateway
Request validation
Orchestrator
Route to services
Vector Search
Find relevant docs
LLM Processing
Generate response
Streaming
Token-by-token output
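The six stages above can be traced end to end with a toy pipeline. This is a minimal sketch under stated assumptions: the corpus, the hand-written 2-D vectors, and every function name are invented for illustration, the vector search is a plain cosine ranking rather than a ChromaDB query, and `fake_llm` stands in for a model served by Ollama.

```python
import math
from typing import Iterator, List, Sequence, Tuple

# Toy corpus with hand-written vectors (assumption: real vectors would
# come from an embedding model and live in a vector database).
DOCS: List[Tuple[str, List[float]]] = [
    ("Ollama runs models locally.", [1.0, 0.0]),
    ("ChromaDB stores vectors.", [0.0, 1.0]),
]


def validate(query: str) -> str:
    """Gateway step: reject empty input, trim surrounding whitespace."""
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    return cleaned


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec: Sequence[float], top_k: int = 1) -> List[str]:
    """Vector search step: return the top_k most similar document texts."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def fake_llm(prompt: str) -> Iterator[str]:
    """Stand-in for the LLM: echoes the prompt one whitespace token at a time."""
    for token in prompt.split():
        yield token


def handle_query(query: str, query_vec: Sequence[float]) -> Iterator[str]:
    """Orchestrator step: validate, retrieve context, then stream tokens."""
    cleaned = validate(query)            # raises before any streaming starts
    context = vector_search(query_vec)
    prompt = " ".join(context + [cleaned])
    return fake_llm(prompt)              # caller iterates to receive tokens
```

Because `handle_query` returns an iterator instead of a finished string, the caller can forward each token to the client as soon as it is produced, which is the behavior the Streaming stage describes.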
Technology Stack
Built With Modern Tools
Frontend
React (18.3.1): UI Framework
Vite (5.2.11): Build Tool
TailwindCSS (3.4.3): Styling
Axios (1.7.2): HTTP Client
React Markdown (9.0.1): Markdown Rendering
Lucide React (0.378.0): Icons
Backend
Python (3.11+): Runtime
FastAPI (0.115.0): Web Framework
Uvicorn (0.31.1): ASGI Server
SQLAlchemy (2.0.36): ORM
ChromaDB (0.5.17): Vector DB
httpx (0.28.1): Async HTTP
AI/ML
Ollama (Latest): Local LLM Runtime
Nomic Embed (Text): Embeddings Model
DeepSeek R1 (5.2GB): Reasoning LLM
Gemma 2 (9B): General LLM
Qwen 2.5 (Coder): Code LLM
LLaMA 3.2 (2GB): Fast LLM
Infrastructure
Docker (Latest): Containerization
Docker Compose (v2): Orchestration
Kubernetes (v1.28+): Container Orchestration
Helm (v3): K8s Package Manager
GitHub Actions (CI/CD): Automation
Vercel (Edge): Frontend Hosting
Security & Privacy
Security First Design
Zero Telemetry
No data collection whatsoever
CORS Protection
Configurable cross-origin security
Rate Limiting
Built-in API throttling
Non-root Containers
Security-hardened Docker images
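The page does not say how the built-in API throttling works, so the following is only one common approach, a per-client sliding-window limiter, and not necessarily what Allma Studio uses. The class name, its parameters, and the injectable `clock` are assumptions made for this sketch.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: caller would reject the request
        hits.append(now)
        return True
```

In an HTTP service, a rejected call would typically be answered with status 429 (Too Many Requests), and `client_id` could be an IP address or an API key.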