PAVEL PERMINOV
Staff / Principal Engineer
Helsinki, Finland | start@perminov.im | GitHub
Summary
Staff / Principal Engineer focused on AI platform architecture, LLM runtimes, and production-grade observability. 17+ years building scalable distributed systems and internal developer platforms across AWS, GCP, and Azure, delivering measurable improvements in reliability, cost, and engineering velocity. Progressed from infrastructure engineering to leading AI platform architecture and cross-system design.
Scope & Ownership
- Owned platform architecture spanning backend services, infrastructure, and developer workflows.
- Influenced cross-team engineering practices through standardization of observability, CI/CD, and security baselines.
- Acted as escalation point for complex production incidents and architecture trade-offs.
- Drove implementation patterns adopted across environments and teams.
Skills
Core
- Architecture: Distributed Systems, Platform Architecture, AI/LLM Systems
- Platform: Kubernetes, Terraform
- Languages: Go, Python, TypeScript
- Observability: OpenTelemetry, Prometheus, Loki, Tempo, Grafana, Datadog
- Cloud: AWS, GCP, Azure, Edge
- CI/CD & Platform Delivery: Azure DevOps, GitLab, ArgoCD
Additional
- FastAPI, Node.js, React
- PostgreSQL, MySQL, MongoDB, Redis
- Docker, Helm, Nginx, HAProxy
- NATS, RabbitMQ, Elasticsearch
- Ansible, Vulnerability Management
Leadership and Business
- Mentored engineers and improved onboarding through repeatable platform workflows and environment automation, reducing environment setup from ~1 day to ~1-2 hours (~70%) in infrastructure-heavy projects.
- Improved delivery predictability by standardizing CI/CD and release practices across backend, UI, and infrastructure, including automated pipelines used across multiple production paths.
- Reduced incident response and troubleshooting time through standardized observability and diagnostics, including response-time reductions from hours to 1 minute and troubleshooting-time reductions up to 90%.
- Contributed to hiring and technical evaluation, strengthening team capability and quality of execution.
- Technical leadership experience with teams up to 16 people, plus entrepreneurial and CTO-level responsibilities.
Experience
Consulting roles leading platform and AI system architecture across multiple engagements.
Senior AI Software Engineering Consultant at Unikie
January 2026 - Present (4 months)
Helsinki
- Owned end-to-end AI platform architecture covering runtime, data pipelines, and observability.
- Defined architecture for a full-stack platform (FastAPI, React, WebSockets, PostgreSQL, Terraform), balancing scalability, developer velocity, and operational cost under production constraints. Reduced environment setup time by ~60%.
- Designed and implemented a tool-enabled LLM runtime with schema-validated tool calls and structured tracing. Reduced debugging effort by ~40% and improved platform observability.
- Implemented RAG pipelines and vector indexing with incremental sync. Improved data freshness and reduced unnecessary reprocessing.
- Established security baselines (session hardening, CORS allowlists, rate limiting, brute-force protection). Strengthened production security posture and reduced abuse risk.
- Standardized observability with OTel, Prometheus, Loki, Tempo, and Grafana. Reduced troubleshooting time by up to 90%.
- Led CI/CD automation across backend, UI, and infrastructure using Azure DevOps and containerized deployments. Improved release consistency and end-to-end delivery speed.
Senior Infrastructure Engineer at Unikie
December 2025 (1 month)
Helsinki
- Built end-to-end Linux development environment provisioning with Ansible. Reduced setup time from ~1 day to ~1-2 hours (~70%).
- Standardized infrastructure automation for OS setup, networking, and developer tooling. Reduced onboarding friction and improved repeatability.
- Defined configuration-as-code practices across systems. Minimized configuration drift and improved environment consistency.
- Developed KVM/libvirt lab automation for VM lifecycle and networking. Improved reprovisioning and recovery speed.
- Implemented secure firmware flashing workflows with validation and dry-run support. Reduced operator error risk during sensitive operations.
- Established SSH and network hardening baselines plus telemetry integration (Telegraf, InfluxDB, TLS). Improved security posture and accelerated diagnostics.
Senior Software Engineering Consultant at Unikie
May 2025 - November 2025 (7 months)
Helsinki, Uusimaa, Finland
- Defined architecture for a gateway aggregating multiple services under unified protocols. Simplified service integration and reduced integration complexity.
- Built unit and integration test suites as a quality baseline for releases. Improved system reliability and release confidence.
- Built a vulnerability detection system and custom endpoint rate limiter. Strengthened proactive security controls and platform stability under load.
- Standardized observability through Datadog across services. Improved visibility into performance and operational behavior.
- Acted as cross-layer escalation engineer across infrastructure, DevOps, and applications. Reduced time to resolution for complex architectural and production issues.
Senior Full-Stack Engineer at Bolt.works
September 2023 - January 2024 (5 months)
- Designed and integrated third-party APIs, reducing manager decision-making time from 10-24 hours to 1-2 minutes.
- Designed and implemented a developer workflow tool that eliminated terminal window switching.
- Increased integration test coverage by 3%, improving GitLab CI/CD reliability.
- Evaluated and optimized approaches to reduce software development and delivery costs.
- Developed and optimized software for AWS infrastructure.
Senior Software/DevOps/Infrastructure Engineer at Millisecond Oy
April 2021 - August 2023 (2 years, 4 months)
- Led platform architecture for mining operations and IoT/fleet systems. Replaced paper-based workflows and delivered real-time operational visibility.
- Defined cross-cloud infrastructure strategy (AWS, GCP, Azure, edge) using Kubernetes, Terraform, and Ansible. Balanced cost, performance, and operational complexity. Achieved ~99% environment parity and reduced production defects.
- Established CI/CD, Git, deployment, and security standards across teams. Standardized delivery approach across teams and achieved 100% process automation with 99.9% uptime.
- Built real-time monitoring and incident notification capabilities. Reduced incident response from hours to 1 minute.
- Developed a truck management platform with predictive maintenance tracking. Reduced operational costs by over 10%.
Senior Infrastructure Developer at Ericsson
June 2020 - April 2021 (11 months)
- Designed and implemented a distributed alert-handling system with dynamic event processing and conflict resolution, ensuring reliable assignment and 100% guaranteed alerting under distributed load.
- Delivered the platform under strict encryption and security requirements for telecom environments.
- Managed integration of open-source software into proprietary systems, streamlining development and enhancing functionality.
Infrastructure Engineer at Curious AI
January 2019 - May 2020 (1 year, 5 months)
- Developed a version control system for AI training on Kubernetes, enabling restarts from any point, improving stability by 50%, and ensuring reproducibility.
- Developed Kubernetes tools and plugins, including a CSI plugin, Ingress controller, and configuration manager, enabling full infrastructure control.
- Migrated legacy platforms to Kubernetes, standardizing deployments and reducing complexity.
- Built AWS infrastructure (EKS, CloudFormation, and related services), ensuring 99.9% service availability and scalability.
- Implemented Jenkins, significantly simplifying application development and delivery processes.
- Supported developers in day-to-day infrastructure and DevOps-related tasks.
Developer Experience Engineer at Tochka Bank
June 2018 - December 2018 (7 months)
- Developed a highly available cloud-native load balancer for InfluxDB with resynchronization and error correction.
- Built command-line tools for authorization and database subscription management.
- Identified and fixed Linux kernel bugs.
- Supported developers in day-to-day infrastructure and DevOps-related tasks.
Senior Systems/DevOps/Software Engineer at TPlus group
August 2008 - June 2018 (10 years)
- Built a low-cost backup platform in Bash, Python, and PHP, covering 100% of branches with rapid hourly recovery.
- Designed and operated data center, storage, and virtualization foundations that provided full IT service coverage and informed later platform engineering practices.
Key Systems & Platforms
- AI Tool Runtime Platform: Enabled structured tool execution, tracing, and reliable RAG workflows.
- Observability Platform: Standardized metrics, logs, and tracing across services to enable fast diagnostics and incident response.
- Platform Automation: CI/CD and infrastructure provisioning enabling consistent, reproducible multi-environment deployments.
- IoT and Fleet Systems: Delivered real-time visibility and predictive operations in distributed production environments.
Languages
- English – Fluent
- Russian – Native
- Swedish – B1
- Finnish – A1