Yochana

Metrics Analyst

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Metrics Analyst on a remote contract, requiring expert-level VictoriaMetrics experience, substantial Red Hat OpenShift knowledge, and proficiency in GitOps. Key skills include time series lifecycle management and telecommunications observability platform design.
🌎 - Country
United States
💱 - Currency
$ USD
-
💰 - Day rate
Unknown
-
πŸ—“οΈ - Date
June 24, 2026
🕒 - Duration
Unknown
-
🏝️ - Location
Remote
-
📄 - Contract
Unknown
-
🔒 - Security
Unknown
-
πŸ“ - Location detailed
United States
-
🧠 - Skills detailed
#ETL (Extract, Transform, Load) #Replication #GitLab #Time Series #Deployment #Storage #Monitoring #Security #Base #Cloud #Vault #GIT #Prometheus #Terraform #Strategy #Observability #Kubernetes
Role description
Job Title: Victoria Metrics Architect
Location: Remote
Type: Contract

Required Qualifications:

VictoriaMetrics: Expert Level
Candidates should have hands-on VictoriaMetrics production experience at scale for this role.
• Significant production experience operating VictoriaMetrics at scale: VMCluster deployments handling sustained, high-cardinality workloads in live environments. This is the non-negotiable baseline for the role.
• VMCluster internals at depth: the write path from VMInsert through VMStorage replication, the query fan-out and merge behavior of VMSelect, and the performance implications of topology decisions on ingestion throughput and query latency.
• Active time series lifecycle management: how time series are created, sustained, and expired; the relationship between cardinality and memory pressure; and the ability to diagnose and remediate a cardinality explosion in a production environment.
• MetricsQL fluency: advanced aggregation, rollup window semantics, subquery patterns, and query design that reduces load on VMStorage at scale.
• VMAgent at depth: scrape configuration, stream aggregation for edge-side cardinality reduction, rate limiting, deduplication, and write buffering continuity during upstream unavailability.
• VMAuth multi-tenancy: per-tenant routing via VMUser custom resources, token-based authentication, and read/write path segregation.
• VMAlert and VMAnomaly: alerting and recording rule design, anomaly model selection, and integration with enterprise alert dispatch systems.
• Federation design: global query layer architecture, cross-cluster deduplication, and remote_write performance tuning under high-cardinality ingestion at sustained scale.
• Storage architecture: retention modelling, downsampling, backup and restore, and capacity planning for time-series workloads.
• VictoriaMetrics Operator: lifecycle management of all VM custom resource definitions and upgrade strategy on OpenShift.

Red Hat OpenShift: Production Depth
• Substantial Kubernetes experience with a material portion on Red Hat OpenShift in bare-metal or on-premises enterprise environments, not exclusively managed cloud Kubernetes.
• OpenShift security model: Security Context Constraints, Network Policy, namespace RBAC, and the constraints that apply to stateful, high-throughput workloads.
• StatefulSet lifecycle, PersistentVolumeClaim management, and StorageClass selection for write-intensive time-series workloads.
• OCP upgrade path management and the implications for Operator compatibility and cluster monitoring interactions.
• Multi-cluster OpenShift topology: hub and spoke architectures, cross-cluster networking, and remote scrape or remote_write connectivity across cluster boundaries.
• Comfort designing for IPv6 and dual-stack network environments, which are increasingly common in carrier-grade infrastructure deployments.

GitOps and CI/CD Delivery
• GitOps-native delivery as a professional standard: all platform configuration managed in Git, no manual changes to production cluster state, and a clear promotion gate model from lab through to production.
• ArgoCD at production scale: application hierarchy design, sync policy configuration, health checks for custom resources, and multi-cluster application deployment.
• Kustomize overlay strategy for multi-cluster and multi-tenant deployments: base definitions with environment-specific patches.
• GitLab CI/CD pipeline design: manifest validation, environment promotion gates, and automated operator upgrade pipelines.
• Terraform or equivalent infrastructure-as-code for provisioning supporting platform resources.

Security and Identity
• HashiCorp Vault at production depth: dynamic secrets, Vault Secrets Operator synchronisation, token lifecycle management, and PKI secrets engine integration for certificate issuance.
• Enterprise PKI: TLS certificate lifecycle, automated renewal, and CA distribution to distributed cluster workloads.
• OIDC and OAuth2 integration: platform service authentication via an enterprise identity provider, service account token federation, and the elimination of static credential patterns.
• Zero Trust design as a default: every interface between platform components authenticated and encrypted; no implicit trust between tenants, ingestion sources, or query consumers.

Telecommunications and Network Observability
• Proven experience designing or operating observability platforms for telecommunications infrastructure: 5G core, RAN, transport, or carrier-grade edge environments.
• FCAPS framework alignment: mapping Fault, Configuration, Accounting, Performance, and Security monitoring requirements to metric taxonomies, alerting rules, and operational dashboards.
• Heterogeneous vendor telemetry integration: Prometheus exporter compatibility assessment, OpenMetrics format validation, and labelling standardization across multi-vendor sources.
• Multi-vendor, multi-tenant metrics ingestion design: label isolation strategy, per-vendor cardinality allocation, and data segregation enforcement at the proxy and routing layer.
• Enterprise NOC integration: alert routing design from evaluation engine through to ticketing or event management platforms, deduplication, suppression, and severity mapping.