

Data Scientist (AI/ML Architect)
Featured Role | Apply direct with Data Freelance Hub
This role is for a Data Scientist (AI/ML Architect) on a contract of more than 6 months; the pay rate is not specified. It requires expertise in Python, deep learning frameworks, and distributed systems, with experience in MLOps and model monitoring. The location is hybrid, in Jersey City, NJ, or Atlanta, GA.
Country: United States
Currency: $ USD
Day rate: -
Date discovered: June 10, 2025
Project duration: More than 6 months
Location type: Hybrid
Contract type: Unknown
Security clearance: Unknown
Location detailed: Atlanta, GA
Skills detailed: #AI (Artificial Intelligence) #Monitoring #Storage #Spark (Apache Spark) #Data Pipeline #PyTorch #Debugging #Scala #Data Orchestration #AWS (Amazon Web Services) #HDFS (Hadoop Distributed File System) #MLflow #SciPy #Leadership #Python #Reinforcement Learning #ML (Machine Learning) #Databases #Kubernetes #Cloud #Data Science #Deep Learning #Docker #DevOps #GCP (Google Cloud Platform) #Airflow #Azure #TensorFlow #Delta Lake #NumPy #Libraries #Deployment #Computer Science #Data Engineering #Data Storage #Apache Airflow #S3 (Amazon Simple Storage Service) #Observability #Distributed Computing
Role description
Job Title: Data Scientist (AI/ML Architect) – Distributed Systems & MLOps
Location: Jersey City, NJ / Atlanta, GA (Hybrid/Remote)
Job Type: Full-Time | [Contract / Permanent]
Job Summary:
We are seeking a highly experienced AI/ML Architect to lead the design and implementation of scalable, distributed, and production-ready ML systems. The ideal candidate will possess deep expertise in machine learning, deep learning, MLOps, and distributed computing, with hands-on experience in both research and engineering. You will architect and build the infrastructure required to train, deploy, and monitor high-performance ML models at scale.
Key Responsibilities:
• Architect and implement distributed training strategies using frameworks like Horovod, DeepSpeed, or Ray Train.
• Design, deploy, and manage ML model serving with frameworks such as TensorFlow Serving, TorchServe, and Seldon Core.
• Utilize Docker and Kubernetes for containerization and orchestration of scalable AI workloads.
• Establish CI/CD pipelines for ML systems using MLOps practices and tools (e.g., Kubeflow, MLflow, Airflow).
• Implement model monitoring, data drift detection, and performance alerting systems in production (a minimal drift-check sketch follows this list).
• Optimize model inference pipelines for low-latency, high-throughput serving, including model quantization or hardware-specific tuning.
• Integrate with distributed data systems (e.g., S3, HDFS, Delta Lake) and vector databases (e.g., Pinecone, FAISS, Weaviate).
• Build and extend internal AI/ML tooling and platforms for experimentation and reproducibility.
• Debug and resolve complex issues in distributed AI/ML workflows, at both the infrastructure and algorithmic levels.
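As a concrete illustration of the drift-detection responsibility above, here is a minimal sketch using SciPy's two-sample Kolmogorov-Smirnov test; the feature, the 0.05 threshold, and the alerting behaviour are hypothetical placeholders, and a production system would typically feed a dedicated monitoring and alerting stack rather than printing.

```python
# Minimal data-drift check: compare live feature values against a reference
# sample captured at training time, using a two-sample Kolmogorov-Smirnov test.
# The feature, threshold, and alert behaviour below are illustrative only.
import numpy as np
from scipy.stats import ks_2samp

def drift_detected(reference: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the live sample's distribution differs significantly."""
    _, p_value = ks_2samp(reference, live)
    return p_value < alpha

# Hypothetical usage: a numeric feature logged at training time vs. in serving.
reference_values = np.random.normal(loc=100.0, scale=10.0, size=5_000)
live_values = np.random.normal(loc=115.0, scale=12.0, size=1_000)

if drift_detected(reference_values, live_values):
    # In production this would emit a metric or page an on-call channel.
    print("Data drift detected: live feature distribution has shifted")
```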
Required Skills & Qualifications:
• 8+ years of experience in Machine Learning Engineering or AI Infrastructure roles.
• Deep understanding of Supervised, Unsupervised, Deep Learning, and Reinforcement Learning paradigms.
• Expert proficiency in Python and scientific computing libraries (e.g., NumPy, SciPy).
• Hands-on experience with TensorFlow, PyTorch, and their associated ecosystems.
• Strong experience with distributed training frameworks (Horovod, DeepSpeed, Ray, Dask).
• Proficiency with containerization and orchestration tools (Docker, Kubernetes).
• Familiarity with ML model serving frameworks (TF Serving, TorchServe, Seldon Core).
• Solid experience with data orchestration tools such as Apache Airflow and Kubeflow Pipelines (a minimal Airflow sketch follows this list).
• Understanding of feature stores (e.g., Feast, Tecton).
• Knowledge of serialization formats such as Parquet, Avro, and Protocol Buffers.
• Strong analytical and debugging skills for AI/ML pipeline optimization.
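To make the orchestration skill concrete, here is a minimal Apache Airflow sketch of a daily retraining pipeline, assuming Airflow 2.4+ (for the `schedule` argument); the DAG id, task names, and placeholder callables are hypothetical stand-ins for real feature-extraction, training, and validation steps.

```python
# Minimal Airflow DAG sketch: extract features, train, then validate, daily.
# The DAG id, schedule, and placeholder callables are illustrative only.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_features():
    print("pull raw data and materialize the feature table")

def train_model():
    print("fit the model on the latest features")

def validate_model():
    print("check metrics before promoting the new model")

with DAG(
    dag_id="daily_model_retrain",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_features", python_callable=extract_features)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    validate = PythonOperator(task_id="validate_model", python_callable=validate_model)

    extract >> train >> validate  # linear dependency chain
```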
Preferred Qualifications:
• Master's or PhD in Computer Science, Machine Learning, or related field.
• Experience with LLMs, Gen-AI, or multimodal models.
• Familiarity with multi-cloud AI deployments (AWS/GCP/Azure).
• Exposure to AIOps, observability tools, and real-time feature pipelines.
• Knowledge of GPU acceleration, multi-node training, and inference optimization (ONNX, TensorRT).
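As one example of the inference-optimization point above, here is a minimal sketch of exporting a PyTorch model to ONNX so it can be served by an optimized runtime such as ONNX Runtime or TensorRT; the toy architecture, input shape, and output path are hypothetical.

```python
# Minimal sketch: export a small PyTorch model to ONNX for optimized serving.
# The toy architecture, input shape, and file name are illustrative only.
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 32, num_classes: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TinyClassifier().eval()
dummy_input = torch.randn(1, 32)  # batch of one, 32 features

torch.onnx.export(
    model,
    dummy_input,
    "tiny_classifier.onnx",
    input_names=["features"],
    output_names=["logits"],
    dynamic_axes={"features": {0: "batch"}, "logits": {0: "batch"}},
)
```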
Soft Skills:
• Strong leadership and architectural thinking
• Excellent collaboration with cross-functional teams (data engineering, DevOps, product)
• Effective communicator, both technically and strategically