Jobs via Dice

Sr Machine Learning Engineer - Synthetic Data & Document Understanding

⭐ - Featured Role | Apply direct with Data Freelance Hub
This role is for a Sr Machine Learning Engineer specializing in Synthetic Data & Document Understanding, located onsite in Austin, Texas. The contract runs for more than 6 months; the role focuses on generative modeling and data quality evaluation and requires strong Python skills.
🌎 - Country
United States
💱 - Currency
$ USD
💰 - Day rate
Unknown
🗓️ - Date
June 12, 2026
🕒 - Duration
More than 6 months
🏝️ - Location
On-site
📄 - Contract
Unknown
🔒 - Security
Unknown
📍 - Location detailed
Austin, TX
🧠 - Skills detailed
#Stories #Computer Science #Generative Models #Mathematics #Storage #Data Pipeline #ML (Machine Learning) #Leadership #Programming #Scala #Data Modeling #AI (Artificial Intelligence) #PyTorch #Cloud #Python #Data Quality
Role description
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Apex Systems, is seeking the following. Apply via Dice today!

Job#: 3036756

Job Description: Sr Machine Learning Engineer - Synthetic Data & Document Understanding
Location: Austin, Texas (Onsite)

Role Overview
We are seeking a Senior Machine Learning Engineer for Synthetic Data & Document Understanding to own the synthetic data generation track within a specialized Document AI Data team. This role focuses on building generative pipelines that produce high-quality, diverse, and realistic synthetic training data at scale. The position is suited for an engineer with deep generative modeling expertise combined with skills in data quality evaluation and production engineering.

Key Responsibilities
• Design and implement pipelines that analyze real documents to inform high-fidelity synthetic data generation.
• Build generative systems capable of producing documents across diverse formats, layouts, and domains.
• Develop evaluation frameworks to ensure synthetic data maintains distributional fidelity and diversity.
• Research and apply generative modeling techniques suited for document AI training.
• Identify and mitigate quality issues to ensure synthetic data is effective for downstream model training.
• Partner with modeling teams to measure the impact of synthetic data on model performance.
• Own the synthetic data generation track end-to-end, from architecture to quality validation.
• Drive architectural decisions balancing quality, diversity, scale, and cost efficiency.
• Define and maintain data quality metrics and generation dashboards.
• Build scalable pipelines capable of generating millions of synthetic training examples.
• Implement post-processing, filtering, and validation mechanisms to remove low-quality outputs.
• Collaborate with platform teams on compute orchestration, storage, and scheduling.

Required Qualifications

Education & Experience:
• MS or PhD in Computer Science, Engineering, Mathematics, or a related field.
• 5+ years of experience in Machine Learning / AI, with a focus on generative models, Vision-Language Models (VLMs), and synthetic data systems.
• Proven experience building and evaluating synthetic data pipelines for ML training.
• Strong background in data quality evaluation and statistical analysis.

Technical Expertise:
• Deep expertise in Vision-Language Models and document understanding (layout, structure, semantics).
• Strong knowledge of generative modeling for structured and semi-structured data.
• Understanding of what makes synthetic data valuable, including distributional fidelity, diversity, realistic noise patterns, and domain coverage.
• Strong programming skills in Python with experience in PyTorch or similar frameworks.
• Experience evaluating data quality via automated metrics and downstream model impact.
• Familiarity with large-scale data pipelines, cloud environments, and experiment tracking.

Leadership & Communication:
• Proven ability to independently own complex technical workstreams.
• Strong collaboration across data, modeling, and platform teams.
• Ability to clearly communicate data quality and generation trade-offs.
• A data-driven mindset with attention to coverage gaps and quality signals.

Everforth Apex is a world-class IT services company that serves thousands of clients across the globe. When you join Everforth Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.

Everforth Apex uses a virtual recruiter as part of the application process.
By applying for this job, you agree to receive calls, AI-generated calls, text messages, or emails from Everforth Apex, its affiliates, and contracted partners. Frequency varies for text messages. Message and data rates may apply. Carriers are not liable for delayed or undelivered messages. You can reply STOP to cancel and HELP for help. You can access our privacy policy at .

Everforth Apex Benefits Overview:
Everforth Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401(k) program that typically allows you to contribute within 30 days of starting, with a company match after 12 months of tenure. Everforth Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program, and other discounts.

In terms of professional development, Everforth Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and offers certification discounts and other perks with associations including CompTIA and IIBA. Everforth Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can also find a full list of our benefits, programs, support teams, and resources in our 'Welcome Packet', which an Everforth Apex team member can provide.

Everforth Apex Systems is an equal opportunity employer.
We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Everforth Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law.

If you require an accommodation under the Americans with Disabilities Act to participate in an interview with a virtual recruiter or to use our website for a search or application, please contact our Benefits Department at or . Please note that this contact information is strictly to be used for medical ADA accommodations and that no other inquiries will be answered.

UnitedHealthcare creates and publishes the Transparency in Coverage Machine-Readable Files on behalf of Everforth Apex Systems.