Big Data AI Engineer

Singapore
Full Time
Entry Level

AI Engineer - Big Data

Employment Type: Full-Time
Location: Remote, Singapore
Level: Entry to Mid Level (PhD Required)

Bridge Cutting-Edge AI Research with Petabyte-Scale Data Systems

About the Role

Work at the intersection of big data and AI, where you'll develop intelligent, self-healing data systems processing trillions of data points daily. You'll have autonomy to pursue research in distributed ML systems and AI-enhanced data optimization, with your innovations deployed at unprecedented scale within months, not years.

This isn't traditional data engineering - you'll implement agentic AI for autonomous pipeline management, leverage LLMs for data quality assurance, and create ML-optimized architectures that redefine what's possible at petabyte scale.

Key Research Areas & Responsibilities

AI-Enhanced Data Infrastructure

  • Design intelligent pipelines with autonomous optimization and self-healing capabilities using agentic AI
  • Implement ML-driven anomaly detection for terabyte-scale datasets

Distributed Machine Learning at Scale

  • Build distributed ML pipelines
  • Develop real-time feature stores for billions of transactions
  • Optimize feature engineering with AutoML and neural architecture search

Required Qualifications

Education & Research

  • PhD in Computer Science, Data Science, or Distributed Systems (exceptional Master's with research experience considered)
  • Published research or expertise in distributed computing, ML infrastructure, or stream processing

Technical Expertise

  • Core Languages: Expert SQL (window functions, CTEs), Python (Pandas, Polars, PyArrow), Scala/Java
  • Big Data Stack: Spark 3.5+, Flink, Kafka, Ray, Dask
  • Storage & Orchestration: Delta Lake, Iceberg, Airflow, Dagster, Temporal
  • Cloud Platforms: GCP (BigQuery, Dataflow, Vertex AI), AWS (EMR, SageMaker), Azure (Databricks)
  • ML Systems: MLflow, Kubeflow, Feature Stores, Vector Databases, scikit-learn + search CV, H2O AutoML, auto-sklearn, GCP Vertex AI AutoML Tables
  • Neural Architecture Search: KerasTuner, AutoKeras, Ray Tune, Optuna, PyTorch Lightning + Hydra

Research Skills

  • Track record with 100TB+ datasets
  • Experience with lakehouse architectures, streaming ML, and graph processing at scale
  • Understanding of distributed systems theory and ML algorithm implementation

Preferred Qualifications

  • Experience applying LLMs to data engineering challenges
  • Ability to translate complex AutoML/NAS research into practical production workflows
  • Hands-on project examples of feature engineering automation or NAS experiments
  • Proven success in automating ML pipelines, from raw data to an optimized model architecture
  • Contributions to Apache projects (Spark, Flink, Kafka)
  • Knowledge of privacy-preserving techniques and data mesh architectures

What Makes This Role Unique

You'll work with one of the few truly petabyte-scale production datasets outside of major tech companies, with the freedom to experiment with cutting-edge approaches. Unlike traditional big data roles, you'll apply the latest AI research to fundamental data challenges - from using LLMs to understand data quality issues to implementing agentic systems that autonomously optimize and heal data pipelines.

About us

Pixalate is an online trust and safety platform that protects businesses, consumers and children from deceptive, fraudulent and non-compliant mobile apps, CTV apps and websites.

We're seeking a PhD-level AI Engineer to lead cutting-edge research in agentic AI systems, multimodal analysis, and advanced reasoning architectures that will directly impact millions of users worldwide. Our software and data have been used to unearth multiple high-profile criminal and illegal surveillance cases.

Our team of lawyers, data scientists, engineers, economists and researchers spans the globe, with a presence in California, New York, Washington DC, London and Singapore. Pixalate is an equal opportunity employer committed to building a diverse team.

Benefits 

At Pixalate, we offer an extremely competitive salary, outstanding benefits, and a dynamic work environment. You will have the opportunity to work on pioneering technologies alongside some of the brightest minds in the industry. If you're passionate about maintaining high software quality and thrive in a fast-paced, challenging environment, you'll fit right in.
  • Monthly internet reimbursement
  • Casual, remote work environment
  • Hybrid, flexible hours
  • Opportunity for advancement
  • Fun annual team events
  • Being part of a high performing team that wants to win and have fun doing it
We particularly encourage applications from underrepresented groups in AI research.
