AI Systems Engineer
We are hiring an AI Systems Engineer to design, build, and deploy end-to-end AI systems across the organisation, from client-facing AI products to internal tools that support our data science, product, engineering, and revenue teams, all built on robust, scalable AWS infrastructure.
This is a hands-on role spanning AI system design, agent architectures, LLM engineering, cloud deployment, and AIOps/MLOps for production reliability. You will work on LLM applications, agentic workflows, RAG systems, ML pipelines, analytics automation, and microservices.
Key Responsibilities
• Design, build, and operate end-to-end AI/LLM systems, including chatbots, analytics assistants, automation tools, and decision-support services.
• Develop internal productivity and intelligence tools that accelerate workflows across data science, product, engineering, and revenue teams.
• Build autonomous AI agents and workflow orchestrators using frameworks such as LangChain, CrewAI, ADK, or equivalent systems.
• Design and implement LLM-backed microservices (FastAPI/Flask) for summarisation, intelligence, forecasting, data extraction, and API-driven reasoning.
• Build and operate full Retrieval-Augmented Generation (RAG) pipelines: ingestion → chunking → embeddings → indexing → retrieval → LLM reasoning.
• Optimise retrieval quality using metadata, hybrid search, chunking strategies, rerankers, and relevance tuning.
• Implement document classification, NER, entity extraction, and knowledge-graph-driven retrieval where appropriate.
• Establish reliability, safety, and governance guardrails across AI systems, including monitoring, error handling, tool-selection controls, and risk mitigation.
• Instrument, monitor, and evaluate AI and RAG systems using logging, metrics, tracing, agent telemetry, quality benchmarks, hallucination testing, and regression tests.
• Deploy and operate AI agents and LLM microservices on AWS (Bedrock, Lambda, ECS/EKS, API Gateway, S3, Secrets Manager, CloudWatch).
• Build and maintain production CI/CD pipelines (GitHub Actions), manage model/version lifecycles, and support retraining and automated evaluation workflows.
Required Skills & Qualifications
Key Skills
• Strong software engineering background with expertise in Python, including modular design, async programming, and modern development practices.
• Experience designing and building APIs and microservices (FastAPI / Flask) for production systems.
• Hands-on experience building and operating production LLM systems, including agentic workflows and RAG pipelines.
• Experience designing and operating RAG systems, including vector databases and retrieval pipelines.
• Experience designing and running LLM evaluations, including task-level metrics, hallucination testing, regression benchmarks, and golden datasets.
• Hands-on familiarity with one or more LLM observability and evaluation tools, such as OpenTelemetry, LangSmith, Weights & Biases, Arize/Phoenix, or equivalent in-house systems.
• Experience deploying and operating AI systems on AWS (Bedrock, EC2, Lambda, ECS/EKS, API Gateway, S3, CloudWatch), with a focus on reliability, security, and cost-aware production usage.
Nice-to-Haves
• Familiarity with Docker, Kubernetes, CI/CD, and continuous deployment in production environments.
• Experience with search and retrieval systems such as AWS Kendra, OpenSearch, Weaviate, Qdrant, or Pinecone.
• Ability to build simple internal-facing UIs or tools (React, Streamlit).
• Experience building reusable SDKs, internal AI platforms, or shared developer frameworks.
Who You Are
You have 5+ years of overall experience in software engineering/ML engineering, with at least 2 years building GenAI systems in production.
You ship real production systems, not just prototypes.
You operate at the intersection of AI, engineering, and operations.
You think in systems: reliability, observability, cost, and scale.
You work independently, own problems end-to-end, and simplify complexity.
You prioritise safety, interpretability, and security in every AI system you build.
- Department: Technology
- Locations: London, Manchester, Oxford, Remote, Dubai, New York, Buenos Aires, Indore
- Remote status: Fully Remote
About Oxford DataPlan
The Home of Alternative Data. We use alternative data and data science to deliver near real-time estimates of revenue and other key performance indicators for 200+ publicly listed companies globally, updated daily.