Most of my experience so far has been with CPU compilers. I’m used to thinking about traditional compiler pipelines that start from source code, parse it into an AST, lower it to an intermediate representation (IR), optimize it, and then generate machine code. The work usually revolves around control flow, register allocation, and data dependencies. It’s all about turning human-written code into something that runs efficiently on a CPU.

But machine learning (ML) compilers operate in a completely different world. They don’t take a programming language as input. Instead, they start with computational graphs – dataflow representations of neural networks built in frameworks like PyTorch or TensorFlow. These compilers spend almost no effort on parsing syntax, and instead of resolving types the way a CPU compiler does, they reason about tensor shapes and element types. Their job is to optimize large graphs of tensor operations and then map those computations efficiently onto specialized hardware like GPUs, TPUs, or custom accelerators.
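To make that concrete, here is a sketch of what a tiny “model” looks like as a graph of tensor operations in MLIR notation. This is illustrative only – the function name and shapes are made up, and exact op spellings vary between MLIR releases:

```mlir
// A single dense layer, x·W + b, expressed as a dataflow graph of
// tensor ops rather than loops. Names and shapes are illustrative.
func.func @forward(%x: tensor<8x16xf32>, %w: tensor<16x4xf32>,
                   %b: tensor<8x4xf32>) -> tensor<8x4xf32> {
  %init = tensor.empty() : tensor<8x4xf32>
  // Matrix multiply as one high-level op, not a nested loop.
  %mm = linalg.matmul ins(%x, %w : tensor<8x16xf32>, tensor<16x4xf32>)
                      outs(%init : tensor<8x4xf32>) -> tensor<8x4xf32>
  // Elementwise bias add; arith ops apply elementwise over tensors.
  %y = arith.addf %mm, %b : tensor<8x4xf32>
  return %y : tensor<8x4xf32>
}
```

Notice there is no control flow and no memory here – just values flowing between ops. That is the starting point an ML compiler optimizes.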

In other words, while CPU compilers focus on instruction-level performance, ML compilers are about data movement, parallelism, and hardware scheduling. They fuse multiple operations to reduce memory transfers, reorder computations to better fit hardware pipelines, and automatically split workloads across different compute units. It’s still compiler work, but the scale and goals are completely different.
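As an illustration of fusion, consider an add followed by a ReLU. Unfused, each op is a separate pass over memory; fused, both run inside a single loop, so each element is loaded and stored once. The sketch below uses `linalg.generic`-style notation and is only schematic – the attribute syntax and op names (e.g. `arith.maximumf`) are assumptions that drift across MLIR versions:

```mlir
#id = affine_map<(d0) -> (d0)>

// Unfused: two full traversals of a 1024-element tensor.
//   %t = arith.addf %a, %b : tensor<1024xf32>
//   %r = arith.maximumf %t, %zeros : tensor<1024xf32>

// Fused: one linalg.generic whose body does both the add and the
// ReLU. %czero is assumed to be a previously defined f32 zero.
%r = linalg.generic
       {indexing_maps = [#id, #id, #id],
        iterator_types = ["parallel"]}
       ins(%a, %b : tensor<1024xf32>, tensor<1024xf32>)
       outs(%init : tensor<1024xf32>) {
  ^bb0(%x: f32, %y: f32, %acc: f32):
    %s   = arith.addf %x, %y : f32
    %max = arith.maximumf %s, %czero : f32
    linalg.yield %max : f32
} -> tensor<1024xf32>
```

On hardware where elementwise ops are memory-bound, eliminating that intermediate tensor is often worth more than any instruction-level trick.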

Why MLIR caught my attention

As I started reading about ML compilers, one acronym kept popping up: MLIR (Multi-Level Intermediate Representation). Originally developed at Google, MLIR is now used across the industry as the backbone for many ML compiler stacks. It’s designed around a simple but powerful idea – instead of having one monolithic IR, you can define dialects, each representing a specific level of abstraction.

That modular design is what makes MLIR so versatile. You can start with a high-level dialect that represents tensor operations, gradually lower it through more hardware-aware dialects, and eventually reach LLVM IR for code generation. It gives you fine-grained control over how computations evolve through the pipeline, which feels incredibly elegant if you come from a traditional compiler background.
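For a flavor of what “lowering through dialects” means, here is the same elementwise add at three abstraction levels. This is a hand-written schematic, not actual pass output – real lowerings introduce far more bookkeeping, and spellings differ by version:

```mlir
// Level 1: tensor-level op (value semantics; no loops, no memory).
%y = arith.addf %a, %b : tensor<128xf32>

// Level 2: after bufferization and lowering to loops, in the
// scf/memref dialects (%A, %B, %Y are buffers for %a, %b, %y).
scf.for %i = %c0 to %c128 step %c1 {
  %x0 = memref.load %A[%i] : memref<128xf32>
  %x1 = memref.load %B[%i] : memref<128xf32>
  %s  = arith.addf %x0, %x1 : f32
  memref.store %s, %Y[%i] : memref<128xf32>
}

// Level 3: eventually the llvm dialect, a near 1:1 mirror of LLVM IR,
// from which real LLVM IR is emitted for code generation.
```

Each level is a legal MLIR program; passes rewrite ops from one dialect into the next, which is exactly the fine-grained control described above.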

A lot of companies are now betting on MLIR. Google’s IREE uses it to compile ML models for a wide range of devices. AMD builds on MLIR in its ROCm and SHARK toolchains. Open-source projects lean on it too: Torch-MLIR bridges PyTorch with MLIR-based code generation, and Polygeist raises C/C++ into MLIR’s high-level dialects. Seeing that kind of adoption made it clear that MLIR wasn’t just an academic experiment – it’s becoming the standard foundation for modern ML compilers.

Learning the ropes

I started with Alex Singer’s beginner-friendly MLIR series – it’s a gentle on-ramp that explains the why before the how. It also forced me to brush up on a few things I’d taken for granted: what tensors actually are, how matrix multiplication works, what “tensor ops” look like in an IR, and how a computation graph flows.

The tricky part wasn’t finding material – it was deciding how deep to go without getting lost in the weeds or drifting away from the main goal. My aim here is a working, high-level understanding of ML concepts that feeds back into MLIR basics, not a full detour into ML theory.

That balance is risky – if later sections stop making sense, it’s hard to pinpoint which gap is to blame – which is why I’m documenting the journey so other compiler folks can calibrate what to learn and how far to learn it.

Keeping the study loop tight

  • Refresh the math just enough to move forward. For matrix multiplication, I revisited dimensions, row-by-column rules, and why fusion matters for data movement. Short, visual explainers like 3Blue1Brown’s Essence of Linear Algebra series help me build intuition quickly without going overboard.

  • Anchor ML concepts to compiler thinking. When I learn about tensors or operations, I always ask, “how does this appear in a dialect or a pass?” That mindset keeps me grounded and helps connect new ML ideas to familiar compiler concepts.

  • Practice on real MLIR examples early. I’m still working through the Toy Tutorial from the official MLIR documentation. It walks you through building a tiny language and gradually lowering it through multiple IR levels. It’s hands-on, and seeing those abstractions turn into actual transformations makes a big difference.
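For the matrix-multiplication refresher in the first bullet, the whole operation is one summation – and that formula is also why fusion and loop ordering matter, since every output element touches a full row of one input and a full column of the other:

```latex
% Product of A (m x k) and B (k x n): each output element is a
% row-by-column dot product.
C_{ij} = \sum_{p=1}^{k} A_{ip} \, B_{pj},
\qquad 1 \le i \le m,\; 1 \le j \le n
```

Three nested loops over i, j, and p implement this directly; how a compiler tiles and reorders those loops largely determines how much data moves through the memory hierarchy.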

Resources that helped

I’ll add more to this list as I go.

  • MLIR Toy Tutorial (official docs) – a step-by-step guide that turns abstract MLIR concepts into tangible code.
  • Alex Singer’s MLIR Series – a beginner-friendly walkthrough of MLIR concepts with short, practical examples.
  • 3Blue1Brown’s Essence of Linear Algebra playlist – perfect for refreshing linear algebra intuition and visualizing matrix operations.
  • Tensors Explained Simply (StatQuest) – a straightforward video on what tensors are and how they’re used in ML.
  • LLVM Discourse: MLIR Category – a helpful community forum where people share discussions, design questions, and beginner tips.

What’s next

For now, my goal is to get fully comfortable with MLIR’s core concepts – dialects, passes, and transformations – and then explore how frameworks like PyDSL and IREE use it in practice. Coming from CPU compilers, MLIR feels familiar in structure but different in spirit. It’s still about turning something high-level into something efficient, but the “something” here is no longer a program – it’s a computation graph that needs to be scheduled and optimized across real hardware.

It’s been refreshing to learn something that connects compiler design with the world of AI systems. If you’re also a compiler person curious about MLIR, I’d say start with the Toy tutorial and the docs, ask questions in the community, and just start experimenting. It’s one of the best ways to bridge the gap between traditional compiler work and the future of machine learning infrastructure.