
OpenAI Machine Learning Engineer Interview Questions

8+ questions from real OpenAI Machine Learning Engineer interviews, reported by candidates.

8 Questions · 2 Round Types · 4 Topic Areas · Year Range: 2025

Round Types

Phone: 1 · Coding: 1

Questions

I interviewed for an ML infrastructure position. The task was to design a memory allocator manager with a total capacity of N. It included functions like `allocate()` and `free()`. I initially implemented…
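Only the interface (`allocate()`, `free()`, capacity N) is reported, so the rest is an assumption. A minimal sketch of one plausible shape for this question, using a first-fit strategy over contiguous offsets (handle-based bookkeeping and the first-fit policy are our choices, not the reported spec):

```python
class MemAllocator:
    """Illustrative first-fit allocator over a fixed capacity of n units."""

    def __init__(self, n: int):
        self.n = n
        self.allocs = {}      # handle -> (offset, size)
        self.next_handle = 0

    def allocate(self, size: int):
        # Scan gaps between existing blocks in offset order; take the first fit.
        offset = 0
        for start, length in sorted(self.allocs.values()):
            if start - offset >= size:
                break
            offset = start + length
        if offset + size > self.n:
            return None  # no contiguous region large enough
        self.next_handle += 1
        self.allocs[self.next_handle] = (offset, size)
        return self.next_handle

    def free(self, handle: int) -> bool:
        # Returns False for unknown or already-freed handles.
        return self.allocs.pop(handle, None) is not None
```

A naive version that only tracks total bytes used fails here: after mixed `allocate`/`free` calls, free space can be fragmented, so a request can fail even when the total free space would cover it.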

The Challenge: This is a system design question for Machine Learning Engineers. We do not have the full details, but here is the core task: * **Find novel data:** You need to extract new, unique information…

The Challenge: You need to build a smart chatbot. This system uses Retrieval-Augmented Generation (RAG) to answer questions from users. Think of it like a business tool (such as Glean) that searches…
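The preview stops short of the requirements, but the retrieve-then-generate flow itself is standard. A toy sketch of that flow, with a bag-of-words cosine similarity standing in for a real embedding model (all function names and the prompt format here are our invention):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list, k: int = 2) -> list:
    # Rank documents by similarity to the query; keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list) -> str:
    # Ground the model by pasting retrieved context above the question.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In an interview you would swap `embed` for a real embedding model and back `retrieve` with a vector index; the grounding pattern (retrieved context prepended to the question) stays the same.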

The Challenge: This question asks Machine Learning Engineers to fix a broken **Transformer** model. The code contains **4 bugs**. Comments in the file point out exactly where the mistakes are. Your…

Vectorized 1-NN and Neural Network Forward Pass — Problem Overview: This Machine Learning coding interview has been reported as a **two-part question**: 1. Implement **1-nearest-neighbor (1-NN)** using…

Data Labeling Task Scheduler — Problem Overview: This OpenAI Machine Learning Engineer question was reported as a **two-part coding problem** about constructing a schedule for data labeling work. …

**Coding: GPU Credit Problem** — Problem Statement: The challenge involves managing GPU credits that have specific start times and expiration times. A common mistake is aggregating total credits first…
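The full spec isn't reported, but the quoted pitfall can be illustrated: a credit grant is only usable inside its [start, expiry) window, so availability must be computed per time point rather than from a grand total. A minimal sketch (the grant tuple layout and query interface are assumptions):

```python
def credits_available(grants, t):
    """Credits usable at time t.

    grants: list of (amount, start, expiry); a grant counts only while
    start <= t < expiry. Summing all amounts up front ignores the windows
    and overstates what is actually spendable at any given moment.
    """
    return sum(amount for amount, start, expiry in grants if start <= t < expiry)
```

A fuller version of the problem likely also deducts spending, in which case a greedy that spends from the earliest-expiring active grant first is the standard move; that refinement is omitted here since the reported statement is incomplete.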

## Round 1 - MLE Coding

## Problem

Implement the backward pass (backpropagation) for a single fully-connected layer with ReLU activation. You are given the gradient flowing in from the next layer and must compute the gradients with respect to the layer's weights, biases, and input.

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    ...

def relu_grad(x: np.ndarray) -> np.ndarray:
    ...  # 1 where x > 0, else 0

def linear_backward(
    dout: np.ndarray,            # gradient from upstream, shape (batch, out_features)
    x: np.ndarray,               # input to this layer, shape (batch, in_features)
    W: np.ndarray,               # weights, shape (in_features, out_features)
    b: np.ndarray,               # biases, shape (out_features,)
    pre_activation: np.ndarray,  # Wx + b before ReLU, shape (batch, out_features)
) -> tuple:
    # Returns (dx, dW, db)
    ...
```

```
Shapes check:
dout:           (32, 128)
x:              (32, 256)
W:              (256, 128)
pre_activation: (32, 128)
dx: (32, 256)   <- gradient w.r.t. input
dW: (256, 128)  <- gradient w.r.t. weights
db: (128,)      <- gradient w.r.t. bias
```

## Follow-ups

1. What does multiplying `dout` by `relu_grad(pre_activation)` represent in the chain rule?
2. Why is `dW = x.T @ delta` and not `delta @ x.T`?
3. How would the backward pass change for sigmoid activation instead of ReLU?
4. How would you numerically verify your gradient implementation?
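For study purposes, here is one way the solution is commonly written (our sketch, not the reported official answer): gate the upstream gradient through the ReLU mask, then apply the standard linear-layer gradient formulas, letting the shape check above dictate the order of each matrix product.

```python
import numpy as np

def linear_backward(dout, x, W, b, pre_activation):
    # Chain rule through ReLU: zero the gradient wherever the
    # pre-activation was <= 0 (this answers follow-up 1).
    delta = dout * (pre_activation > 0)
    dW = x.T @ delta        # (in, batch) @ (batch, out) -> (in, out), matches W
    db = delta.sum(axis=0)  # bias is broadcast over the batch, so sum it back
    dx = delta @ W.T        # (batch, out) @ (out, in) -> (batch, in), matches x
    return dx, dW, db
```

For follow-up 4, a central finite difference works well: perturb one entry of `W` by ±eps, recompute the scalar loss, and compare `(L_plus - L_minus) / (2 * eps)` against the matching entry of `dW`. Because ReLU is piecewise linear, the central difference is exact as long as no pre-activation crosses the kink at zero.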
