Jane Street

Jane Street Software Engineer Interview Questions

28+ questions from real Jane Street Software Engineer interviews, reported by candidates.

28
Questions
4
Round Types
8
Topic Areas
2016-2026
Year Range

Round Types

Phone 18 Phone Screen 3 Onsite 3 Take Home 1

Top Topics

Questions

Hi, I am interviewing for tdoe intern position at jane street later this week. any pointers on how can I do better and what should I expect in the interview

It's basically writing the algorithm on paper. The problem is essentially a rate-limiting problem. Given a list of IP access log lines with IP address and timestamp, flag the IP addresses that access

This post was last edited by sycstudent on 2025-09-26 12:41 Basic Information: Last year of PhD program in the Midwest (hopefully) I contacted a recruiter from JS during the conference and scheduled a

Last month, I had my first-round interview at Jane Street, similar to the first round for an SDE intern. The main question was a simulation: simulating a 2D chessboard with two types of pieces. The fi

I have an upcoming interview with Jane Street for their IT Operations internship, and I’m not sure what to expect. Most posts about Jane Street focus on their swe and quant roles, but I haven’t found

Has anyone here had an onsite interview at Jane Street for a software engineer position and can comment on the difficulty vs. other big tech companies? There's plenty of information on glassdoor about

Status: Graduating 2025 with MSci Computer Science Position: Graduate Trade Desk Operations Engineer Location: London Date: Jan - Ongoing 2025 I passed the take-home test and the mathematical problem solving interview, and now I...

## Problem Given a matrix of exchange rates between `n` currencies, detect whether an arbitrage opportunity exists — i.e., a cycle of exchanges starting and ending at the same currency that results in a profit. ```python def has_arbitrage(rates: list[list[float]]) -> bool: pass def find_arbitrage_cycle(rates: list[list[float]]) -> list[int] | None: # Return the sequence of currency indices forming the cycle, or None pass ``` ## Example ``` rates = [ # USD EUR GBP [1.0, 0.9, 0.75], # from USD [1.12, 1.0, 0.83], # from EUR [1.35, 1.22, 1.0 ], # from GBP ] # USD -> EUR -> GBP -> USD: # 1.0 * 0.9 * 0.83 * 1.35 = 1.008 > 1.0 -> arbitrage! has_arbitrage(rates) -> True find_arbitrage_cycle(rates) -> [0, 1, 2, 0] # USD->EUR->GBP->USD ``` ## Follow-ups 1. What graph algorithm detects this? Why does taking log of rates convert this to shortest-path? 2. What is the time complexity of your approach for `n` currencies? 3. In practice, bid/ask spreads and transaction fees eat into arbitrage. How do you model them? 4. How would you handle rates that change in real time?

## Problem Arithmetic coding compresses a message into a single number in `[0, 1)` based on character probabilities. Given a symbol frequency table and a message, implement an encoder and decoder. ```python class ArithmeticCoder: def __init__(self, frequencies: dict[str, int]): ... def encode(self, message: str) -> float: # returns a number in [0,1) def decode(self, encoded: float, length: int) -> str: ``` ## Example ``` frequencies = {"a": 3, "b": 2, "c": 1} # total = 6 # Probability ranges: # a: [0, 0.5) # b: [0.5, 0.833) # c: [0.833, 1.0) Encode "ab": Start: [0, 1) After 'a': [0, 0.5) After 'b': [0 + 0.5*0.5, 0 + 0.5*0.833) = [0.25, 0.4165) Output: any value in [0.25, 0.4165), e.g. 0.3 coder.decode(0.3, 2) -> "ab" ``` ## Follow-ups 1. Floating-point precision limits the message length. How does integer arithmetic coding solve this? 2. How does arithmetic coding compare to Huffman coding in compression ratio? 3. How do you handle symbols not seen in the frequency table (zero-probability problem)?

## Problem Parse and evaluate an arithmetic expression string, respecting operator precedence without a built-in eval. ## Likely LeetCode equivalent Similar to Basic Calculator II (LC 227) or Expression Evaluation variants. ## Tags stack, strings, math, jane_street

## Problem Build a JavaScript/TypeScript form validation library. Each field has one or more validation rules. Rules are composable — validators can be chained. When `validate()` is called, return all field errors. ```typescript type Rule = (value: string) => string | null; // null = pass, string = error message class FormValidator { addField(name: string, ...rules: Rule[]): this; validate(data: Record<string, string>): Record<string, string[]>; // field -> errors } // Built-in rules const required: Rule; const minLength = (n: number): Rule => ...; const matches = (pattern: RegExp, msg: string): Rule => ...; ``` ## Example ```typescript const validator = new FormValidator() .addField("email", required, matches(/@/, "must contain @")) .addField("password", required, minLength(8)); validator.validate({ email: "notanemail", password: "short" }); // -> { email: ["must contain @"], password: ["must be at least 8 characters"] } validator.validate({ email: "[email protected]", password: "securepass" }); // -> {} ``` ## Follow-ups 1. How do you support async validators (e.g., checking if a username is taken via API)? 2. How do you implement cross-field validation (e.g., "password" must match "confirm_password")? 3. How would you integrate this with React controlled components?

## Problem Implement a code folding engine for a source code editor. Given source code as a list of lines, detect foldable regions (blocks delimited by `{` and `}`, or by indentation for Python-style code), and support fold/unfold operations. ```python class CodeFolder: def __init__(self, lines: list[str], style: str = "brace"): # style: "brace"|"indent" def get_foldable_ranges(self) -> list[tuple[int, int]]: # (start_line, end_line) def fold(self, start_line: int) -> list[str]: # returns visible lines def unfold(self, start_line: int) -> list[str]: def fold_all(self) -> list[str]: ``` ## Example ``` lines = [ "function foo() {", # line 0 - foldable range (0, 3) " if (x) {", # line 1 - foldable range (1, 2) " return x;", # line 2 " }", # line 3 "}", # line 4 ] get_foldable_ranges() -> [(0, 4), (1, 3)] fold(0) -> ["function foo() { ... }", "}"] ``` ## Follow-ups 1. How do you handle mismatched braces in malformed source code? 2. What data structure efficiently tracks nested folds as the user types? 3. How would you extend this to support language-specific fold points (e.g., `if/else`, `try/catch`)?

## Round 1 - System Design ## Problem Design a multi-asset exchange platform where users can trade stocks, ETFs, and crypto. The system must route incoming orders to the correct order book per asset, match orders, manage user balances, and produce trade confirmations. ## Core Components - **Order Router**: parses incoming orders (`{user_id, asset, side, price, qty, type}`), routes to the correct order book instance. - **Order Book**: per-asset, price-time priority matching (same as a single matching engine). - **Balance Ledger**: tracks per-user per-asset holdings; validates sufficient balance before accepting an order. - **Trade Confirmation Service**: emits trade events to users via WebSocket. - **Risk Engine**: pre-trade checks (margin, position limits, wash-trade detection). ## Follow-ups 1. How do you prevent double-spend races when a user submits two orders simultaneously? 2. How do you design the balance ledger for consistency — is optimistic locking or two-phase commit better? 3. What is your failover strategy if one order book node crashes mid-match? 4. How do you replay all trades from a checkpoint to rebuild balances after a crash?

## Problem Design a `GamepadTracker` that records sequences of button/axis inputs with timestamps and can replay them to check whether a recorded combo was input within a timing window. Support: ```python class GamepadTracker: def record(self, input_event: dict) -> None: # event = {"type": "button"|"axis", "id": str, "value": float, "ts": int} def detect_combo( self, combo: list[dict], # sequence of events to match window_ms: int # all events must occur within this window ) -> bool: def get_recent(self, duration_ms: int) -> list[dict]: ``` ## Example ``` tracker.record({"type": "button", "id": "A", "value": 1, "ts": 1000}) tracker.record({"type": "button", "id": "B", "value": 1, "ts": 1050}) tracker.record({"type": "button", "id": "A", "value": 1, "ts": 1200}) combo = [{"id": "A"}, {"id": "B"}] tracker.detect_combo(combo, window_ms=200) # -> True (A at 1000, B at 1050: both within 200ms) ``` ## Follow-ups 1. How do you handle axis inputs where the value must exceed a threshold (e.g., joystick > 0.8)? 2. What data structure enables efficient sliding-window combo detection as new events arrive? 3. How would you extend this to support approximate matching (one missed input allowed)?

## Problem Design a game board that is conceptually infinite in all directions. Players and entities occupy cells on the grid. The board must support: ```python class InfiniteBoard: def place(self, entity_id: str, x: int, y: int) -> None: def move(self, entity_id: str, dx: int, dy: int) -> bool: # False if cell occupied def get_position(self, entity_id: str) -> tuple[int, int] | None: def get_entities_in_range(self, x: int, y: int, radius: int) -> list[str]: def get_neighbors(self, entity_id: str) -> list[str]: # adjacent 8 cells ``` ## Example ``` board.place("hero", 0, 0) board.place("goblin", 1, 0) board.move("hero", 1, 0) -> False # goblin is there board.move("goblin", 5, 5) board.move("hero", 1, 0) -> True # now clear board.get_entities_in_range(0, 0, 10) # -> ["hero", "goblin"] (goblin at (6,5) is within radius 10 of (0,0)) ``` ## Follow-ups 1. What data structure (hashmap, quadtree, spatial hash) is best for this use case? Under what entity densities? 2. How do you implement chunk-based loading as entities explore new areas? 3. How would you synchronize board state across multiple game server nodes?

## Round 1 - System Design ## Problem Design a live streaming platform (think Twitch/YouTube Live) supporting: - Streamers push video to an ingest server via RTMP - Viewers watch the stream with <5 second latency at any scale - Real-time chat: millions of viewers sending messages to the same stream - Live viewer count updated every second ## Key Design Areas - **Ingest and transcoding**: RTMP ingest -> transcode to HLS/DASH at multiple bitrates -> push to CDN edge - **Delivery**: adaptive bitrate over HLS; CDN handles viewer scale - **Chat**: fan-out problem — one message to N viewers. Use a pub-sub bus (Kafka/Redis pub-sub) per stream; clients subscribe via WebSocket to a chat gateway. - **Viewer count**: approximate real-time count using HyperLogLog or counter per server aggregated every second. ## Follow-ups 1. The stream suddenly gets 5M concurrent viewers — what breaks first and how do you recover? 2. How do you enforce chat moderation rules (rate limits, banned words) without adding latency? 3. How do you store the VOD (video on demand) after the stream ends? 4. How do you handle a streamer going live from a very low-bandwidth connection?

## Problem Implement a three-way merge: given a **base** version and two branches (`ours` and `theirs`), produce a merged result. If both branches changed the same lines differently, mark as a conflict. Changes to non-overlapping lines are merged automatically. ```python def three_way_merge( base: list[str], ours: list[str], theirs: list[str] ) -> list[str]: # Conflict regions marked as: # ["<<<<<<< ours", ...ours lines..., "=======", ...theirs lines..., ">>>>>>> theirs"] pass ``` ## Example ``` base = ["a", "b", "c", "d"] ours = ["a", "X", "c", "d"] # changed line 2: b->X theirs = ["a", "b", "c", "Y"] # changed line 4: d->Y three_way_merge(base, ours, theirs) # -> ["a", "X", "c", "Y"] (no conflict: different lines changed) ours2 = ["a", "X", "c", "d"] theirs2 = ["a", "Z", "c", "d"] # -> ["a", "<<<<<<< ours", "X", "=======", "Z", ">>>>>>> theirs", "c", "d"] ``` ## Follow-ups 1. What diff algorithm do you use as the foundation (LCS, Myers diff)? 2. How do you handle moves (a block deleted in one place and inserted elsewhere)? 3. How does this extend to structured formats like JSON or XML where line-level diffs lose semantics?

## Problem Design a multi-level or multi-tenant cache system with configurable eviction and access policies. ## Likely LeetCode equivalent Extends LRU Cache (LC 146) to multi-level design. ## Tags hash_table, design, jane_street, cache

## Problem You receive order book updates from two market data feeds (Feed A and Feed B) for the same instrument. Each update is `(timestamp, side, price, quantity)`. Implement a comparator that detects discrepancies: same-timestamp updates where the two feeds disagree on price levels or quantities. ```python class OrderBookComparator: def ingest_a(self, update: dict) -> None: # {"ts": int, "side": str, "price": float, "qty": float} def ingest_b(self, update: dict) -> None: def get_discrepancies( self, tolerance_ms: int = 50, qty_tolerance: float = 0.01 ) -> list[dict]: ``` ## Example ``` comp.ingest_a({"ts": 1000, "side": "bid", "price": 100.0, "qty": 10.0}) comp.ingest_b({"ts": 1000, "side": "bid", "price": 100.0, "qty": 9.5}) comp.get_discrepancies(qty_tolerance=0.1) # -> [{"ts": 1000, "side": "bid", "price": 100.0, "feed_a_qty": 10.0, "feed_b_qty": 9.5}] # qty diff = 0.5 > 0.1 tolerance ``` ## Follow-ups 1. Feed B often lags by 20-100ms — how do you match updates across feeds with timing skew? 2. What ring-buffer or sliding-window data structure best supports this real-time comparison? 3. How do you alert on systematic bias (Feed B is always lower) vs. random noise?

## Problem Implement a fixed-size circular ring buffer with enqueue, dequeue, and overflow handling. ## Likely LeetCode equivalent Similar to Design Circular Queue (LC 622). ## Tags queue, design, jane_street

See All 28 Jane Street Software Engineer Questions

Full question text, answer context, and frequency data for subscribers.

Get Access