Jump Trading Software Engineer Interview Questions

6+ questions from real Jump Trading Software Engineer interviews, reported by candidates.

Questions: 6 | Round Types: 3 | Topic Areas: 4 | Year Range: 2025

Round Types

OA: 3, Phone: 2, Phone Screen: 1

Questions

Jump Trading Quant Dev Interview Experience: a debugging question, and a somewhat strange one. See the screenshot. There are 12 test cases in total. Good luck! Consider a w

## Problem

You are given a hand of cards. Each card has three attributes: `color` (red/green/blue), `shape` (oval/diamond/squiggle), and `count` (1/2/3). A valid "set" is any group of 3 cards where, for each attribute, the values across the 3 cards are either all the same or all different. Given a list of cards, find all valid sets in the hand.

```python
from dataclasses import dataclass

@dataclass
class Card:
    color: str
    shape: str
    count: int

def find_sets(hand: list[Card]) -> list[tuple[Card, Card, Card]]:
    ...
```

```
Cards: A=(red, oval, 1)  B=(green, oval, 2)  C=(blue, oval, 3)  D=(red, diamond, 2)

(A,B,C): color=all diff, shape=all same, count=all diff -> VALID SET
(A,B,D): color=2 same 1 diff, shape=2 same 1 diff -> INVALID

Output: [(A,B,C)]
```

## Follow-ups

1. Your brute-force is O(n^3). For a 12-card layout, how many triples are there? Is O(n^3) acceptable in practice?
2. How would you detect if a hand has NO valid set (used in the game to trigger a redeal)?
3. Add a 4th attribute `fill` (solid/striped/empty). How does this change your validity check?
4. Given that each attribute has 3 values and there are 4 attributes, what is the maximum deck size, and what is the expected number of sets in a random 12-card deal?
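One way to approach the brute-force version is sketched below. It assumes the `Card` dataclass from the problem statement (made `frozen` here so cards are hashable, which the original does not require) and introduces a hypothetical helper, `is_set`, that is not part of the given stub. The check relies on the observation that three values are neither all the same nor all different exactly when they contain two distinct values.

```python
from dataclasses import dataclass
from itertools import combinations
from math import comb


@dataclass(frozen=True)
class Card:
    color: str
    shape: str
    count: int


# Attribute names checked by is_set; adding "fill" here would cover follow-up 3.
ATTRIBUTES = ("color", "shape", "count")


def is_set(a: Card, b: Card, c: Card) -> bool:
    """Hypothetical helper: True if every attribute is all-same or all-different."""
    for attr in ATTRIBUTES:
        distinct = {getattr(a, attr), getattr(b, attr), getattr(c, attr)}
        # 1 distinct value = all same, 3 = all different, 2 = invalid.
        if len(distinct) == 2:
            return False
    return True


def find_sets(hand: list[Card]) -> list[tuple[Card, Card, Card]]:
    """Brute force over every unordered triple: O(n^3) validity checks."""
    return [triple for triple in combinations(hand, 3) if is_set(*triple)]


A = Card("red", "oval", 1)
B = Card("green", "oval", 2)
C = Card("blue", "oval", 3)
D = Card("red", "diamond", 2)

sets_found = find_sets([A, B, C, D])

# Number of triples the brute force examines for a 12-card layout.
triples_in_12 = comb(12, 3)
```

For a 12-card layout the brute force examines only `comb(12, 3)` = 220 triples, which suggests the cubic cost is unlikely to matter at game-sized inputs. Detecting a hand with no valid set reduces to checking whether `find_sets` returns an empty list, or stopping at the first triple that passes `is_set`.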

## Round 1 - SQL

## Problem

You have a table `file_executions` that logs every time a binary is run on a system:

```sql
CREATE TABLE file_executions (
    exec_id BIGINT PRIMARY KEY,
    filename VARCHAR(255),
    user_id INT,
    run_at TIMESTAMP,
    duration_ms INT,
    exit_code INT
);
```

**Q1:** Find the top 5 most frequently executed files in the last 30 days.

**Q2:** For each user, find the file they ran most often, and how many times.

**Q3:** Find files that were run by more than 10 distinct users but had a non-zero exit code (failure) on more than 50% of executions.

**Q4:** Compute the 7-day rolling average execution duration per file.

## Follow-ups

1. Q2 requires per-user ranking: write it with `ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY run_count DESC)`.
2. Q3's 50% failure filter: should you compute failure rate before or after the distinct-user filter? Why does order matter?
3. For Q4, the rolling window must only include rows within 7 days of each row's `run_at`. Write the `RANGE BETWEEN` clause.
4. If this table has 500 million rows, which indexes would you add and why?
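The per-user ranking from follow-up 1 can be tried locally with Python's built-in `sqlite3` module. The sketch below is a hedged illustration, not the expected interview answer: the sample rows are invented, the CTE names `counts` and `ranked` are arbitrary, and the `filename` tie-breaker in the `ORDER BY` is an assumption added so that ties resolve deterministically. Window functions require SQLite 3.25 or newer, and other databases may need small syntax changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE file_executions (
        exec_id BIGINT PRIMARY KEY,
        filename VARCHAR(255),
        user_id INT,
        run_at TIMESTAMP,
        duration_ms INT,
        exit_code INT
    )
    """
)

# Invented sample data: user 1 runs a.bin twice and b.bin once; user 2 runs b.bin once.
sample_rows = [
    (1, "a.bin", 1, "2025-01-01 10:00:00", 120, 0),
    (2, "a.bin", 1, "2025-01-02 10:00:00", 100, 0),
    (3, "b.bin", 1, "2025-01-03 10:00:00", 300, 1),
    (4, "b.bin", 2, "2025-01-03 11:00:00", 310, 0),
]
conn.executemany("INSERT INTO file_executions VALUES (?, ?, ?, ?, ?, ?)", sample_rows)

# Q2: aggregate run counts per (user, file), rank within each user, keep rank 1.
TOP_FILE_PER_USER = """
    WITH counts AS (
        SELECT user_id, filename, COUNT(*) AS run_count
        FROM file_executions
        GROUP BY user_id, filename
    ),
    ranked AS (
        SELECT user_id, filename, run_count,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id
                   ORDER BY run_count DESC, filename  -- tie-breaker is an assumption
               ) AS rn
        FROM counts
    )
    SELECT user_id, filename, run_count
    FROM ranked
    WHERE rn = 1
    ORDER BY user_id
"""

top_files = conn.execute(TOP_FILE_PER_USER).fetchall()
```

`ROW_NUMBER()` returns exactly one row per user even when two files tie. If ties should all be reported, `RANK()` is the usual alternative.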

## Problem

A set of conceptual questions covering networking and operating systems fundamentals, as asked in onsite and phone screens.

**Networking:**

- Walk through the full lifecycle of an HTTP request from typing a URL to receiving HTML. Name every protocol involved.
- What happens during a TCP 3-way handshake? What state is the server socket in before and after?
- Explain the difference between TCP and UDP. When would you choose UDP for a production system?
- What is a TIME_WAIT state and why does it exist? How can it cause port exhaustion?

**Operating Systems:**

- What is the difference between a process and a thread? What is shared vs isolated between threads in the same process?
- Describe virtual memory. Why can two processes both "have" address 0x1000 without conflicting?
- What is a context switch, what triggers it, and what does the OS save/restore?
- Explain the difference between a mutex and a semaphore. Give a scenario where each is the right choice.

## Follow-ups

1. If `ping` succeeds but HTTP fails to the same host, what are the possible causes?
2. You have a program that runs fine single-threaded but crashes intermittently with multiple threads. What tools and techniques do you use to diagnose it?
3. Explain copy-on-write (COW) in the context of `fork()`. Why is it an optimization?
4. What is a page fault, and when is it a problem vs expected behavior?
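For the mutex versus semaphore question, a small Python sketch can make the two scenarios concrete. Everything here is illustrative: the names `add_many`, `use_slot`, `run_all`, and `POOL_SIZE` are hypothetical, and the "pool" is simulated with a short sleep. A `threading.Lock` plays the role of a mutex guarding a shared counter, while a `threading.Semaphore` caps how many threads may hold a slot at once.

```python
import threading
import time

# Scenario 1: a mutex protects a read-modify-write on shared state.
counter = 0
counter_lock = threading.Lock()


def add_many(n: int) -> None:
    global counter
    for _ in range(n):
        with counter_lock:  # only one thread may update the counter at a time
            counter += 1


# Scenario 2: a counting semaphore limits concurrent use of a pool of resources.
POOL_SIZE = 2
slots = threading.Semaphore(POOL_SIZE)
stats_lock = threading.Lock()
active = 0
max_active = 0


def use_slot() -> None:
    global active, max_active
    with slots:  # blocks while POOL_SIZE threads already hold a slot
        with stats_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.01)  # stand-in for work done while holding the resource
        with stats_lock:
            active -= 1


def run_all(target, args: tuple, count: int) -> None:
    """Start `count` threads running `target` and wait for all of them."""
    threads = [threading.Thread(target=target, args=args) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run_all(add_many, (10_000,), 4)
run_all(use_slot, (), 6)
```

The lock enforces exclusive ownership of one piece of state, whereas the semaphore only counts available permits and lets up to `POOL_SIZE` threads proceed together.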

## Problem

Implement a symbol tracker for a simple programming language. Given a list of events (declarations and usages), answer queries about symbol visibility and usage.

Events: `DECLARE(scope, name, type)`, `USE(scope, name)`, `ENTER_SCOPE(scope_id)`, `EXIT_SCOPE(scope_id)`. Scopes are nested; inner scopes can see outer scope symbols. A symbol declared in an inner scope shadows an outer one.

```python
class SymbolTracker:
    def declare(self, scope: str, name: str, sym_type: str) -> None:
        ...

    def use(self, scope: str, name: str) -> str | None:
        """Return type of visible symbol, or None if undeclared."""
        ...

    def get_unused_symbols(self) -> list[str]:
        """Return names declared but never used."""
        ...

    def get_shadowed_symbols(self) -> list[str]:
        """Return names that shadow an outer declaration."""
        ...
```

```
ENTER outer
DECLARE outer x int
ENTER inner
DECLARE inner x str   <- shadows outer x
USE inner x           -> "str"
EXIT inner
USE outer x           -> "int"
get_shadowed_symbols() -> ["x"]
```

## Follow-ups

1. How do you implement scope lookup (walk up the scope chain)? What data structure represents the chain?
2. What if a symbol can be declared after it's used in the same scope (hoisting, like JS `var`)? How does this change your tracker?
3. How would you extend this to report the exact file/line of each declaration and usage?
4. Describe how a real compiler's symbol table differs from your implementation.
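A minimal sketch of the tracker follows, under several assumptions that the problem statement leaves open. The `enter_scope` and `exit_scope` methods and the private `_resolve` helper are not in the given stub and are introduced here to handle the `ENTER_SCOPE` and `EXIT_SCOPE` events. Each scope records a parent pointer to the scope that was innermost when it was entered, so lookup walks a parent chain. A "use" is credited to the nearest visible declaration.

```python
from __future__ import annotations


class SymbolTracker:
    def __init__(self) -> None:
        self._parent: dict[str, str | None] = {}        # scope -> enclosing scope
        self._symbols: dict[str, dict[str, str]] = {}   # scope -> {name: type}
        self._stack: list[str] = []                     # currently open scopes
        self._used: set[tuple[str, str]] = set()        # (declaring scope, name)
        self._shadowing: list[str] = []                 # names that shadow an outer one

    def enter_scope(self, scope_id: str) -> None:
        # Assumed event handler: the new scope nests inside the innermost open scope.
        self._parent[scope_id] = self._stack[-1] if self._stack else None
        self._symbols.setdefault(scope_id, {})
        self._stack.append(scope_id)

    def exit_scope(self, scope_id: str) -> None:
        if not self._stack or self._stack[-1] != scope_id:
            raise ValueError(f"scope {scope_id!r} is not the innermost open scope")
        self._stack.pop()

    def _resolve(self, scope: str | None, name: str) -> str | None:
        """Walk outward from `scope`; return the nearest scope declaring `name`."""
        current = scope
        while current is not None:
            if name in self._symbols.get(current, {}):
                return current
            current = self._parent.get(current)
        return None

    def declare(self, scope: str, name: str, sym_type: str) -> None:
        # Shadowing: the same name is already visible from an enclosing scope.
        if self._resolve(self._parent.get(scope), name) is not None:
            if name not in self._shadowing:
                self._shadowing.append(name)
        self._symbols.setdefault(scope, {})[name] = sym_type

    def use(self, scope: str, name: str) -> str | None:
        owner = self._resolve(scope, name)
        if owner is None:
            return None
        self._used.add((owner, name))
        return self._symbols[owner][name]

    def get_unused_symbols(self) -> list[str]:
        return [
            name
            for scope, names in self._symbols.items()
            for name in names
            if (scope, name) not in self._used
        ]

    def get_shadowed_symbols(self) -> list[str]:
        return list(self._shadowing)


tracker = SymbolTracker()
tracker.enter_scope("outer")
tracker.declare("outer", "x", "int")
tracker.declare("outer", "y", "float")   # extra symbol, added to exercise get_unused_symbols
tracker.enter_scope("inner")
tracker.declare("inner", "x", "str")
inner_type = tracker.use("inner", "x")
tracker.exit_scope("inner")
outer_type = tracker.use("outer", "x")
```

The parent-pointer map is one possible representation of the scope chain. A plain stack of dictionaries would also work if lookups only ever happen in the innermost open scope.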

## Problem

Implement a `POST /users` endpoint for user registration. The request body is JSON. Your implementation must:

1. Validate required fields: `email` (valid format), `password` (min 8 chars, at least one digit), `username` (alphanumeric, 3-20 chars).
2. Check that the email is not already registered (query a mock DB).
3. Hash the password before storing (do not store plaintext).
4. Return `201 Created` with `{"id": ..., "email": ..., "username": ...}` on success, or `400 Bad Request` with validation errors, or `409 Conflict` if email exists.

```python
from flask import Flask, request, jsonify
import hashlib, re

app = Flask(__name__)
USERS_DB = {}

@app.route('/users', methods=['POST'])
def create_user():
    data = request.get_json()
    # Validate, check duplicate, hash, store, respond
    ...
```

```
POST /users {"email":"[email protected]","password":"pass1234","username":"alice"}
-> 201 {"id":"uuid","email":"[email protected]","username":"alice"}

POST /users {"email":"[email protected]", ...} (duplicate)
-> 409 {"error":"email already registered"}
```

## Follow-ups

1. `hashlib.md5` is not suitable for passwords. What should you use instead and why (bcrypt/argon2)?
2. How do you return multiple validation errors in one response rather than stopping at the first?
3. Rate-limit this endpoint to 5 requests per IP per minute. Where does that logic live?
4. How would you write an integration test for this endpoint that covers the happy path and both error cases?
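The validation and hashing steps can be sketched as framework-independent helpers, which also makes them easy to unit-test. The function names `validate_registration`, `hash_password`, and `verify_password` are hypothetical, the email regex is a deliberately loose format check, and the error messages are invented. The problem's follow-up points to bcrypt or argon2 for hashing. Those are third-party packages, so this sketch substitutes the standard library's `hashlib.pbkdf2_hmac` as a salted, deliberately slow stand-in, and the iteration count is an illustrative assumption.

```python
import hashlib
import hmac
import os
import re

EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")  # loose format check, an assumption
USERNAME_RE = re.compile(r"[A-Za-z0-9]{3,20}")


def validate_registration(data: dict) -> dict[str, str]:
    """Collect every validation error; an empty dict means the payload is valid."""
    errors: dict[str, str] = {}
    email = data.get("email")
    password = data.get("password")
    username = data.get("username")

    if not isinstance(email, str) or not EMAIL_RE.fullmatch(email):
        errors["email"] = "must be a valid email address"
    if (
        not isinstance(password, str)
        or len(password) < 8
        or not any(ch.isdigit() for ch in password)
    ):
        errors["password"] = "must be at least 8 characters and contain a digit"
    if not isinstance(username, str) or not USERNAME_RE.fullmatch(username):
        errors["username"] = "must be 3-20 alphanumeric characters"
    return errors


def hash_password(password: str, iterations: int = 600_000) -> str:
    """Salted PBKDF2-HMAC-SHA256, stored as 'iterations$salt_hex$digest_hex'."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return f"{iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate.hex(), digest_hex)


ok_errors = validate_registration(
    {"email": "alice@example.com", "password": "pass1234", "username": "alice"}
)
bad_errors = validate_registration({"email": "nope", "password": "short", "username": "a!"})
stored_hash = hash_password("pass1234")
```

Because `validate_registration` checks every field before returning, the handler can respond with `400` and the whole error map in one response, which addresses follow-up 2 without stopping at the first failure.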
