
Stripe Software Engineer Phone Screen Questions

29+ questions from real Stripe Software Engineer Phone Screen rounds, reported by candidates who interviewed there.

29 questions · 8 topic areas · 10+ sources

What does the Stripe Phone Screen round test?

The Stripe phone screen typically lasts 45-60 minutes and evaluates core Software Engineer fundamentals. Candidates should expect 1-2 algorithmic problems, basic system design discussion at senior levels, and questions about relevant experience. The goal is to confirm technical competence before bringing candidates onsite.


Stripe Software Engineer Phone Screen Questions

Stripe phone interview: a less common question, similar to one from an earlier interview prep package, so I'm posting it. Question 5: write down all the reasons why the previous entries were not verified.

Stripe FullStack Tech Phone Screen Interview Experience

Arrays, Strings, Simulation, General Experience (2025)

I'd like to see VO (virtual onsite) interview experiences. I just had my interview. The question was the old one, shipping cost. Because I had seen it before, I wrot…

Stripe SDE Intern Phone Screen Coding Interview Experience

Phone Screen, Tiered Pricing, Shipping, Stripe (2025)

**Coding 1: Fixed Unit Shipping Fee**

* **Problem Statement:** The objective is to compute the total shipping cost for a batch of orders. Each order consists of a specific country, product, and quanti…

Stripe - Phone screen

Algorithms (2024)

Part 1: In an HTTP request, the Accept-Language header describes the list of languages that the requester would like content to be returned in. The header takes...

Hey folks, I am writing this to help the community that I benefited from greatly. My company shut down, and I started my job search for SDE 3 roles in India. I was fortunate to...

Part 1: You are given a string representing application IDs in the following format: Each application ID is prefixed by its length (number of characters in the ID). The format is: lengthOfApplicationId +...

Part 1: Customer log: "Y Y N Y" // indicates whether a customer came to the store each hour (Y = came, N = did not). Closing time: the store closes at a given hour. So we need to...

I had a technical phone screen with Stripe yesterday, and quite quickly received a rejection email this morning. I solved 2 questions in my phone screen and used descriptive function & variable names…

## Problem

You receive card data as a raw string in the following format:

```
<name>|<suit>|<value>
```

where `suit` is one of `H, D, C, S` and `value` is an integer 1-13 (1=Ace, 11=Jack, 12=Queen, 13=King). Parse a list of such strings and return a list of card dicts. Reject (skip) any malformed entries and return them separately.

```python
from typing import List, Tuple, Dict

def parse_cards(raw: List[str]) -> Tuple[List[Dict], List[str]]:
    # return (valid_cards, rejected_strings)
    # valid card: {"name": str, "suit": str, "value": int}
    pass
```

**Example:**

```
Input: ["Ace|H|1", "King|S|13", "Joker|X|0", "Queen|D|12", "bad_data"]
Valid: [{"name":"Ace","suit":"H","value":1},
        {"name":"King","suit":"S","value":13},
        {"name":"Queen","suit":"D","value":12}]
Rejected: ["Joker|X|0", "bad_data"]
```

## Follow-ups

1. How would you extend this to support a multi-deck scenario where duplicate cards are valid?
2. What error reporting format would be most useful for a downstream API consumer?
3. How would you make the parser configurable to support different card game variants with different valid suits or value ranges?
4. How would you write property-based tests for this parser?
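One way the stub above might be filled in, assuming the rejection rules implied by the example (exactly three `|`-separated fields, a known suit, and an integer value in 1-13):

```python
from typing import Dict, List, Tuple

VALID_SUITS = {"H", "D", "C", "S"}

def parse_cards(raw: List[str]) -> Tuple[List[Dict], List[str]]:
    """Split each entry on '|' and keep it only if all three fields check out."""
    valid, rejected = [], []
    for entry in raw:
        parts = entry.split("|")
        if len(parts) != 3:
            rejected.append(entry)
            continue
        name, suit, value_str = parts
        # isdigit() is checked before int() so malformed values short-circuit
        if suit not in VALID_SUITS or not value_str.isdigit() \
                or not 1 <= int(value_str) <= 13:
            rejected.append(entry)
            continue
        valid.append({"name": name, "suit": suit, "value": int(value_str)})
    return valid, rejected
```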

## Problem

You have `n` stores, each with a closing time (in minutes from midnight). A region "closes" when all stores are closed. However, stores can be forced to close early by paying a penalty of `P` dollars per minute of early closure. Given a budget `B`, find the earliest possible region closing time. You may distribute early closure across any stores.

```python
from typing import List

def earliest_close(closing_times: List[int], P: int, B: int) -> int:
    # return the earliest achievable region closing time (in minutes)
    pass
```

**Example:**

```
closing_times = [300, 360, 420]
P = 10, B = 1000
Target T=300: reduce 360->300 (60 min * $10 = $600) and 420->300 (120 min * $10 = $1200). Total = $1800 > $1000.
Target T=330: reduce 360->330 ($300) and 420->330 ($900). Total = $1200 > $1000.
Target T=340: reduce 360->340 ($200) and 420->340 ($800). Total = $1000 <= $1000.
Output: 340
```

## Approach

Binary search on the target closing time T. For a given T, the total cost is the sum of `max(0, t - T) * P` over all stores; this cost only shrinks as T grows, so feasibility is monotone and binary search finds the smallest feasible T. Time: O(n log(max_time)).

## Follow-ups

1. What if different stores have different per-minute penalty rates?
2. How would you find the minimum budget required to close by a given target time?
3. What if some stores cannot be forced to close early at all?
4. How does the problem change if stores also have an earliest possible open time and must remain open for a minimum duration?
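The binary search can be sketched as follows (a minimal version, assuming a store may be pushed all the way down to minute 0 if the budget allows):

```python
from typing import List

def earliest_close(closing_times: List[int], P: int, B: int) -> int:
    """Binary search the smallest target time T whose total penalty fits in B."""
    def cost(T: int) -> int:
        # pay P per minute for every store that must close earlier than planned
        return sum(max(0, t - T) * P for t in closing_times)

    lo, hi = 0, max(closing_times)
    while lo < hi:
        mid = (lo + hi) // 2
        if cost(mid) <= B:
            hi = mid       # feasible: try an even earlier close
        else:
            lo = mid + 1   # too expensive: must close later
    return lo
```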

## Problem

Implement two functions:

1. `is_valid(card_number: str) -> bool` — validate using the Luhn algorithm.
2. `mask(card_number: str) -> str` — replace all but the last 4 digits with `*`.

The Luhn check: from the rightmost digit, double every second digit; if doubling exceeds 9, subtract 9. Sum all digits. Valid if sum % 10 == 0.

```python
def is_valid(card_number: str) -> bool:
    pass

def mask(card_number: str) -> str:
    pass
```

**Example:**

```
card = "4532015112830366"
is_valid(card) -> True
mask(card) -> "************0366"

card = "1234567890123456"
is_valid(card) -> False
mask(card) -> "************3456"
```

**Luhn trace for "4532015112830366":**

```
Digits (R->L, every second doubled, minus 9 if > 9):
6, 3, 3, 0, 3, 7, 2, 2, 1, 1, 1, 0, 2, 6, 5, 8
Sum = 50 -> 50 % 10 == 0 -> valid
```

## Follow-ups

1. How do you handle input with spaces or dashes (e.g. `"4532 0151 1283 0366"`)?
2. What are the valid length ranges for Visa, Mastercard, and Amex cards?
3. Beyond Luhn, what additional checks would you add in a production payment system?
4. How would you write a Luhn-valid card number generator for testing purposes?
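A straightforward implementation of the two functions (a sketch; it assumes the input string contains digits only):

```python
def is_valid(card_number: str) -> bool:
    """Luhn: double every second digit from the right, subtract 9 if > 9, sum."""
    total = 0
    for i, ch in enumerate(reversed(card_number)):
        d = int(ch)
        if i % 2 == 1:      # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def mask(card_number: str) -> str:
    """Replace all but the last four characters with '*'."""
    return "*" * max(0, len(card_number) - 4) + card_number[-4:]
```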

## Problem

You are given a list of direct exchange rates between currency pairs. Given a source currency and a target currency, find the effective exchange rate through any chain of conversions, or return -1.0 if no path exists.

```python
from typing import List

def find_exchange_rate(rates: List[List], queries: List[List[str]]) -> List[float]:
    # rates: [["USD","EUR", 0.92], ["EUR","GBP", 0.86], ...]
    # queries: [["USD","GBP"], ...]
    # return: effective rate for each query, or -1.0
    pass
```

**Example:**

```
rates = [["USD","EUR",0.92],["EUR","GBP",0.86]]
queries = [["USD","GBP"],["GBP","USD"],["USD","JPY"]]

USD->GBP: 0.92 * 0.86 = 0.7912
GBP->USD: 1/(0.92*0.86) = 1.264
USD->JPY: no path -> -1.0
Output: [0.7912, 1.264, -1.0]
```

## Approach

Model as a weighted directed graph (add both directions: A->B with rate r, B->A with rate 1/r). BFS/DFS with multiplicative path cost, or Floyd-Warshall for all pairs.

## Follow-ups

1. How do you detect an arbitrage opportunity (a cycle whose product > 1.0)?
2. How would you handle rate updates in real time without recomputing the full graph?
3. What numerical precision issues arise when chaining many floating-point multiplications?
4. How would you find the path with maximum rate (best exchange route) vs. any valid path?
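The BFS variant of the approach might look like this (a sketch; it returns the rate along the first path found, which is fine when quoted rates are mutually consistent):

```python
from collections import defaultdict, deque
from typing import List

def find_exchange_rate(rates: List[list], queries: List[List[str]]) -> List[float]:
    # Build a directed graph; the reverse edge carries the reciprocal rate.
    graph = defaultdict(list)
    for src, dst, r in rates:
        graph[src].append((dst, r))
        graph[dst].append((src, 1.0 / r))

    def bfs(src: str, dst: str) -> float:
        if src not in graph or dst not in graph:
            return -1.0
        if src == dst:
            return 1.0
        seen = {src}
        queue = deque([(src, 1.0)])
        while queue:
            node, acc = queue.popleft()
            for nxt, r in graph[node]:
                if nxt in seen:
                    continue
                if nxt == dst:
                    return acc * r    # multiplicative path cost
                seen.add(nxt)
                queue.append((nxt, acc * r))
        return -1.0

    return [bfs(a, b) for a, b in queries]
```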

## Problem

Implement a schema validator. A schema is a Python dict specifying expected types and constraints. Validate a given payload against the schema and return a list of validation errors (empty list if valid).

Supported schema keys: `"type"` ("str", "int", "float", "list", "dict"), `"required"` (bool), `"min"` / `"max"` (for int/float), `"fields"` (nested schema for dict type).

```python
from typing import Any, List

def validate(payload: Any, schema: dict) -> List[str]:
    # return list of error messages, empty if valid
    pass
```

**Example:**

```
schema = {
    "type": "dict",
    "fields": {
        "age": {"type": "int", "required": True, "min": 0, "max": 150},
        "name": {"type": "str", "required": True}
    }
}

validate({"age": 200, "name": "Ada"}, schema) -> ["age: value 200 exceeds max 150"]
validate({"age": 25}, schema) -> ["name: required field missing"]
```

## Follow-ups

1. How would you support list-element schemas (validate every item in a list against a sub-schema)?
2. How would you support `"one_of"` / union types?
3. How would you generate a human-readable diff showing exactly which fields failed and why?
4. How does this compare to JSON Schema (draft-07) — what features would you need to add to reach parity?
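A recursive sketch covering the listed schema keys (the error-message wording follows the example; the type-mismatch message and the `path` parameter are assumptions):

```python
from typing import Any, List

TYPES = {"str": str, "int": int, "float": float, "list": list, "dict": dict}

def validate(payload: Any, schema: dict, path: str = "") -> List[str]:
    """Check type, then min/max bounds, then recurse into nested fields."""
    label = path or "payload"
    if not isinstance(payload, TYPES[schema["type"]]):
        return [f"{label}: expected {schema['type']}, got {type(payload).__name__}"]
    errors: List[str] = []
    if "min" in schema and payload < schema["min"]:
        errors.append(f"{label}: value {payload} below min {schema['min']}")
    if "max" in schema and payload > schema["max"]:
        errors.append(f"{label}: value {payload} exceeds max {schema['max']}")
    for name, sub in schema.get("fields", {}).items():
        child = f"{path}.{name}" if path else name
        if name not in payload:
            if sub.get("required"):
                errors.append(f"{child}: required field missing")
        else:
            errors.extend(validate(payload[name], sub, child))
    return errors
```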

## Problem

You are given a run-length encoded string in the format `"<count><char>"` repeating (e.g. `"3a2b1c"` -> `"aaabbc"`). Implement a decoder. Additionally implement the encoder: given a plain string, produce its run-length encoding.

```python
def decode(s: str) -> str:
    # "3a2b4c" -> "aaabbcccc"
    pass

def encode(s: str) -> str:
    # "aaabbc" -> "3a2b1c"
    pass
```

**Example:**

```
decode("3a2b1c") -> "aaabbc"
decode("1a1b1c") -> "abc"
encode("aaabbc") -> "3a2b1c"
encode("abcd") -> "1a1b1c1d"
```

**Edge cases:**

- Count can be multi-digit: `"12a"` -> twelve `a`'s.
- Empty string input returns empty string.

## Follow-ups

1. How do you handle malformed input (non-digit before char, count of 0)?
2. When does run-length encoding actually make a string larger? Give a precise condition.
3. How would you extend this to encode/decode binary data (not just ASCII)?
4. How would you implement an in-place encoder without allocating a new string?
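Both directions in a few lines (a sketch; `decode` assumes well-formed input where each count is followed by a single non-digit character):

```python
import re

def decode(s: str) -> str:
    """Each token is <digits><one non-digit char>; expand char count times."""
    return "".join(ch * int(n) for n, ch in re.findall(r"(\d+)(\D)", s))

def encode(s: str) -> str:
    """Walk the string and emit <run length><char> for each maximal run."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        out.append(f"{j - i}{s[i]}")
        i = j
    return "".join(out)
```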

## Problem

Design an OOP model for a manufacturing factory. A `Factory` has multiple `ProductionLine`s. Each `ProductionLine` has a list of `Machine`s. Each `Machine` has an hourly operating cost and an output rate (units/hour). Implement:

- `Factory.total_cost(hours)` — total operating cost across all machines for `hours` hours.
- `Factory.total_output(hours)` — total units produced.
- `Factory.cost_per_unit(hours)` — total cost / total output (raise `ValueError` if output is 0).

```python
class Machine:
    def __init__(self, hourly_cost: float, output_rate: float): ...

class ProductionLine:
    def __init__(self, machines: list): ...

class Factory:
    def __init__(self, lines: list): ...
    def total_cost(self, hours: float) -> float: ...
    def total_output(self, hours: float) -> float: ...
    def cost_per_unit(self, hours: float) -> float: ...
```

**Example:**

```
m1 = Machine(hourly_cost=50, output_rate=100)
m2 = Machine(hourly_cost=30, output_rate=60)
line = ProductionLine([m1, m2])
factory = Factory([line])

factory.total_cost(8) -> 640.0
factory.total_output(8) -> 1280.0
factory.cost_per_unit(8) -> 0.5
```

## Follow-ups

1. How would you add machine downtime (a machine is idle for some fraction of hours)?
2. How would you model a machine that has a fixed startup cost in addition to hourly cost?
3. How would you make this serializable to JSON for persistence?
4. How would you design a `simulate` method that runs production hour by hour and logs output?
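Filling in the three classes is mostly bookkeeping; one possible sketch (the private `_machines` helper is my own addition):

```python
from typing import List

class Machine:
    def __init__(self, hourly_cost: float, output_rate: float):
        self.hourly_cost = hourly_cost
        self.output_rate = output_rate

class ProductionLine:
    def __init__(self, machines: List[Machine]):
        self.machines = machines

class Factory:
    def __init__(self, lines: List[ProductionLine]):
        self.lines = lines

    def _machines(self) -> List[Machine]:
        # flatten every machine across all production lines
        return [m for line in self.lines for m in line.machines]

    def total_cost(self, hours: float) -> float:
        return sum(m.hourly_cost * hours for m in self._machines())

    def total_output(self, hours: float) -> float:
        return sum(m.output_rate * hours for m in self._machines())

    def cost_per_unit(self, hours: float) -> float:
        output = self.total_output(hours)
        if output == 0:
            raise ValueError("cost per unit is undefined when output is zero")
        return self.total_cost(hours) / output
```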

## Problem

You are given a list of fraud reports, each with a `reporter_id`, `reported_user_id`, and `timestamp`. Find all users with more than `threshold` unique reporters within any rolling 24-hour window. Return a list of `(reported_user_id, max_unique_reporters_in_any_window)` sorted by count descending, then by user id ascending.

```python
from typing import List, Tuple

def flagged_users(reports: List[Tuple[str, str, int]], threshold: int) -> List[Tuple[str, int]]:
    # reports: [(reporter_id, reported_user_id, timestamp_sec), ...]
    # return: [(reported_user_id, max_unique_reporters), ...]
    pass
```

**Example:**

```
reports = [
    ("u1","bob",0),
    ("u2","bob",3600),
    ("u3","bob",7200),
    ("u1","bob",90000),  # outside 24h window of t=0
    ("u1","alice",0)
]
threshold = 2

bob: window [0,86400) has 3 unique reporters -> flagged (count=3)
alice: window [0,86400) has 1 -> not flagged
Output: [("bob", 3)]
```

## Follow-ups

1. What if a single reporter can submit multiple reports and they should only count once per window?
2. How do you scale this to process 10M reports/day in near real-time?
3. How would you prevent false positives from coordinated mass-reporting attacks by a single group?
4. What additional signals would you combine with report count to improve fraud detection accuracy?
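A sliding-window sketch. It considers the 24-hour window ending at each report time, which is equivalent to scanning the half-open `[t, t+24h)` windows used in the example:

```python
from collections import Counter, defaultdict
from typing import List, Tuple

DAY = 86400

def flagged_users(reports: List[Tuple[str, str, int]],
                  threshold: int) -> List[Tuple[str, int]]:
    """Per reported user: sort events, slide a 24h window with two pointers,
    and track unique reporters inside the window with a Counter."""
    by_user = defaultdict(list)
    for reporter, reported, ts in reports:
        by_user[reported].append((ts, reporter))

    results = []
    for user, events in by_user.items():
        events.sort()
        window = Counter()
        best = left = 0
        for ts, reporter in events:
            window[reporter] += 1
            # evict events that fell out of the 24h window ending at ts
            while events[left][0] <= ts - DAY:
                old = events[left][1]
                window[old] -= 1
                if window[old] == 0:
                    del window[old]
                left += 1
            best = max(best, len(window))
        if best > threshold:
            results.append((user, best))

    return sorted(results, key=lambda x: (-x[1], x[0]))
```

Using a `Counter` (rather than a set) means a reporter who files several reports inside one window still counts only once, which also answers follow-up 1.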

## Problem

Given a raw HTTP request string, parse it into: `method`, `path`, `http_version`, and a `headers` dict. Header names are case-insensitive; normalize them to lowercase. Stop at the blank line separating headers from body.

```python
def parse_request(raw: str) -> dict:
    # return {"method": str, "path": str, "version": str, "headers": dict}
    pass
```

**Example:**

```
raw = (
    "GET /api/users HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/json\r\n"
    "Authorization: Bearer token123\r\n"
    "\r\n"
    "{\"key\": \"value\"}"
)

Output: {
    "method": "GET",
    "path": "/api/users",
    "version": "HTTP/1.1",
    "headers": {
        "host": "example.com",
        "content-type": "application/json",
        "authorization": "Bearer token123"
    }
}
```

## Follow-ups

1. How do you handle multi-value headers (e.g. multiple `Set-Cookie` lines) — list or last-wins?
2. What happens when a header line has no `:` separator — should you skip it or raise an error?
3. How would you extend this to also parse and return the request body?
4. How do you guard against maliciously large headers (header injection / DoS)?
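A minimal parser sketch (it assumes `\r\n` line endings as in the example, and last-wins when a header repeats):

```python
def parse_request(raw: str) -> dict:
    """Split off the request line, then read headers until the blank line."""
    lines = raw.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if line == "":                      # blank line: body starts next
            break
        name, _, value = line.partition(":")
        headers[name.lower()] = value.strip()
    return {"method": method, "path": path, "version": version,
            "headers": headers}
```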

## Problem

You are given a list of user accounts. Each account has a user ID and a list of contact values (emails or phone numbers). Two accounts are "linked" if they share at least one contact value. Find all groups of linked accounts (connected components). Return groups as lists of sorted user IDs, sorted by the smallest ID in each group.

```python
from typing import List

def linked_groups(accounts: List[List]) -> List[List[int]]:
    # accounts: [[user_id, contact1, contact2, ...], ...]
    # return: [[user_ids in group], ...] sorted
    pass
```

**Example:**

```
accounts = [
    [1, "[email protected]", "555-1234"],
    [2, "[email protected]"],
    [3, "[email protected]"],
    [4, "[email protected]", "555-9999"]
]

User 1 and 3 share "[email protected]" -> linked
User 2 and 4 share "[email protected]" -> linked
Output: [[1, 3], [2, 4]]
```

## Approach

Union-Find: map each contact value to the first user ID that owns it; union subsequent users with the same contact to that user.

## Follow-ups

1. What if a single user can be in multiple link chains — does Union-Find still work?
2. How would you update the grouping incrementally as new accounts are registered?
3. What privacy implications should you consider before building this feature into a production system?
4. How would you handle case-insensitive matching for email addresses?
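The Union-Find approach described above, as a sketch:

```python
from collections import defaultdict
from typing import List

def linked_groups(accounts: List[list]) -> List[List[int]]:
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    owner = {}   # contact value -> first user id that owned it
    for uid, *contacts in accounts:
        find(uid)                           # register user even if isolated
        for contact in contacts:
            if contact in owner:
                union(uid, owner[contact])
            else:
                owner[contact] = uid

    groups = defaultdict(list)
    for uid, *_ in accounts:
        groups[find(uid)].append(uid)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```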

## Problem

You are given two address books, each as a list of contact records. Each contact has a `name` and a list of `emails`. Two contacts match if they share at least one email address. Return all matching pairs as `(name_from_book_a, name_from_book_b)`, sorted alphabetically by the first name, then second.

```python
from typing import List, Tuple

def find_matches(book_a: List[dict], book_b: List[dict]) -> List[Tuple[str, str]]:
    # contact: {"name": str, "emails": [str, ...]}
    pass
```

**Example:**

```
book_a = [{"name": "Alice", "emails": ["[email protected]", "[email protected]"]},
          {"name": "Bob", "emails": ["[email protected]"]}]
book_b = [{"name": "Alicia", "emails": ["[email protected]"]},
          {"name": "Robert", "emails": ["[email protected]"]}]

Alice and Alicia share "[email protected]" -> match
Bob and Robert share nothing -> no match
Output: [("Alice", "Alicia")]
```

## Follow-ups

1. How would you handle fuzzy name matching (e.g., "Bob" vs. "Bobby") as an additional signal?
2. What if email addresses have different capitalizations or aliasing (e.g., gmail ignores dots)?
3. How would you scale this to match two books of 10 million contacts each?
4. How would you rank matches by confidence when multiple signals (email, phone, name similarity) are available?
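An index-and-probe sketch of the matcher:

```python
from collections import defaultdict
from typing import List, Tuple

def find_matches(book_a: List[dict], book_b: List[dict]) -> List[Tuple[str, str]]:
    """Index book_a's names by email, probe with book_b's emails,
    and dedupe pairs that share more than one address via a set."""
    by_email = defaultdict(set)
    for contact in book_a:
        for email in contact["emails"]:
            by_email[email].add(contact["name"])

    pairs = set()
    for contact in book_b:
        for email in contact["emails"]:
            for name_a in by_email.get(email, ()):
                pairs.add((name_a, contact["name"]))
    return sorted(pairs)
```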

## Problem

Given a list of financial transactions, merge overlapping or duplicate entries into a consolidated result.

## Tags

arrays, hash_table, sorting
