# MongoDB Software Engineer Phone Screen Questions
19+ questions from real MongoDB Software Engineer Phone Screen rounds, reported by candidates who interviewed there.
## What does the MongoDB Phone Screen round test?
The MongoDB phone screen typically lasts 45-60 minutes and evaluates core Software Engineer fundamentals. Candidates should expect 1-2 algorithmic problems, basic system design discussion at senior levels, and questions about relevant experience. The goal is to confirm technical competence before bringing candidates onsite.
## Round 1 - Coding

## Problem

Design a billing system that creates invoices, adds line items, applies discounts, and computes totals with tax.

```python
class LineItem:
    def __init__(self, description: str, quantity: int, unit_price: float):
        ...

class Invoice:
    def __init__(self, invoice_id: str, customer: str, tax_rate: float):
        ...

    def add_item(self, item: LineItem) -> None:
        ...

    def apply_discount(self, percent: float) -> None:  # on subtotal
        ...

    def subtotal(self) -> float:
        ...

    def total(self) -> float:  # after discount and tax
        ...

    def summary(self) -> str:
        ...
```

## Example

```
inv = Invoice("INV-001", "Acme Corp", tax_rate=0.08)
inv.add_item(LineItem("Widget", 5, 10.00))
inv.add_item(LineItem("Gadget", 2, 25.00))
inv.subtotal()  -> 100.0   # 5*10 + 2*25
inv.apply_discount(10)     # 10% off
inv.total()     -> 97.20   # 90 * 1.08
```

## Follow-ups

1. How do you handle discounts that apply only to specific line items versus the entire invoice?
2. How would you represent recurring invoices that auto-generate monthly?
3. What happens if `apply_discount` is called multiple times — do they stack or replace?
4. How do you serialize an invoice to JSON for API responses?
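A minimal sketch of one way to answer this. The rounding to two decimals and the replace-on-reapply discount semantics are assumptions, not part of the prompt (follow-up 3 exists precisely because the prompt leaves this open):

```python
class LineItem:
    def __init__(self, description: str, quantity: int, unit_price: float):
        self.description = description
        self.quantity = quantity
        self.unit_price = unit_price

    def amount(self) -> float:
        return self.quantity * self.unit_price


class Invoice:
    def __init__(self, invoice_id: str, customer: str, tax_rate: float):
        self.invoice_id = invoice_id
        self.customer = customer
        self.tax_rate = tax_rate
        self.items: list[LineItem] = []
        self.discount_percent = 0.0  # assumed: one invoice-wide discount, replaced on re-apply

    def add_item(self, item: LineItem) -> None:
        self.items.append(item)

    def apply_discount(self, percent: float) -> None:
        self.discount_percent = percent

    def subtotal(self) -> float:
        return sum(item.amount() for item in self.items)

    def total(self) -> float:
        # discount on subtotal, then tax on the discounted amount
        discounted = self.subtotal() * (1 - self.discount_percent / 100)
        return round(discounted * (1 + self.tax_rate), 2)

    def summary(self) -> str:
        lines = [f"Invoice {self.invoice_id} for {self.customer}"]
        lines += [f"  {i.description} x{i.quantity} @ {i.unit_price:.2f}" for i in self.items]
        lines.append(f"  Total: {self.total():.2f}")
        return "\n".join(lines)
```

In a real billing system you would use `decimal.Decimal` rather than floats to avoid accumulating rounding error; floats keep the sketch short and match the prompt's signatures.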
## Problem

Decode a binary-encoded string or number back to its original form, likely involving bit manipulation or base conversion.

## Likely LeetCode equivalent

No unambiguous single LC equivalent.

## Tags

binary, math, strings, swe
## Round 1 - Coding

## Problem

Given a social network as a list of friendships, recommend potential friends for a user. Recommendations are users who are not already friends with the target user but share the most mutual friends. Exclude the user themselves.

```python
def recommend_friends(friendships: list[tuple[str, str]], user: str, top_k: int) -> list[str]:
    # returns top_k recommended users sorted by mutual friend count desc,
    # ties broken alphabetically
    ...
```

## Example

```
friendships = [
    ("Alice","Bob"), ("Alice","Carol"), ("Bob","David"),
    ("Carol","David"), ("David","Eve"), ("Alice","Eve")
]
recommend_friends(friendships, "Alice", top_k=2)
# Alice's friends: {Bob, Carol, Eve}
# David is the only non-friend: he shares Bob, Carol, and Eve with Alice -> 3 mutuals
# -> ["David"]
```

## Follow-ups

1. What is the time complexity of your approach? Can you reduce it using adjacency sets?
2. How would you handle a directed graph (follow vs. friend) instead of an undirected one?
3. How would you extend recommendations to include second-degree connections with lower weight?
4. At scale (100M users), how do you compute mutual friend counts efficiently?
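One straightforward solution sketch using adjacency sets, which is also the answer to follow-up 1 (set intersection makes each candidate check O(min degree)):

```python
from collections import defaultdict

def recommend_friends(friendships: list[tuple[str, str]], user: str, top_k: int) -> list[str]:
    # build undirected adjacency sets
    adj: dict[str, set[str]] = defaultdict(set)
    for a, b in friendships:
        adj[a].add(b)
        adj[b].add(a)

    friends = adj[user]
    counts = {}
    for cand in adj:
        if cand == user or cand in friends:
            continue  # exclude self and existing friends
        mutual = len(adj[cand] & friends)  # set intersection
        if mutual > 0:
            counts[cand] = mutual

    # sort by mutual count descending, then alphabetically
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    return ranked[:top_k]
```

A tighter variant only visits friends-of-friends (two hops from `user`) instead of every node, which matters once the graph is large.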
## MongoDB SWE Phone - Hash Map

## Problem

Implement a hash map from scratch, covering put, get, and remove operations with collision handling.

## Likely LeetCode equivalent

LeetCode 706 - Design HashMap.

## Tags

hash_table, design, swe
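A sketch using separate chaining, the collision strategy most interviewers expect first (open addressing is the usual follow-up). The fixed capacity and the `-1` miss sentinel follow the LeetCode 706 convention:

```python
class MyHashMap:
    # separate chaining: each bucket is a list of [key, value] pairs
    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]

    def _bucket(self, key: int) -> list:
        return self.buckets[hash(key) % self.capacity]

    def put(self, key: int, value: int) -> None:
        bucket = self._bucket(key)
        for pair in bucket:
            if pair[0] == key:
                pair[1] = value  # overwrite existing key
                return
        bucket.append([key, value])

    def get(self, key: int) -> int:
        for k, v in self._bucket(key):
            if k == key:
                return v
        return -1  # miss sentinel, as in LeetCode 706

    def remove(self, key: int) -> None:
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket.pop(i)
                return
```

A production version would also resize (rehash) when the load factor exceeds a threshold, which is worth mentioning even if you do not code it.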
## MongoDB SWE Phone - Integers Window

## Problem

Process a sliding window of integers, likely computing a running statistic (max, sum, or average) over a fixed-size window.

## Likely LeetCode equivalent

LeetCode 239 - Sliding Window Maximum.

## Tags

sliding_window, heap, arrays, swe
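Assuming the max variant (LeetCode 239), the standard O(n) approach keeps a monotonically decreasing deque of indices so the window maximum is always at the front:

```python
from collections import deque

def max_sliding_window(nums: list[int], k: int) -> list[int]:
    dq: deque[int] = deque()  # indices; values are decreasing front to back
    out = []
    for i, x in enumerate(nums):
        # drop smaller values from the back: they can never be a future max
        while dq and nums[dq[-1]] <= x:
            dq.pop()
        dq.append(i)
        # drop the front index once it falls out of the window
        if dq[0] <= i - k:
            dq.popleft()
        if i >= k - 1:
            out.append(nums[dq[0]])
    return out
```

Running sum or average over the window is simpler still: add the entering element, subtract the leaving one.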
## Problem

Find the intersection of two arrays or sets, returning elements that appear in both.

## Likely LeetCode equivalent

LeetCode 349 - Intersection of Two Arrays.

## Tags

arrays, hash_table, two_pointers, swe
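The hash-set solution is one line and O(n + m); sorting the result (an assumption, since the problem does not specify order) makes the output deterministic:

```python
def intersection(nums1: list[int], nums2: list[int]) -> list[int]:
    # set intersection keeps each common element once
    return sorted(set(nums1) & set(nums2))
```

If both inputs are already sorted, the two-pointer approach achieves the same in O(1) extra space, which is the usual follow-up.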
## Round 1 - Coding

## Problem

Build an inverted index over a collection of documents. Support adding documents, querying for documents containing a single word, and AND queries returning documents that contain all given words.

```python
class InvertedIndex:
    def add_document(self, doc_id: int, text: str) -> None:
        ...

    def search(self, word: str) -> set[int]:
        # returns set of doc_ids containing word (case-insensitive)
        ...

    def search_all(self, words: list[str]) -> set[int]:
        # AND query: doc must contain every word
        ...

    def search_any(self, words: list[str]) -> set[int]:
        # OR query: doc must contain at least one word
        ...
```

## Example

```
idx = InvertedIndex()
idx.add_document(1, "the quick brown fox")
idx.add_document(2, "the fox jumped over")
idx.add_document(3, "a quick brown dog")
idx.search("fox")                  -> {1, 2}
idx.search_all(["quick","brown"])  -> {1, 3}
idx.search_any(["fox","dog"])      -> {1, 2, 3}
```

## Follow-ups

1. How do you handle common stop words like "the" or "a"? Should they be indexed?
2. How would you extend the index to store word positions so you can support phrase queries like `"quick brown"`?
3. How do you rank results by relevance (e.g., TF-IDF) instead of returning an unordered set?
4. How would you handle document updates — what must be re-indexed when a document's text changes?
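A compact sketch. The alphanumeric tokenizer is an assumption (the prompt only requires case-insensitivity); AND is set intersection across per-word postings, OR is set union:

```python
import re
from collections import defaultdict

class InvertedIndex:
    def __init__(self):
        self.index: dict[str, set[int]] = defaultdict(set)  # word -> doc_ids

    def add_document(self, doc_id: int, text: str) -> None:
        # assumed tokenizer: lowercase alphanumeric runs
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.index[word].add(doc_id)

    def search(self, word: str) -> set[int]:
        return set(self.index.get(word.lower(), set()))

    def search_all(self, words: list[str]) -> set[int]:
        sets = [self.search(w) for w in words]
        return set.intersection(*sets) if sets else set()

    def search_any(self, words: list[str]) -> set[int]:
        return set().union(*(self.search(w) for w in words))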
## Problem

Merge K sorted lists or arrays into a single sorted output using a min-heap for efficiency.

## Likely LeetCode equivalent

LeetCode 23 - Merge k Sorted Lists.

## Tags

heap, linked_list, sorting, swe
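The min-heap version for sorted arrays, O(N log k) where N is the total element count; for linked lists the same idea applies with node pointers instead of `(array, index)` pairs:

```python
import heapq

def merge_k_sorted(arrays: list[list[int]]) -> list[int]:
    # heap entries: (value, source array index, position in that array);
    # the array index breaks ties so tuples always compare cleanly
    heap = [(arr[0], i, 0) for i, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)
    out = []
    while heap:
        val, i, j = heapq.heappop(heap)
        out.append(val)
        if j + 1 < len(arrays[i]):
            heapq.heappush(heap, (arrays[i][j + 1], i, j + 1))
    return out
```

`heapq.merge(*arrays)` does the same lazily in the standard library, which is worth mentioning after coding it by hand.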
## MongoDB SWE Phone - Linked Hash Map

## Problem

Implement a linked hash map that maintains insertion order while supporting O(1) get/put operations, similar to Java's LinkedHashMap.

## Likely LeetCode equivalent

LeetCode 146 - LRU Cache (uses same doubly-linked-list + hash map structure).

## Tags

hash_table, linked_list, design, swe
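A sketch with an explicit doubly linked list plus a dict, the structure the interviewer likely wants to see (Python dicts already preserve insertion order, so saying so and then building it by hand covers both bases). Method names are assumptions:

```python
class _Node:
    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key, self.value = key, value
        self.prev = self.next = None


class LinkedHashMap:
    # dict gives O(1) lookup; the linked list records insertion order
    def __init__(self):
        self.map: dict = {}
        self.head, self.tail = _Node(), _Node()  # sentinels simplify edge cases
        self.head.next, self.tail.prev = self.tail, self.head

    def put(self, key, value) -> None:
        if key in self.map:
            self.map[key].value = value  # update in place, order unchanged
            return
        node = _Node(key, value)
        self.map[key] = node
        last = self.tail.prev             # splice in before the tail sentinel
        last.next = node
        node.prev, node.next = last, self.tail
        self.tail.prev = node

    def get(self, key):
        node = self.map.get(key)
        return node.value if node else None

    def remove(self, key) -> None:
        node = self.map.pop(key, None)
        if node:
            node.prev.next, node.next.prev = node.next, node.prev

    def keys(self) -> list:
        node, out = self.head.next, []
        while node is not self.tail:
            out.append(node.key)
            node = node.next
        return out
```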
## MongoDB SWE Phone - Lowest Common Ancestor

## Problem

Find the lowest common ancestor of two nodes in a binary tree or binary search tree.

## Likely LeetCode equivalent

LeetCode 236 - Lowest Common Ancestor of a Binary Tree.

## Tags

binary_tree, recursion, dfs, swe
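The classic post-order recursion for the general binary-tree case (for a BST you can instead walk down comparing values):

```python
class TreeNode:
    def __init__(self, val: int):
        self.val = val
        self.left = None
        self.right = None

def lowest_common_ancestor(root, p, q):
    # a node is the LCA if p and q turn up in different subtrees,
    # or if the node itself is p or q and the other is below it
    if root is None or root is p or root is q:
        return root
    left = lowest_common_ancestor(root.left, p, q)
    right = lowest_common_ancestor(root.right, p, q)
    if left and right:
        return root       # p and q split across the two subtrees
    return left or right  # both on one side (or neither found)
```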
## MongoDB SWE Phone - LRU Cache

## Problem

Design and implement a Least Recently Used (LRU) cache supporting O(1) get and put operations using a hash map and doubly linked list.

## Likely LeetCode equivalent

LeetCode 146 - LRU Cache.

## Tags

hash_table, linked_list, design, swe
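A concise sketch using `collections.OrderedDict`, which internally is exactly the hash map + doubly linked list combination the problem names; an interviewer may still ask you to build the linked list by hand afterward:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: OrderedDict[int, int] = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```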
## Problem

Resolve a filesystem path by simplifying '..' and '.' components, equivalent to the Unix path canonicalization problem.

## Likely LeetCode equivalent

LeetCode 71 - Simplify Path.

## Tags

strings, stack, swe
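The stack solution: split on `/`, skip empty segments and `.`, pop on `..`:

```python
def simplify_path(path: str) -> str:
    stack: list[str] = []
    for part in path.split("/"):
        if part in ("", "."):
            continue              # empty segment (from //) or current dir
        if part == "..":
            if stack:
                stack.pop()       # go up one level; ".." at root is a no-op
        else:
            stack.append(part)
    return "/" + "/".join(stack)
```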
## Round 1 - Coding / Concurrency

## Problem

Implement a read-write lock that allows multiple concurrent readers but ensures exclusive access for writers. Use Python's threading primitives.

```python
import threading

class ReadWriteLock:
    def __init__(self):
        ...

    def acquire_read(self) -> None:
        ...

    def release_read(self) -> None:
        ...

    def acquire_write(self) -> None:
        ...

    def release_write(self) -> None:
        ...
```

## Example

```
lock = ReadWriteLock()
# Thread A and B both call acquire_read() -> both proceed concurrently
# Thread C calls acquire_write() -> blocks until A and B release
# Thread D calls acquire_read() while C is waiting -> should block (writer preference)

# Correct behavior:
#   Multiple readers can hold lock simultaneously
#   Only one writer at a time
#   No reader+writer simultaneously
```

## Follow-ups

1. What is writer starvation and how do you prevent it in your implementation?
2. How would you implement this as a context manager so it works with Python's `with` statement?
3. What happens if a thread that holds a read lock calls `acquire_read` again (reentrant reads)?
4. How does your implementation differ from Python's built-in `threading.Lock`? When would you prefer a `ReadWriteLock`?
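A minimal sketch built on a single `threading.Condition`. Note this version is reader-preference: to get the writer preference the example's thread-D behavior asks for, you would additionally track a waiting-writers count and make `acquire_read` wait while it is nonzero:

```python
import threading

class ReadWriteLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of threads currently holding a read lock
        self._writer = False   # True while a writer holds the lock

    def acquire_read(self) -> None:
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self) -> None:
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake any waiting writer

    def acquire_write(self) -> None:
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self) -> None:
        with self._cond:
            self._writer = False
            self._cond.notify_all()  # wake waiting readers and writers
```

For follow-up 2, a pair of small context-manager wrappers (`@contextmanager` yielding between acquire and release) makes this usable with `with`.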
## Round 1 - Coding

## Problem

Implement a retry decorator/wrapper that retries a failing function call up to `max_retries` times with configurable backoff strategies. Support fixed delay, linear backoff, and exponential backoff with optional jitter.

```python
import time
from typing import Callable, Type

class RetryStrategy:
    def __init__(self, max_retries: int, backoff: str = "exponential",
                 base_delay: float = 1.0, jitter: bool = False,
                 retryable_exceptions: tuple = (Exception,)):
        ...

    def execute(self, func: Callable, *args, **kwargs):
        ...
```

## Example

```
def flaky_api_call():
    import random
    if random.random() < 0.7:
        raise ConnectionError("timeout")
    return "success"

strategy = RetryStrategy(max_retries=3, backoff="exponential", base_delay=1.0)
result = strategy.execute(flaky_api_call)
# Attempt 1: fails -> wait 1s
# Attempt 2: fails -> wait 2s
# Attempt 3: fails -> wait 4s
# Attempt 4: succeeds -> returns "success"
# If all 4 fail -> raises the last exception
```

## Follow-ups

1. Why add jitter to exponential backoff? What problem does it solve in distributed systems?
2. How do you distinguish retryable errors (network timeout) from non-retryable ones (HTTP 400 bad request)?
3. How would you implement a circuit breaker on top of this retry strategy?
4. How would you make this work with `async`/`await` for async functions?
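A sketch matching the prompt's signature. The jitter range (multiplying the delay by a random factor in [0.5, 1.5]) is one common choice among several, not mandated by the prompt:

```python
import random
import time
from typing import Callable

class RetryStrategy:
    def __init__(self, max_retries: int, backoff: str = "exponential",
                 base_delay: float = 1.0, jitter: bool = False,
                 retryable_exceptions: tuple = (Exception,)):
        self.max_retries = max_retries
        self.backoff = backoff
        self.base_delay = base_delay
        self.jitter = jitter
        self.retryable_exceptions = retryable_exceptions

    def _delay(self, attempt: int) -> float:
        if self.backoff == "fixed":
            d = self.base_delay
        elif self.backoff == "linear":
            d = self.base_delay * (attempt + 1)          # 1x, 2x, 3x, ...
        else:
            d = self.base_delay * (2 ** attempt)         # 1x, 2x, 4x, ...
        if self.jitter:
            d *= random.uniform(0.5, 1.5)                # assumed jitter scheme
        return d

    def execute(self, func: Callable, *args, **kwargs):
        for attempt in range(self.max_retries + 1):       # 1 try + max_retries retries
            try:
                return func(*args, **kwargs)
            except self.retryable_exceptions:
                if attempt == self.max_retries:
                    raise                                 # out of retries: re-raise last error
                time.sleep(self._delay(attempt))
```

Non-retryable exceptions (anything outside `retryable_exceptions`) propagate immediately, which is one reasonable answer to follow-up 2.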
## Problem

Find specific scores from a dataset, such as top-K scores or scores within a range, likely using sorting or a heap.

## Likely LeetCode equivalent

LeetCode 215 - Kth Largest Element in an Array.

## Tags

sorting, heap, arrays, swe
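Since the exact prompt is ambiguous, a sketch covering both interpretations, top-K via a heap and range filtering via a scan:

```python
import heapq

def top_k_scores(scores: list[int], k: int) -> list[int]:
    # O(n log k): avoids fully sorting the input
    return heapq.nlargest(k, scores)

def scores_in_range(scores: list[int], lo: int, hi: int) -> list[int]:
    # O(n) scan; with sorted data, two binary searches give O(log n)
    return [s for s in scores if lo <= s <= hi]
```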
## Problem

Find the K smallest numbers from a large data stream or array efficiently using a max-heap.

## Likely LeetCode equivalent

LeetCode 703 - Kth Largest Element in a Stream.

## Tags

heap, arrays, sorting, swe
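The max-heap-of-size-k idea in O(n log k) time and O(k) space; since `heapq` is a min-heap, the standard trick is to store negated values:

```python
import heapq

def k_smallest(nums, k: int) -> list[int]:
    heap: list[int] = []  # negated values: the max of the k smallest sits at heap[0]
    for x in nums:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif -heap[0] > x:
            heapq.heapreplace(heap, -x)  # evict current largest of the k kept
    return sorted(-v for v in heap)
```

Because each element is seen once and only k values are retained, this works unchanged on a stream too large to hold in memory.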
## MongoDB SWE Phone - SnapID

## Problem

Design a snapshot ID system for versioned data, enabling point-in-time reads of key-value pairs at a given snapshot version.

## Likely LeetCode equivalent

LeetCode 1146 - Snapshot Array.

## Tags

design, arrays, binary_search, swe
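A sketch in the spirit of LeetCode 1146, adapted to a key-value store. The API (`put`/`snap`/`get`) is an assumption. Each key records its writes alongside the snapshot counter at write time; a read binary-searches for the last write at or before the requested snapshot:

```python
import bisect

class SnapshotStore:
    def __init__(self):
        self.ids: dict = {}    # key -> snap_ids in effect at each write (sorted)
        self.vals: dict = {}   # key -> values, parallel to self.ids
        self.snap_id = 0

    def put(self, key, value) -> None:
        self.ids.setdefault(key, []).append(self.snap_id)
        self.vals.setdefault(key, []).append(value)

    def snap(self) -> int:
        self.snap_id += 1
        return self.snap_id - 1  # id of the snapshot just taken

    def get(self, key, snap_id: int):
        ids = self.ids.get(key, [])
        # rightmost write whose snap counter is <= the requested snapshot
        i = bisect.bisect_right(ids, snap_id) - 1
        return self.vals[key][i] if i >= 0 else None
```

Storing per-key histories rather than copying the whole map on each snapshot makes `snap()` O(1) and `get()` O(log writes-per-key), the core insight the interviewer is after.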
## Problem

Schedule tasks with cooldown constraints, minimizing total time needed to execute all tasks while respecting CPU idle intervals.

## Likely LeetCode equivalent

LeetCode 621 - Task Scheduler.

## Tags

heap, greedy, sorting, swe
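The greedy counting solution: the most frequent task dictates the schedule's frame structure. With `max_count` occurrences of the hottest task and cooldown `n`, you need `(max_count - 1)` frames of width `n + 1`, plus one slot for each task tied at the maximum; if there are enough distinct tasks to fill all idle slots, the answer is simply `len(tasks)`:

```python
from collections import Counter

def least_interval(tasks: list[str], n: int) -> int:
    counts = Counter(tasks)
    max_count = max(counts.values())
    num_max = sum(1 for c in counts.values() if c == max_count)
    return max(len(tasks), (max_count - 1) * (n + 1) + num_max)
```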
## Round 1 - Coding

## Problem

Given a log of user login events `(user_id, timestamp, ip_address)`, detect suspicious activity: a user logging in from more than `K` distinct IP addresses within any rolling window of `W` minutes.

```python
from datetime import datetime

def detect_suspicious_logins(
    events: list[tuple[str, datetime, str]],
    max_ips: int,
    window_minutes: int
) -> list[str]:
    # returns list of user_ids flagged as suspicious
    ...
```

## Example

```
events = [
    ("u1", datetime(2024,1,1,10, 0), "1.1.1.1"),
    ("u1", datetime(2024,1,1,10,10), "2.2.2.2"),
    ("u1", datetime(2024,1,1,10,20), "3.3.3.3"),
    ("u1", datetime(2024,1,1,10,25), "4.4.4.4"),
    ("u2", datetime(2024,1,1,10, 0), "5.5.5.5"),
]
detect_suspicious_logins(events, max_ips=3, window_minutes=30)
# u1 has 4 distinct IPs in 25 minutes -> flagged
# u2 has 1 IP -> not flagged
# -> ["u1"]
```

## Follow-ups

1. How do you efficiently maintain the rolling window as new events arrive?
2. How would you handle events that arrive out of order by timestamp?
3. What additional signals (e.g., geolocation jumps, failed login ratio) would strengthen the detection?
4. How would you store and query these events in a database for historical analysis?
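A sketch grouping events per user and sliding a deque-backed window over each user's timeline. Sorting per user also handles out-of-order input (follow-up 2), at the cost of batch rather than streaming operation:

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def detect_suspicious_logins(
    events: list[tuple[str, datetime, str]],
    max_ips: int,
    window_minutes: int,
) -> list[str]:
    window = timedelta(minutes=window_minutes)
    by_user: dict[str, list] = defaultdict(list)
    for user, ts, ip in events:
        by_user[user].append((ts, ip))

    flagged = []
    for user, logs in by_user.items():
        logs.sort()                      # tolerate out-of-order events
        win: deque = deque()             # (ts, ip) pairs inside the current window
        for ts, ip in logs:
            win.append((ts, ip))
            while win and ts - win[0][0] > window:
                win.popleft()            # expire events older than W minutes
            if len({p for _, p in win}) > max_ips:
                flagged.append(user)
                break                    # one flag per user is enough
    return flagged
```

Recomputing the distinct-IP set per event is O(window size); keeping a per-IP counter that increments on append and decrements on expiry makes the distinct count O(1) per event, a natural answer to follow-up 1.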