Databricks Software Engineer Phone Screen Questions
30+ questions from real Databricks Software Engineer Phone Screen rounds, reported by candidates who interviewed there.
What does the Databricks Phone Screen round test?
The Databricks phone screen typically lasts 45-60 minutes and evaluates core software engineering fundamentals. Candidates should expect 1-2 algorithmic problems, a basic system design discussion at senior levels, and questions about relevant experience. The goal is to confirm technical competence before bringing candidates onsite.
Top Topics in This Round
Databricks Software Engineer Tech Phone Screen Interview Experience
The first round interview was a very short self-introduction, but the experience was good. The interviewer provided various clarifications, which were very clear. The question was to implement an OOD
Databricks Fulltime SDE Tech Phone Screen: Intermediate Coding and Array Interval Questions
Round 1: Coding Design Implement a simplified JSON parser capable of parsing custom-formatted log files or handling strings containing escape characters. The problem itself wasn't difficult. During th
Databricks Tech Phone Screen: Anagram Index Coding Challenge
The entire phone interview lasted 60 minutes. There was a 15-minute chat before and after the coding challenge, leaving 45 minutes for coding. One problem was very concise, described in a single line:
Freshly baked interview experience! The question was about finding a path using a Fibonacci tree, a very classic question, testing how to calculate the time complexity of recursion. After constructing t
Databricks Tech Phone Screen: CIDR and IP Address Matching Interview
The question was about CIDR <-> IP address. Although I saw some interview experiences on the forum, the descriptions were so vague that even ChatGPT couldn't explain the specific questions or how to a
Databricks | Technical Phone Screen | San Francisco
You are given n randomly generated connected graphs. You need to merge these n graphs into a single graph by randomly picking one graph and connecting it with another. Each...
Databricks | Technical Phone screen | BFS
Got a graph-based question to find the shortest and cheapest route from source to destination. Gave the correct solution and the interviewer seemed happy with it. Got an email (automated,...
Databricks | SWE | Bengaluru | Nov 2024 [Offer]
Company: Databricks Status: Selected Qualification: BTech in CSE from IIITH (2025 grad) Position: Software Engineer Location: Bengaluru Interview Process (On-campus hiring): Resume Shortlisting → Two Technical Rounds → Hiring Manager Round Round 1: Technical Interview Problem Statement: Given...
Databricks Tech Phonescreen L4 Experience
Had phone screen earlier this week. Was asked ip to cidr question seen here: https://leetcode.com/discuss/interview-question/5743277/Databricks-L5-(SSE)-Technical-Phone-Screen-or-Cleared/ Seems they like asking this question a lot in screening regardless of level (L4/L5). Interviewer was...
Anyone have experience with the Databricks New Grad interview process?
passed the phone screen so i now have 4 rounds scheduled in 2 weeks. the rounds are apparently gonna be Algos / DSA, Coding + Debugging, System Design, and Behavioral. does anyone know what to expect
DataBricks Online Virtual Rounds | Rejected
I recently interviewed at Databricks and unfortunately didn't make it, but I wanted to help the community, so I'm writing up my experience. HR reached out to me for one...
Databricks Phone Screen
Bad interview experience (and reject). I would say I wasn't prepared enough for the interview since I hadn't seen the ip to cidr question and just understanding what cidr is...
Databricks India Interview Experience (L5) Rejected
YOE: 6 years. I recently had an opportunity to interview at Databricks. There were a total of 5 rounds scheduled as follows. 1. Two coding rounds. (It was expected to write fully...
## Round 1 - System Design

## Problem

Design an online bookstore serving 1 million daily active users. The system must support:

- **Catalog search**: full-text search by title, author, ISBN, genre. Results ranked by relevance and rating.
- **Inventory management**: each book has a stock count. Prevent overselling.
- **Cart and checkout**: users can add books, apply promo codes, and complete purchase.
- **Order history**: users can view past orders and download receipts.

Walk through your high-level architecture, key data models, and the trickiest consistency problem in this system.

```
Key entities:
Book(id, title, author, isbn, price, stock_count)
User(id, email, address)
Cart(user_id, items: [{book_id, qty}])
Order(id, user_id, items, total, status, created_at)
```

## Follow-ups

1. Two users add the last copy of a book to their carts simultaneously. How do you handle checkout to prevent overselling? Walk through your locking or reservation strategy.
2. Search must return results in under 100ms. Where does Elasticsearch fit, and how do you keep it in sync with your primary DB?
3. How would you implement "frequently bought together" recommendations without a full ML pipeline?
4. Design the promo code system: each code can be single-use, limited-count, or percentage-based. Where is validation enforced?
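One common answer to the overselling follow-up is a conditional, atomic decrement at checkout time. The sketch below illustrates that idea with SQLite from the Python standard library. The table name `book`, the helper `reserve_stock`, and the choice of SQLite are assumptions made for illustration; the question does not prescribe a database or a locking strategy.

```python
import sqlite3


def reserve_stock(conn: sqlite3.Connection, book_id: int, qty: int) -> bool:
    """Hypothetical helper: decrement stock only if enough copies remain.

    The WHERE clause makes the check and the decrement a single atomic
    statement, so two concurrent checkouts cannot both take the last copy.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE book SET stock_count = stock_count - ? "
            "WHERE id = ? AND stock_count >= ?",
            (qty, book_id, qty),
        )
        # rowcount is 1 when the row matched and was updated, 0 otherwise.
        return cur.rowcount == 1


# Demo with an in-memory database holding a single last copy.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, stock_count INTEGER)"
)
conn.execute("INSERT INTO book VALUES (1, 'Last Copy', 1)")
conn.commit()

first = reserve_stock(conn, 1, 1)   # True: the first buyer gets the copy
second = reserve_stock(conn, 1, 1)  # False: stock is already 0
```

Adding a book to the cart does not reserve anything in this sketch; the stock check happens only at checkout. A reservation with a timeout is the usual alternative when carts should hold inventory.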
## Round 1 - Coding / System Design

## Problem

Design and partially implement a REST API for managing customer revenue data. The system tracks `Account` and `RevenueRecord` entities:

```
Account: {id, name, tier: ["free"|"pro"|"enterprise"], created_at}
RevenueRecord: {id, account_id, amount, currency, period_start, period_end}
```

Required endpoints:

- `POST /accounts`: create account
- `GET /accounts/{id}/revenue`: return all revenue records, with optional `?period=YYYY-MM` filter
- `GET /accounts/{id}/revenue/summary`: return total revenue, avg per period, and MoM growth
- `DELETE /accounts/{id}`: soft-delete account and all associated records

```python
# Pseudo-schema
class AccountRouter:
    def create_account(self, name: str, tier: str) -> Account: ...
    def get_revenue(self, account_id: str, period: str | None) -> list[RevenueRecord]: ...
    def get_summary(self, account_id: str) -> dict: ...
    def delete_account(self, account_id: str) -> None: ...
```

## Follow-ups

1. How do you implement MoM growth when some months have no revenue? What do you return for a null denominator?
2. Describe your DB schema, including indexes. Which columns need indexes for the period filter to be efficient?
3. The `DELETE` is soft-delete. How do all `GET` endpoints need to change to respect deleted records?
4. A finance team needs to export 5 years of revenue for all enterprise accounts as CSV. How does your API handle bulk export without timing out?
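For the MoM growth follow-up, one possible convention is to treat a missing month as zero revenue and to return `None` whenever the previous month's total is zero or absent. The sketch below assumes revenue has already been aggregated into a dict keyed by `YYYY-MM`; the function names `next_month` and `mom_growth` are hypothetical, and other conventions (for example, skipping empty months) are equally defensible.

```python
from __future__ import annotations


def next_month(period: str) -> str:
    """Return the 'YYYY-MM' string that follows the given period."""
    year, month = map(int, period.split("-"))
    year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return f"{year:04d}-{month:02d}"


def mom_growth(totals: dict[str, float]) -> dict[str, float | None]:
    """Month-over-month growth per period, from first to last known month.

    Assumptions: months with no records count as 0.0 revenue, and growth is
    None when there is no previous month or the previous total is 0.
    """
    if not totals:
        return {}
    # Zero-padded 'YYYY-MM' strings sort chronologically.
    period, last = min(totals), max(totals)
    growth: dict[str, float | None] = {}
    prev = None
    while True:
        current = totals.get(period, 0.0)
        growth[period] = (current - prev) / prev if prev else None
        if period == last:
            return growth
        prev = current
        period = next_month(period)


result = mom_growth({"2024-01": 100.0, "2024-02": 150.0, "2024-04": 75.0})
# {'2024-01': None, '2024-02': 0.5, '2024-03': -1.0, '2024-04': None}
```

March has no records, so it shows a 100% drop, and April's growth is undefined because the denominator is zero.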
## Problem

Implement a two-pass encoder/decoder. Given a string, first apply run-length encoding (consecutive repeated chars become `char+count`), then apply a substitution cipher where each encoded character is shifted by its 1-based position in the output string (mod 26, letters only, leave digits unchanged).

```python
def encode(s: str) -> str: ...
def decode(s: str) -> str: ...
```

```
Encode steps for "aaabbc":
Step 1 (RLE):   "a3b2c1"
Step 2 (shift): position 1='a' shift 1->'b', digit '3' unchanged,
                position 3='b' shift 3->'e', digit '2' unchanged,
                position 5='c' shift 5->'h', digit '1' unchanged
Final: "b3e2h1"
Decode("b3e2h1") -> "aaabbc"
```

## Follow-ups

1. Your decoder must invert both passes in reverse order. Walk through the decode algorithm step by step.
2. What inputs break your RLE step? (Hint: digits in the original string.)
3. How would you make the encoder streamable, processing characters without loading the full string into memory?
4. If you needed to transmit the encoded string over a network with a 100-byte MTU, how would you chunk and reassemble it safely?
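A minimal sketch of the two passes, assuming the input contains only lowercase `a-z` (no digits, which is exactly the case follow-up 2 asks about). The helper `_shift` is a hypothetical name; positions count every character of the RLE string, digits included, which matches the worked example above.

```python
import re
from itertools import groupby


def _shift(text: str, sign: int) -> str:
    """Shift each lowercase letter by sign * (its 1-based position), mod 26."""
    out = []
    for pos, ch in enumerate(text, start=1):
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + sign * pos) % 26 + ord("a")))
        else:
            out.append(ch)  # digits pass through unchanged
    return "".join(out)


def encode(s: str) -> str:
    # Pass 1: run-length encode, always writing the count (even when it is 1).
    rle = "".join(f"{ch}{len(list(run))}" for ch, run in groupby(s))
    # Pass 2: position-dependent shift.
    return _shift(rle, +1)


def decode(s: str) -> str:
    # Undo pass 2 first: digits were never shifted, so positions still line up.
    rle = _shift(s, -1)
    # Undo pass 1: expand every letter by its (possibly multi-digit) count.
    return "".join(ch * int(count) for ch, count in re.findall(r"([a-z])(\d+)", rle))


encoded = encode("aaabbc")  # "b3e2h1"
decoded = decode(encoded)   # "aaabbc"
```

Because the shift leaves digits in place, the decoder can recompute each letter's position directly from the encoded string without knowing the run lengths in advance.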
## Problem

Simulate a block falling through a grid, tracking its position as it drops and possibly lands on obstacles.

## Tags

matrix, arrays
## Problem

Process a set of intervals to merge overlapping ranges or find coverage gaps.

## Tags

arrays, sorting
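The problem statement is brief, so the sketch below fills in details as assumptions: intervals are closed `(start, end)` pairs, intervals that touch or overlap are merged, and a gap is the open space between two consecutive merged intervals. The function names are hypothetical.

```python
def merge_intervals(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping or touching intervals after sorting by start."""
    merged: list[list[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the last merged interval: extend it if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


def coverage_gaps(intervals: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the uncovered ranges between consecutive merged intervals."""
    merged = merge_intervals(intervals)
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(merged, merged[1:])
    ]


sample = [(8, 10), (1, 3), (2, 6), (15, 18)]
merged = merge_intervals(sample)  # [(1, 6), (8, 10), (15, 18)]
gaps = coverage_gaps(sample)      # [(6, 8), (10, 15)]
```

Sorting dominates the cost, so both functions run in O(n log n) time.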
## Problem

Implement a `LazyArray` that supports bulk range updates efficiently. All values start at 0.

- `update(l, r, val)`: add `val` to every element in index range `[l, r]` inclusive.
- `query(i)`: return the current value at index `i`.
- `range_query(l, r)`: return the sum of elements in `[l, r]`.

Naive O(n) updates are not acceptable. Use lazy propagation.

```python
class LazyArray:
    def __init__(self, n: int): ...
    def update(self, l: int, r: int, val: int) -> None: ...
    def query(self, i: int) -> int: ...
    def range_query(self, l: int, r: int) -> int: ...
```

```
n = 5, all zeros:  [0, 0, 0, 0, 0]
update(1, 3, 4)  -> [0, 4, 4, 4, 0]
update(2, 4, 2)  -> [0, 4, 6, 6, 2]
query(2)         -> 6
range_query(1,4) -> 18
```

## Follow-ups

1. What is the time complexity of your `update` and `range_query` with a segment tree + lazy propagation?
2. If updates are multiplicative instead of additive, how does the lazy tag change?
3. Can you implement this with a Fenwick tree (BIT) instead? What operation would you need two BITs for?
4. How does your implementation behave with negative `val` updates? Walk through an example.
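One way to meet the interface is a recursive sum segment tree with additive lazy tags, sketched below. It assumes `n >= 1` and 0-based inclusive indices; the private helper names are illustrative only. Each `update` and `range_query` touches O(log n) nodes.

```python
class LazyArray:
    """Range-add, range-sum segment tree with lazy propagation (a sketch)."""

    def __init__(self, n: int):
        self.n = n
        size = 4 * max(n, 1)       # 4n slots are enough for a recursive tree
        self.sums = [0] * size     # sums[node] = sum of the node's range
        self.lazy = [0] * size     # pending add not yet pushed to children

    def _apply(self, node: int, lo: int, hi: int, val: int) -> None:
        self.sums[node] += val * (hi - lo + 1)
        self.lazy[node] += val

    def _push_down(self, node: int, lo: int, hi: int) -> None:
        if self.lazy[node]:
            mid = (lo + hi) // 2
            self._apply(2 * node, lo, mid, self.lazy[node])
            self._apply(2 * node + 1, mid + 1, hi, self.lazy[node])
            self.lazy[node] = 0

    def _update(self, node: int, lo: int, hi: int, l: int, r: int, val: int) -> None:
        if r < lo or hi < l:       # no overlap
            return
        if l <= lo and hi <= r:    # fully covered: tag and stop
            self._apply(node, lo, hi, val)
            return
        self._push_down(node, lo, hi)
        mid = (lo + hi) // 2
        self._update(2 * node, lo, mid, l, r, val)
        self._update(2 * node + 1, mid + 1, hi, l, r, val)
        self.sums[node] = self.sums[2 * node] + self.sums[2 * node + 1]

    def _query(self, node: int, lo: int, hi: int, l: int, r: int) -> int:
        if r < lo or hi < l:
            return 0
        if l <= lo and hi <= r:
            return self.sums[node]
        self._push_down(node, lo, hi)
        mid = (lo + hi) // 2
        return (self._query(2 * node, lo, mid, l, r)
                + self._query(2 * node + 1, mid + 1, hi, l, r))

    def update(self, l: int, r: int, val: int) -> None:
        self._update(1, 0, self.n - 1, l, r, val)

    def query(self, i: int) -> int:
        return self.range_query(i, i)

    def range_query(self, l: int, r: int) -> int:
        return self._query(1, 0, self.n - 1, l, r)


arr = LazyArray(5)
arr.update(1, 3, 4)
arr.update(2, 4, 2)
point = arr.query(2)           # 6
total = arr.range_query(1, 4)  # 18
```

Negative values need no special handling here, since the lazy tag is a plain running sum: `arr.update(0, 4, -1)` after the calls above would bring `range_query(0, 4)` from 18 down to 13.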