1p3a_oj Question

In-memory KV Cache with Hit Count and Unit Tests

Question Details

Question: Implement an In-memory KV Cache with Hit Count and Unit Tests

Implement an in-memory key-value cache that supports get/put and maintains hit count statistics.

Requirements

Design a class KVCache:

  • put(key: str, value: str) -> None
      • If key exists, update its value.
      • If key is new and the cache is full, evict one key according to an eviction policy you choose (e.g., LRU).
  • get(key: str) -> Optional[str]
      • If key exists, return its value and increment hit_count[key] by 1.
      • Otherwise return None.
  • hit_count(key: str) -> int
      • Return how many times this key has been hit by get (misses do not count). If the key does not exist or never existed, return 0.

Constraints

  • Initialize with a positive integer capacity; eviction is required when full.
  • Keys and values are strings.
  • No persistence / no distribution required.
  • State your target time complexity (e.g., amortized O(1) for both ops).
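The requirements and constraints above can be sketched as follows. This is a minimal sketch, assuming LRU as the chosen eviction policy and assuming that an evicted key's hit count is dropped (so hit_count returns 0 for it, in line with "does not exist"). The private attribute names are illustrative, not part of the problem statement.

```python
from collections import OrderedDict
from typing import Dict, Optional


class KVCache:
    """LRU-evicting cache with per-key hit counts (one possible design)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self._capacity = capacity
        # Insertion order doubles as recency order: oldest entry first.
        self._data: "OrderedDict[str, str]" = OrderedDict()
        self._hits: Dict[str, int] = {}

    def put(self, key: str, value: str) -> None:
        if key in self._data:
            # Update in place and mark as most recently used.
            self._data[key] = value
            self._data.move_to_end(key)
            return
        if len(self._data) >= self._capacity:
            # Evict the least recently used key and forget its hits
            # (assumption: hit counts do not survive eviction).
            evicted, _ = self._data.popitem(last=False)
            self._hits.pop(evicted, None)
        self._data[key] = value
        self._hits[key] = 0

    def get(self, key: str) -> Optional[str]:
        if key not in self._data:
            return None  # a miss never changes any hit count
        self._data.move_to_end(key)
        self._hits[key] += 1
        return self._data[key]

    def hit_count(self, key: str) -> int:
        return self._hits.get(key, 0)


if __name__ == "__main__":
    cache = KVCache(2)
    cache.put("a", "1")
    cache.put("b", "2")
    cache.get("a")
    cache.put("c", "3")          # "b" is the least recently used key
    print(cache.get("b"))        # None
    print(cache.hit_count("a"))  # 1
```

With this layout, put, get, and hit_count each do a constant number of dictionary and OrderedDict operations, so the target of O(1) average time per operation is met.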

Unit Tests

Write unit tests that cover at least:

  1. Basic put/get behavior.
  2. Hit count increments on hits and does not change on misses.
  3. Behavior under eviction: evicted keys return None; hit count matches your definition.
  4. Updating an existing key with repeated put.

Scale

  • Number of operations N: 1 <= N <= 200000
  • capacity: 1 <= capacity <= 100000

I/O (online-judge style)

Input:

  • First line: two integers, capacity and Q (the number of operations)
  • Next Q lines, each one of:
      • PUT key value
      • GET key
      • HIT key (query the hit count of key)

Output:

  • For each GET: print the value on hit, otherwise NULL
  • For each HIT: print the integer hit count

Sample Input

2 9
PUT a 1
PUT b 2
GET a
HIT a
GET c
HIT c
PUT c 3
GET b
GET c

Sample Output

1
1
NULL
0
NULL
3

Test Cases

Case 1

Input:

2 9
PUT a 1
PUT b 2
GET a
HIT a
GET c
HIT c
PUT c 3
GET b
GET c

Output:

1
1
NULL
0
NULL
3

Case 2

Input:

1 6
PUT a 1
GET a
PUT b 2
GET a
GET b
HIT b

Output:

1
NULL
2
1

Case 3

Input:

2 8
PUT a 1
PUT a 9
GET a
HIT a
PUT b 2
PUT c 3
GET a
GET b

Output:

9
1
9
NULL

Case 4

Input:

3 7
GET x
HIT x
PUT x 7
HIT x
GET x
HIT x
GET y

Output:

NULL
0
0
7
1
NULL

Case 5

Input:

2 10
PUT a 1
PUT b 2
GET a
GET a
HIT a
PUT c 3
GET b
HIT b
GET c
HIT c

Output:

1
1
2
NULL
0
3
1
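
A note on replaying the cases above. Pure LRU reproduces Cases 1, 2, 4, and 5, but not Case 3: at PUT c 3, LRU would evict a (b was inserted more recently), which gives NULL and then 2 rather than the listed 9 and NULL. All five listed outputs are consistent with evicting the key that has the fewest hits and breaking ties by least recent use. That policy is inferred from the cases, not stated in the problem. The sketch below assumes it; HitAwareCache and run are hypothetical names, and eviction uses a heap with lazy deletion, so it costs O(log n) amortized rather than O(1).

```python
import heapq
from typing import Dict, List, Optional, Tuple


class HitAwareCache:
    """Hypothetical cache: evict the fewest-hits key, LRU on ties."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be a positive integer")
        self.capacity = capacity
        self.values: Dict[str, str] = {}
        self.state: Dict[str, Tuple[int, int]] = {}  # key -> (hits, last tick)
        self.heap: List[Tuple[int, int, str]] = []   # may contain stale entries
        self.tick = 0

    def _touch(self, key: str, hits: int) -> None:
        # Record the new (hits, recency) pair and push it; older heap
        # entries for the same key become stale and are skipped later.
        self.tick += 1
        self.state[key] = (hits, self.tick)
        heapq.heappush(self.heap, (hits, self.tick, key))

    def _evict(self) -> None:
        while self.heap:
            hits, tick, key = heapq.heappop(self.heap)
            if self.state.get(key) == (hits, tick):  # entry is current
                del self.values[key]
                del self.state[key]
                return

    def put(self, key: str, value: str) -> None:
        is_new = key not in self.values
        if is_new and len(self.values) >= self.capacity:
            self._evict()
        hits = 0 if is_new else self.state[key][0]
        self.values[key] = value
        self._touch(key, hits)

    def get(self, key: str) -> Optional[str]:
        if key not in self.values:
            return None
        self._touch(key, self.state[key][0] + 1)
        return self.values[key]

    def hit_count(self, key: str) -> int:
        return self.state[key][0] if key in self.state else 0


def run(text: str) -> str:
    """Process judge-style input and return the expected output text."""
    lines = text.strip().splitlines()
    capacity, q = map(int, lines[0].split())
    cache = HitAwareCache(capacity)
    out: List[str] = []
    for line in lines[1:1 + q]:
        parts = line.split(maxsplit=2)
        if parts[0] == "PUT":
            cache.put(parts[1], parts[2])
        elif parts[0] == "GET":
            value = cache.get(parts[1])
            out.append("NULL" if value is None else value)
        elif parts[0] == "HIT":
            out.append(str(cache.hit_count(parts[1])))
    return "\n".join(out)
```

For example, `run("2 8\nPUT a 1\nPUT a 9\nGET a\nHIT a\nPUT b 2\nPUT c 3\nGET a\nGET b")` returns the four lines 9, 1, 9, NULL listed for Case 3. In an interview it is worth asking which eviction policy the judge expects before coding.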

About This Question

This is a reported interview question from a Databricks interview for a SWE role during the coding round.

It covers the following topics: Strings, Probability Stats.

Difficulty rating: Easy

About Databricks Interview Reports

This question was reported by a candidate who interviewed at Databricks. LeakCode aggregates interview reports from 10+ sources, including 1Point3Acres, Glassdoor, LeetCode Discuss, Blind, Reddit, Indeed, and Nowcoder. Each report is translated where necessary, deduplicated against existing entries, and tagged by company, role, round type, and reporting date.

Use this question as one calibration data point, not a memorization target. Companies typically rotate their question pools every 2-4 months; the exact wording of a 2024 question may differ from what you encounter today. The underlying pattern, difficulty level, and follow-up depth at Databricks are the higher-signal extractions to take from this report.

For broader preparation context, the Databricks interview process typically includes a recruiter screen, one or two technical phone screens, and a 4-5 round on-site loop covering coding, system design (at L4+ levels), and behavioral. Reports tagged on LeakCode show the round-by-round distribution and typical difficulty calibration.

How To Practice This Type of Question

Solve similar problems on LeetCode under timed conditions (25-35 minutes per medium-difficulty problem). The goal is pattern recognition: identify the underlying technique (sliding window, two-pointer, BFS, memoized recursion, etc.) within 60-90 seconds of reading. Strong candidates state their hypothesis out loud before coding, then iterate based on feedback. Weak candidates dive into implementation immediately, lose time on the wrong approach, and run out of time for follow-ups.

Because question pools rotate, the exact wording of any given question may have been retired by the time you interview. Focus your prep on the pattern, not the specific problem. The patterns that appear consistently in Databricks reports are the ones worth investing in; one-off niche problems are not.

During Your Databricks Round

Apply the standard interview round template: clarify requirements (2-3 minutes), state your approach out loud and confirm direction with the interviewer (3-5 minutes), code with narration (15-25 minutes), test with concrete examples including edge cases (5 minutes), and discuss optimization or trade-offs if time permits (5 minutes). This template is widely accepted across FAANG and adjacent companies; deviating from it tends to give the interviewer a weaker feedback signal.

The single most predictive failure mode in Databricks reports tagged "no hire": not asking clarifying questions. Interviewers are explicitly trained to weight this. Strong candidates ask 3-5 clarifying questions even on problems that look obvious; weak candidates dive into code immediately. The clarifying-question check is often the first signal recorded in the interviewer's written notes.