InterviewDB Experience

Text Ingestion Pipeline - System Design for Scalable Document Ingestion and Indexing

Interview Experience

Round 1 System Design

Problem

Design a text ingestion pipeline that accepts large volumes of raw documents (PDFs, HTML, plain text) from multiple upstream sources, processes them, and makes them searchable within 60 seconds of arrival. The system must handle 100K documents/day and support full-text search with relevance ranking.
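A quick capacity estimate helps frame the problem before drawing boxes. The sketch below converts the stated 100K documents/day into an average arrival rate and sizes a worker pool using Little's law. The peak factor and the per-document processing time are assumptions chosen for illustration; neither appears in the report.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(docs_per_day: int) -> float:
    """Average arrivals per second for a given daily volume."""
    return docs_per_day / SECONDS_PER_DAY


def workers_needed(docs_per_day: int, peak_factor: float, avg_process_seconds: float) -> int:
    """Little's law: concurrency = arrival rate * time spent in the system.

    peak_factor and avg_process_seconds are assumed inputs, not figures
    from the interview report.
    """
    peak_rate = average_rate(docs_per_day) * peak_factor
    return math.ceil(peak_rate * avg_process_seconds)


if __name__ == "__main__":
    print(f"average rate: {average_rate(100_000):.2f} docs/s")
    print("workers at 10x peak, 5 s per doc:", workers_needed(100_000, 10, 5.0))
```

At roughly one document per second on average, throughput is modest; the harder constraints are bursts, large files, and the 60-second latency target.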

Requirements

  • Ingest from: REST upload, S3 event trigger, webhook.
  • Processing: extract text, detect language, chunk into passages, embed for semantic search.
  • Storage: raw store (S3), metadata (Postgres), vector index (e.g. Pinecone or pgvector).
  • SLA: p95 end-to-end ingest latency < 60s.
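The "chunk into passages" step can be sketched as a sliding window over words with some overlap, so that a sentence split across a boundary still appears whole in one passage. This is a minimal illustration: the function name, the word-based splitting, and the default sizes are assumptions, and a real pipeline might chunk by tokens or sentences instead.

```python
def chunk_passages(text: str, max_words: int = 200, overlap: int = 40) -> list[str]:
    """Split text into passages of up to max_words words.

    Consecutive passages share `overlap` words. Sizes are illustrative
    defaults, not values from the interview report.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    passages = []
    for start in range(0, len(words), step):
        passages.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return passages


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for passage in chunk_passages(sample, max_words=4, overlap=1):
        print(passage)
```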

Design Sketch

Upload API -> Queue (SQS/Kafka) -> Worker Pool
  Worker: extract text -> chunk -> embed -> write metadata + vectors
  DLQ for failed docs -> alerting
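The worker loop in the sketch above, with bounded retries and a dead-letter queue, might look roughly like this. An in-process `queue.Queue` and a plain list stand in for SQS/Kafka and the DLQ; `Job`, `run_worker`, and `MAX_ATTEMPTS` are hypothetical names, and the `process` callback represents the extract, chunk, embed, and write steps.

```python
import queue
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 3  # assumed retry budget before a job is dead-lettered


@dataclass
class Job:
    doc_id: str
    raw: bytes
    attempts: int = 0


def run_worker(jobs: "queue.Queue[Job]", dlq: list, process: Callable[[Job], None]) -> None:
    """Drain the queue, retrying failed jobs and dead-lettering repeat failures."""
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return  # nothing left to do in this sketch
        try:
            process(job)  # extract text -> chunk -> embed -> write
        except Exception:
            job.attempts += 1
            if job.attempts >= MAX_ATTEMPTS:
                dlq.append(job)  # alerting would hook in here
            else:
                jobs.put(job)  # re-enqueue for another attempt


if __name__ == "__main__":
    def process(job: Job) -> None:
        if job.doc_id == "bad":
            raise ValueError("unparseable document")

    q: "queue.Queue[Job]" = queue.Queue()
    q.put(Job("ok", b"hello"))
    q.put(Job("bad", b"\x00"))
    dead: list = []
    run_worker(q, dead, process)
    print([j.doc_id for j in dead])
```

Because a failed job is re-enqueued and may be processed more than once, the write step needs to be idempotent, which is the first discussion point below.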

Discussion Points

  • How do you handle retries without double-indexing a document?
  • How do you manage embedding model versioning — re-embedding 10M docs after a model upgrade?
  • Schema for the metadata table: what columns, what indexes?
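One possible answer to the first and third points is to derive an idempotency key from the document content and let a unique constraint reject duplicates. The sketch below uses SQLite from the standard library as a stand-in for Postgres so it runs anywhere; the table layout, column names, and `index_once` are assumptions for illustration. In Postgres the equivalent statement would typically be `INSERT ... ON CONFLICT DO NOTHING`. Including the embedding model in the key also touches the second point: a new model version produces new rows rather than colliding with old ones.

```python
import hashlib
import sqlite3

# Hypothetical metadata table; columns are illustrative, not from the report.
SCHEMA = """
CREATE TABLE documents (
    doc_id          INTEGER PRIMARY KEY,
    content_sha256  TEXT NOT NULL,
    embedding_model TEXT NOT NULL,
    source          TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'indexed',
    UNIQUE (content_sha256, embedding_model)
);
"""


def index_once(conn: sqlite3.Connection, raw: bytes, source: str, embedding_model: str) -> bool:
    """Record a document; return True only if this call inserted a new row.

    A retried delivery of the same bytes under the same model is a no-op.
    """
    digest = hashlib.sha256(raw).hexdigest()
    cur = conn.execute(
        "INSERT OR IGNORE INTO documents (content_sha256, embedding_model, source) "
        "VALUES (?, ?, ?)",
        (digest, embedding_model, source),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    print(index_once(conn, b"hello", "rest", "model-v1"))  # first delivery
    print(index_once(conn, b"hello", "rest", "model-v1"))  # retried delivery
```

The worker would write vectors only when `index_once` returns True, so a redelivered message does not create a second set of embeddings.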

Follow-ups

  1. How does your design change if documents can be updated or deleted after ingestion?
  2. How do you handle a 5 GB PDF that exceeds Lambda/worker memory limits?
  3. How do you monitor ingestion lag and alert when the queue backs up?
  4. How would you prioritize VIP customer uploads over standard ones?
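For follow-up 4, one simple in-process model is a priority queue keyed by customer tier, with a sequence number to keep arrival order within a tier. The class and tier constants below are hypothetical. With a managed queue such as SQS, a common alternative is one queue per tier, with workers polling the VIP queue first.

```python
import heapq
import itertools

VIP, STANDARD = 0, 1  # lower number is served first (assumed tiers)


class PriorityJobQueue:
    """Serve VIP jobs before standard ones, FIFO within each tier."""

    def __init__(self) -> None:
        self._heap: list = []
        self._seq = itertools.count()  # tie-breaker preserving arrival order

    def push(self, doc_id: str, tier: int) -> None:
        heapq.heappush(self._heap, (tier, next(self._seq), doc_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    q = PriorityJobQueue()
    q.push("a", STANDARD)
    q.push("b", VIP)
    q.push("c", STANDARD)
    q.push("d", VIP)
    print([q.pop() for _ in range(len(q))])
```

Strict priority can starve standard uploads during a sustained VIP burst, so a production design would likely reserve some worker capacity for the standard tier.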

About This Question

This is a candidate experience report from a Temporal interview, reported for the phone round.

It covers the following topics: Phone, System Design, Queue, Onsite.

