course-schedule
Difficulty: Medium · Topic: Graphs · Practice it on parikshan: /practice/course-schedule/start
The story
The registrar of a large university hands you a list of courses and their prerequisites. The dean wants a one-line answer before the next senate meeting: can a student, in principle, complete every course offered? Or has someone, somewhere in the catalogue, accidentally written a rule that says course A requires B which requires A?
That question is a cycle-detection query on a directed graph. The same algorithm runs when npm checks for dependency cycles before publishing a package, when Bazel and Gradle decide whether a build graph is well-formed, when Make refuses to run because target X depends on target Y which depends on X, and when AWS CloudFormation rejects a stack because two resources reference each other circularly. It is, alongside topological sort, the most-shipped graph algorithm in working software.
What's actually being asked
n courses labelled 0..n-1. m rules of the form u v meaning "course u requires course v". Print true if there exists an order in which every course can be completed, false if the prerequisites form a cycle. Self-loops count as cycles; parallel edges do not.
The naive approach
Try every permutation of courses, check whether each is a valid order. O(n!). For n = 20 that is 2.4 quintillion. Useless. Graph traversal solves it in O(n + m).
The key insight
Every prerequisite rule u v is a directed edge from v (prerequisite) to u (dependent). A valid completion order exists if and only if this directed graph has no cycle, i.e., it is a DAG. So the question reduces to directed cycle detection.
Two canonical algorithms solve this in O(n + m). Both belong in your fingers; you will use them in your career more times than every other graph algorithm except plain BFS.
Kahn's algorithm (BFS-style, in-degree counting)
Compute the in-degree of every node. Seed a queue with every node of in-degree 0 (a "source"). Repeatedly: dequeue a node, count it as "taken", decrement the in-degree of each successor, enqueue successors that just dropped to in-degree 0. When the queue empties, you have taken some prefix of the nodes. If the taken count equals n, the graph is a DAG and the order in which you popped is a valid topological sort. If the taken count is less than n, the un-taken nodes form a sub-graph where every node has at least one un-taken predecessor, a cycle.
Three-colour DFS
Mark each node white (unseen), grey (currently on the DFS stack), or black (finished). DFS from each white node: paint grey on entry, recurse on every neighbour, paint black on exit. If during recursion you encounter a grey neighbour, that is a back edge that closes a cycle, return immediately. A black neighbour is a forward or cross edge and is safe. White neighbours recurse.
Both run in O(n + m). Kahn is iterative and easy to write under time pressure; three-colour DFS is the foundation of every more sophisticated graph algorithm (Tarjan's SCC, dominator trees, articulation points). Master both.
Step-by-step approach
- Read
n,m. Initialiseindeg = [0] * n,adj = [[] for _ in range(n)]. - For each rule
u v:adj[v].append(u),indeg[u] += 1. - Initialise a deque with every
iwhereindeg[i] == 0. seen = 0.- While the deque is non-empty:
- Pop
u.seen += 1. - For each
winadj[u]:indeg[w] -= 1. Ifindeg[w] == 0, appendw.
- Pop
- Print
'true'ifseen == n, else'false'.
Step-by-step approach (three-colour DFS, iterative)
color = [0] * nwith 0=white, 1=grey, 2=black.- For each
vin0..n-1withcolor[v] == 0:- Push
(v, iter(adj[v]))onto a manual stack, setcolor[v] = 1. - While the stack is non-empty:
- Peek the top frame
(u, it). Try to advance the iterator. - If next neighbour
w: ifcolor[w] == 1return'true'(cycle); ifcolor[w] == 0, push(w, iter(adj[w])), setcolor[w] = 1; ifcolor[w] == 2, skip. - If iterator exhausted: set
color[u] = 2, pop the frame.
- Peek the top frame
- Push
- If no cycle ever fired, print
'false'(the'true'/'false'mapping is "is there a cycle?" inverted to "can the student finish?"; the problem asks the latter, so flip).
Note the flip: the algorithms detect cycles (truthy means a cycle); the problem returns 'true' for "no cycle". Match the spec at the print site.
Worked example
n = 4, m = 4
1 0
2 0
3 1
3 2
Adjacency from prerequisite to dependent: adj = [[1,2], [3], [3], []]. In-degrees [0, 1, 1, 2].
- Initial queue:
[0]. - Pop
0.seen = 1. Decrementindeg[1] -> 0,indeg[2] -> 0. Queue:[1, 2]. - Pop
1.seen = 2. Decrementindeg[3] -> 1. Queue:[2]. - Pop
2.seen = 3. Decrementindeg[3] -> 0. Queue:[3]. - Pop
3.seen = 4. Queue empty. seen == n, print'true'.
A cycle example would be 1 0 and 0 1. Initial in-degrees both 1. The queue starts empty. The loop terminates immediately with seen = 0 < 2. Print 'false'.
Complexity
- Time: O(n + m).
- Space: O(n + m) for the adjacency, O(n) for the in-degree array and the queue.
Common pitfalls
- Edge direction confusion: "course
urequires coursev" parses to an edgev -> u, notu -> v. Get this backwards and the in-degree array is the in-degree of the wrong nodes. - Self-loops counted as DAGs: a self-loop
u uadds 1 toindeg[u]with no offsetting decrement, sounever enters the queue, and Kahn's correctly returns'false'. The three-colour DFS seesugrey then sees itself grey on the first neighbour step. - Parallel edges counted as cycles: two copies of
u vadd 2 toindeg[u]and 2 entries toadj[v]. Both decrement whenvis popped. Output is unchanged. - Recursive DFS on a 1000-node chain: Python's default recursion depth is 1000. Iterative DFS or
sys.setrecursionlimit(...)is required. - Forgetting to scan every node (DFS path): a single DFS from node 0 misses disconnected components. The outer scan over
0..n-1covers them.
Where this shows up in the enterprise
- Bazel and Gradle build DAGs: dependency cycle detection on build targets is exactly this algorithm. A failed build with "cycle detected: A -> B -> A" is Kahn's or three-colour DFS.
- npm and pip dependency cycles: package managers refuse to install when the resolution graph has a cycle.
- AWS CloudFormation dependency: stack creation aborts when resource A
DependsOnB whichDependsOnA. - Spreadsheets (Excel, Google Sheets): cell A1 referencing B1 referencing A1 produces a "circular reference" error, which is this algorithm running on the formula DAG.
- Datadog service maps: detecting a service-call cycle at design time prevents production deadlocks.
- kubernetes Pod scheduling: detecting an initContainer dependency cycle before scheduling.
- Make: refuses to build when
target.odepends ontarget.cwhich depends ontarget.o. - Database migrations: a forward-only schema-migration framework runs Kahn's on the migration DAG to compute the order. Cycle = corrupt migration history.
- University catalogue tools (Banner, PeopleSoft): the literal university registrar story from the top.
The pattern "dependency manager refusing a circular plan" is one of the most-shipped pieces of code on the planet.
Variants you'll see elsewhere
course-schedule-ii: not just "can it be done" but "in what order"; Kahn returns the order; on a cycle, return empty.alien-dictionary: derive the prerequisite edges from string comparisons, then run topological sort.parallel-courses: same DAG, return the number of semesters if you can take any number of in-degree-0 courses in parallel.detect-cycle(directed graph, standalone): the cycle-detection half of this problem alone; either Kahn or three-colour DFS.find-eventual-safe-states: reverse the graph, return all nodes not on a cycle and not reaching one.
All of them: directed graph, in-degree counting or three-colour DFS, the same core in 30 lines of Python.
How parikshan turns this into a learning loop
This is the cluster's first "two valid algorithms, pick one and master both" problem. Recommended flow:
- Practice mode, no AI: write Kahn's from scratch. Target: under 15 minutes.
- Practice mode again: write three-colour DFS from scratch, iteratively. Target: under 20 minutes (recursion-to-iteration is the hard part).
- AI training: Explain the equivalence. The agent should articulate why a topological order exists iff the graph is a DAG iff no back edge exists in DFS.
- Exam mode: solve under 10 minutes with your preferred algorithm.
The agent's explanation of the equivalence is the high-leverage learning moment. The mechanics of either algorithm are 30 lines; the conceptual link "no cycle = in-degree-0 chain exists = topological order exists = no back edge in DFS" is the part that compounds across every graph problem you ever touch.
In the AI-integrated workspace
When you direct an agent to detect dependency cycles, the failure modes are predictable:
- The agent reverses edge direction: "course
urequires coursev" is interpreted asu -> vinstead ofv -> u. The cycle answer is unaffected (a cycle inGis a cycle inG^reverse) but topological sort produces the wrong order. Audit the parsing step. - The agent uses recursive DFS on inputs the spec allows but the agent's tests do not exercise: passes the public tests, crashes on production input.
- The agent forgets the outer scan over disconnected components: a single DFS from node 0 returns "DAG" on a graph whose only cycle is in a disconnected component.
- The agent uses a
visitedset with two states instead of three colours: cannot detect back edges; only detects whether a node was visited at all. The DFS happily walks a cycle and returns "DAG".
A reviewer who can audit edge direction in five seconds and demand three colours (or in-degree counting) for cycle detection is the reviewer whose builds, packages, and deploys do not silently corrupt. parikshan's drill is to make these audits automatic. The cycle-detection skill returns on investment every time a CI build refuses a circular import; you will have written those refusals correctly by then.