13 February 2026
I still remember the first time I wrote suspend fun and felt like something magical was happening. The function could pause, go do something else, and come back to exactly where it left off. No callbacks, no Rx chains, no Handler.postDelayed. It just worked.
But "it just works" is a dangerous place to stay. When I started debugging coroutine stack traces that made no sense, or when a delay() resumed my coroutine on a completely different thread than the one it started on, I realized I had no mental model for what was actually happening underneath. And without that model, I was writing coroutines by guesswork.
So I went into the bytecode. And what I found changed how I think about coroutines entirely: your suspend function is not a function. It's a class. A state machine, generated by the Kotlin compiler, with a label field and a when-expression that jumps between states. Every suspend call is a potential exit point, and every resume is a re-entry into that same state machine at the next label. Once you see this, coroutines stop being magic. They become a well-designed compiler trick that you can reason about, debug, and optimize.
Before coroutines, Android had a painful history with async code. AsyncTask, then RxJava, then callback hell. The core problem was always the same: you needed to break sequential logic into pieces that could run later, but you had to wire those pieces together manually.
Kotlin coroutines solve this with Continuation Passing Style (CPS). The idea is old (it comes from Scheme and functional programming), but the Kotlin compiler applies it automatically. Here's what CPS means in practice. When you write:
suspend fun fetchUser(userId: String): User {
    val token = authenticate(userId) // suspension point #1
    val user = loadProfile(token)    // suspension point #2
    return user
}
The compiler transforms this into something conceptually like:
fun fetchUser(userId: String, continuation: Continuation<User>): Any? {
    val token = authenticate(userId, continuation)
    if (token == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
    val user = loadProfile(token as Token, continuation)
    if (user == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
    return user
}
Two things changed. First, an extra parameter was added: a Continuation<User> object. This is the callback: it knows how to resume the function when the suspended operation completes. Second, the return type changed from User to Any?. That's because the function now returns either the actual User result or a special marker, COROUTINE_SUSPENDED, to signal that it paused. Kotlin doesn't have union types, so Any? is the only way to express "either T or COROUTINE_SUSPENDED." The Continuation interface itself is simple:
public interface Continuation<in T> {
    public val context: CoroutineContext
    public fun resumeWith(result: Result<T>)
}
It holds a CoroutineContext (which contains the dispatcher, job, exception handler) and a single resumeWith function. When the suspended operation finishes, someone calls resumeWith with the result, and the coroutine continues.
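You can drive this interface by hand with nothing but the stdlib. Here is a minimal sketch (greet and runGreetDemo are made-up names for illustration): startCoroutine performs the CPS call, and our hand-written Continuation receives the final result via resumeWith.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.startCoroutine

suspend fun greet(name: String): String = "Hello, $name"

// Drive a suspend function by hand: we supply the completion
// continuation that the CPS transform normally threads through.
fun runGreetDemo(): String? {
    var outcome: Result<String>? = null
    val completion = object : Continuation<String> {
        override val context: CoroutineContext = EmptyCoroutineContext
        override fun resumeWith(result: Result<String>) {
            outcome = result // invoked when the coroutine completes
        }
    }
    val body: suspend () -> String = { greet("world") }
    body.startCoroutine(completion) // conceptually: body(completion)
    // greet never actually suspends, so resumeWith has already run
    return outcome?.getOrNull()
}

fun main() {
    println(runGreetDemo()) // prints "Hello, world"
}
```

Because greet never hits a real suspension point, the whole thing runs synchronously; a suspending body would instead return control here and call resumeWith later.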
Here's where it gets interesting. The CPS transformation above was simplified. In reality, the compiler doesn't generate separate function calls with continuation threading. It generates a state machine: a single class with a label field that tracks where the coroutine paused. For our fetchUser function, the compiler generates something like this:
fun fetchUser(userId: String, completion: Continuation<User>): Any? {
    // The continuation IS the state machine. (Pseudo-Kotlin: the real
    // output is bytecode with a jump table whose states fall through;
    // Kotlin's own when-expression does not fall through.)
    val sm = completion as? FetchUserStateMachine
        ?: FetchUserStateMachine(completion)
    when (sm.label) {
        0 -> {
            sm.label = 1
            val result = authenticate(userId, sm)
            if (result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
            sm.result = result
            // Falls through to state 1 in the generated bytecode
        }
        1 -> {
            sm.result.throwOnFailure()
            val token = sm.result as Token
            sm.label = 2
            val result = loadProfile(token, sm)
            if (result == COROUTINE_SUSPENDED) return COROUTINE_SUSPENDED
            sm.result = result
            // Falls through to state 2
        }
        2 -> {
            sm.result.throwOnFailure()
            return sm.result as User
        }
        else -> throw IllegalStateException("Invalid label")
    }
}
The state machine class stores every local variable that needs to survive across suspension points. The label field is just an Int that gets incremented at each suspension point. When the coroutine resumes, resumeWith is called on the continuation, which re-enters the same fetchUser function but now jumps to the correct label.
This is the reframe: there's no thread parking, no fiber, no blocked stack sitting in memory waiting for a signal. There's a class with fields, and a when-expression. Each suspend point is a potential exit, and each resume is a re-entry at the next label. That's it. For N suspension points, the compiler generates N+1 states (0 through N). State 0 is the initial entry, and each subsequent state handles the result of the previous suspension.
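To make the shape concrete, here is a hand-rolled state machine in the same spirit, written against made-up callback-style steps (FetchUserMachine, stepAuth, and stepLoad are all illustrative names, not compiler output):

```kotlin
// A hand-rolled version of the pattern the compiler emits: one class,
// an Int label, and a when that re-enters at the right state.
class FetchUserMachine(private val onDone: (String) -> Unit) {
    private var label = 0
    private var saved: Any? = null // locals surviving "suspension" live here

    fun resume(result: Any?) {
        when (label) {
            0 -> {
                label = 1
                stepAuth("alice") { token -> resume(token) } // "suspends" here
            }
            1 -> {
                saved = result            // the token from step 1
                label = 2
                stepLoad(result as String) { user -> resume(user) }
            }
            2 -> onDone(result as String) // final state: deliver the result
            else -> error("Invalid label")
        }
    }
}

// Callback-style async steps (synchronous here for simplicity)
fun stepAuth(userId: String, cb: (String) -> Unit) = cb("token-for-$userId")
fun stepLoad(token: String, cb: (String) -> Unit) = cb("user-from-$token")

fun runMachine(): String {
    var out = ""
    FetchUserMachine { user -> out = user }.resume(null)
    return out
}

fun main() {
    println(runMachine()) // prints "user-from-token-for-alice"
}
```

Each callback simply calls resume again, and the label decides where execution picks up; that is the whole trick.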
Now you understand why coroutine stack traces look weird. When a coroutine suspends, the actual call stack unwinds completely. The state machine saves local variables into its fields, returns COROUTINE_SUSPENDED up the chain, and the thread is free.
When it resumes, a new call stack is created starting from the dispatcher. The state machine re-enters at the correct label, but the original call stack is gone. This is why you see frames like invokeSuspend and BaseContinuationImpl.resumeWith instead of your actual function hierarchy. Kotlin addresses this with the -Xdebug compiler flag and the kotlinx-coroutines-debug module, which stitch together the logical call stack by tracking continuation chains. But the point stands: understanding the state machine explains why debugging coroutines requires different tools than debugging threads.
When your coroutine resumes, how does it end up on the right thread? This is where ContinuationInterceptor comes in. It's a CoroutineContext.Element that wraps every continuation:
interface ContinuationInterceptor : CoroutineContext.Element {
    // Simplified: the real interface also declares a companion Key
    // and releaseInterceptedContinuation()
    fun <T> interceptContinuation(
        continuation: Continuation<T>
    ): Continuation<T>
}
Every time a coroutine is about to resume, the runtime checks the context for a ContinuationInterceptor. If one exists, it wraps the continuation in a DispatchedContinuation that redirects resumeWith calls through the dispatcher. Dispatchers.Main uses Android's Handler to post the resume to the main thread's Looper. Dispatchers.IO uses a shared thread pool limited to 64 threads by default. Dispatchers.Default uses a thread pool sized to the number of CPU cores (with a minimum of two).
Here's what's subtle: the interception happens per-resume, not per-launch. If your coroutine suspends in one dispatcher context and resumes in another (because you called withContext), the interceptor at the resume site determines the thread. This is why withContext(Dispatchers.IO) actually works: it replaces the interceptor in the context, so when the inner block's continuation resumes, it dispatches to the IO pool. The withContext call doesn't create a new coroutine. It creates a new context with a different dispatcher, suspends the current coroutine, and re-dispatches the continuation.
suspend fun fetchAndParse(): ParsedData {
    // Running on Dispatchers.Main (from viewModelScope)
    val raw = withContext(Dispatchers.IO) {
        // Continuation intercepted and dispatched to the IO pool
        api.fetchRawData()
    }
    // Back on Dispatchers.Main: the interceptor changed back
    return parse(raw)
}
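The interception mechanism is small enough to demonstrate with only the stdlib's kotlin.coroutines API. This toy interceptor (RecordingInterceptor is a made-up name) records each resume at exactly the point where a real dispatcher would post it to another thread:

```kotlin
import kotlin.coroutines.AbstractCoroutineContextElement
import kotlin.coroutines.Continuation
import kotlin.coroutines.ContinuationInterceptor
import kotlin.coroutines.CoroutineContext
import kotlin.coroutines.startCoroutine

// A toy interceptor (stdlib-only, no kotlinx) that records every resume.
// Real dispatchers wrap continuations the same way, but forward the
// resume to a thread pool or the main Looper instead of a log list.
class RecordingInterceptor(private val log: MutableList<String>) :
    AbstractCoroutineContextElement(ContinuationInterceptor),
    ContinuationInterceptor {

    override fun <T> interceptContinuation(continuation: Continuation<T>): Continuation<T> =
        object : Continuation<T> {
            override val context = continuation.context
            override fun resumeWith(result: Result<T>) {
                log += "intercepted resume"     // a dispatcher would post() here
                continuation.resumeWith(result) // then run on its own thread
            }
        }
}

fun runInterceptedDemo(): List<String> {
    val log = mutableListOf<String>()
    val body: suspend () -> Int = { 40 + 2 }
    // startCoroutine consults the completion's context for an interceptor
    body.startCoroutine(object : Continuation<Int> {
        override val context: CoroutineContext = RecordingInterceptor(log)
        override fun resumeWith(result: Result<Int>) {
            log += "done: ${result.getOrNull()}"
        }
    })
    return log
}

fun main() {
    println(runInterceptedDemo()) // [intercepted resume, done: 42]
}
```

The `AbstractCoroutineContextElement(ContinuationInterceptor)` base-plus-interface combination is the same shape kotlinx.coroutines uses for CoroutineDispatcher itself.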
Structured concurrency isn't just a nice API design; it's built into the Job hierarchy. When you call launch or async inside a CoroutineScope, the new coroutine's Job becomes a child of the scope's Job. This parent-child relationship enforces three rules:
1. Cancelling the parent calls cancel() on every child Job.
2. The parent does not complete until all of its children have completed.
3. An uncaught failure in a child cancels the parent, and through it every sibling (unless the parent is a SupervisorJob).

This means viewModelScope.launch { ... } automatically cancels when the ViewModel clears, because the scope's Job is cancelled, which cascades to every child coroutine. The SupervisorJob breaks rule 3: child failures don't propagate upward. This is why viewModelScope uses SupervisorJob + Dispatchers.Main.immediate; you don't want one failing network call to cancel all other coroutines in the ViewModel.
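A quick sketch of failure propagation and the SupervisorJob exception, assuming the kotlinx-coroutines-core dependency is on the classpath (demoSupervision and the swallow handler are illustrative names):

```kotlin
import kotlinx.coroutines.*

// With a plain Job, one child's failure cancels its sibling;
// with a SupervisorJob, the failure stays contained.
fun demoSupervision(): Pair<Boolean, Boolean> = runBlocking {
    val swallow = CoroutineExceptionHandler { _, _ -> /* ignore for the demo */ }

    val plain = CoroutineScope(Job() + swallow)
    val a = plain.launch { delay(500) }
    plain.launch { throw RuntimeException("boom") }
    a.join() // returns early: the failure cancelled the whole scope

    val supervised = CoroutineScope(SupervisorJob() + swallow)
    val b = supervised.launch { delay(100) }
    supervised.launch { throw RuntimeException("boom") }
    b.join() // completes normally: the failure stayed contained

    Pair(a.isCancelled, b.isCancelled)
}

fun main() {
    val (plainCancelled, supervisedCancelled) = demoSupervision()
    println("plain sibling cancelled: $plainCancelled")           // true
    println("supervised sibling cancelled: $supervisedCancelled") // false
}
```

The CoroutineExceptionHandler only swallows the report; it does not stop the cancellation cascade in the plain-Job case.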
One thing that tripped me up early: cancelling a coroutine doesn't forcefully stop it. It sets a flag on the Job (isActive = false) and throws CancellationException at the next suspension point. If your coroutine is doing CPU-heavy work without any suspension points, it won't respond to cancellation:
// This will NOT be cancelled
viewModelScope.launch {
    var sum = 0L
    for (i in 1..1_000_000_000) {
        sum += i // No suspension point, so cancellation can't interrupt
    }
}
You need to explicitly check for cancellation in tight loops:
viewModelScope.launch {
    var sum = 0L
    for (i in 1..1_000_000_000) {
        ensureActive() // Throws CancellationException if cancelled
        sum += i
    }
}
ensureActive() is preferred over checking isActive because it throws immediately rather than requiring you to handle the exit yourself. yield() is another option: it checks for cancellation and also gives other coroutines a chance to run on the same dispatcher.
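A small self-contained sketch of cooperative cancellation, assuming kotlinx-coroutines-core (demoCancellation is an illustrative name): with the ensureActive() check in place, cancelAndJoin() actually returns.

```kotlin
import kotlinx.coroutines.*

// The ensureActive() check is what lets cancelAndJoin() return at all:
// without it, the tight loop below would spin forever.
fun demoCancellation(): Boolean = runBlocking {
    val job = launch(Dispatchers.Default) {
        var sum = 0L
        while (true) {
            ensureActive() // throws CancellationException once cancelled
            sum += 1
        }
    }
    delay(50)           // let the loop spin for a moment
    job.cancelAndJoin() // returns promptly because the loop cooperates
    job.isCancelled     // true
}

fun main() {
    println("cancelled cleanly: ${demoCancellation()}")
}
```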
Coroutines are lightweight, but they're not free. Each coroutine creates a state machine class (allocated on the heap), and each suspension point stores local variables in that object. For a function with 3 suspension points and 5 local variables, that's a field for every local that survives a suspension, plus the label and result fields. In practice, this is negligible for most apps. The overhead of creating a coroutine is roughly comparable to creating a small object, orders of magnitude cheaper than creating a thread (which reserves a stack of roughly 1 MB by default on the JVM).
But there are real tradeoffs. The generated state machine classes increase your method count and APK size slightly. R8/ProGuard can optimize some of this away, but heavily coroutine-based code does produce more classes than equivalent callback code. In one of our projects, I measured roughly 2-3 extra classes per suspend function after compilation. The dispatcher overhead is also real. Dispatchers.Main uses Handler.post(), which goes through the message queue. If you're dispatching thousands of small results back to the main thread, that queue overhead adds up. For tight UI updates, Dispatchers.Main.immediate avoids the re-dispatch if you're already on the main thread.
After going through the internals, a few practical things clicked:
delay() is not Thread.sleep(). It suspends the state machine and schedules a resume via the dispatcher. The thread is free to do other work. This is why you can run thousands of concurrent delay() calls without thousands of threads. And if you've ever wondered how Flow operators work under the hood, the same state machine mechanics power those too.
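This is easy to see directly; a sketch assuming kotlinx-coroutines-core (demoDelays is an illustrative name):

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// 10_000 concurrent delay() calls. Each delay suspends its state
// machine and frees the thread, so the whole batch completes in
// roughly one second, not 10_000 seconds, on a handful of
// Default-dispatcher threads.
fun demoDelays(): Long = measureTimeMillis {
    runBlocking {
        val jobs = List(10_000) {
            launch(Dispatchers.Default) { delay(1000) }
        }
        jobs.joinAll()
    }
}

fun main() {
    println("10_000 delays finished in ${demoDelays()} ms")
}
```

Try the same thing with 10_000 threads each calling Thread.sleep(1000) and watch your memory usage for the difference.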
withContext doesn't create a new coroutine. It suspends the current one, switches the context, and resumes. This means it's cheaper than launch + join for simple context switches.

The continuation object IS the state machine. When you see Continuation<T> in coroutine internals, think "state machine instance." It holds the label, the saved locals, and the resume logic all in one.

Structured concurrency is not optional. Using GlobalScope or creating standalone scopes bypasses the parent-child Job tree. When the ViewModel clears and those coroutines keep running, you'll wonder why your app is burning battery fetching data for a screen that no longer exists.
The moment coroutines stopped being magic for me was when I decompiled a suspend function and saw the label field and the when expression. Everything else (dispatchers, structured concurrency, cancellation) is just APIs built on top of that state machine. And once you see the machine, you can reason about every behavior coroutines exhibit.
Thanks for reading through all of this :) Happy coding!