12 December 2025
A few months ago, we shipped a feature that loaded a user's transaction history alongside their profile on a single screen. Everything looked fine in development: fast network, small datasets, no visible lag. Then the ANR reports started rolling in from production. Not a handful. Hundreds of them, all pointing to the same screen. The main thread was frozen for 5+ seconds on devices with slower CPUs.
My first instinct was that the network call was somehow running on Main. But it wasn't: every suspend function was wrapped in withContext(Dispatchers.IO). The actual problem was subtler. We had a JSON parsing step running on Dispatchers.IO after the network response came back. Parsing 200+ transactions with nested objects was CPU-intensive work, sitting on the IO dispatcher alongside dozens of actual blocking calls. The IO pool was saturated, the CPU-bound parsing was waiting in line, and the UI was starving because results weren't coming back fast enough. The fix was moving the parsing to Dispatchers.Default and keeping IO for actual IO. ANRs dropped to zero.
That experience taught me something I should have understood earlier: dispatchers are not interchangeable labels. They are thread pool configurations with specific sizing, scheduling characteristics, and contention behavior. Picking the wrong one doesn't just make things slower; it can starve other operations and freeze your app.
Most developers treat dispatchers as three named slots: Main for UI, IO for network/disk, Default for "everything else." But what's actually happening underneath is more interesting, and knowing it changes how you make decisions.
Both Dispatchers.IO and Dispatchers.Default are backed by the same underlying thread pool, an instance of CoroutineScheduler inside kotlinx.coroutines. They don't each create their own set of threads. The CoroutineScheduler is a work-stealing scheduler with a core pool sized to the number of CPU cores (minimum 2). When you dispatch to Dispatchers.Default, it runs on these core threads. When you dispatch to Dispatchers.IO, the scheduler uses an elasticity mechanism: it can expand the thread count up to 64 (or kotlinx.coroutines.io.parallelism if you've set it) to handle blocking operations that would otherwise tie up the core threads.
Here's the thing: because they share the same scheduler, a thread that was just running an IO task can immediately pick up a Default task without any cross-pool overhead. The separation between IO and Default isn't about different thread pools. It's about different concurrency limits on the same pool. Default is limited to CPU core count. IO can expand beyond that to absorb blocking calls. This is why putting CPU-bound work on Dispatchers.IO is wasteful: you're consuming one of those 64 elastic slots for work that doesn't actually block, and you're potentially preventing real blocking IO from getting a thread.
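If you want to see that for yourself, here's a minimal sketch (a plain JVM main function of my own, not from our app) that just prints worker thread names:
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Minimal sketch: both blocks print a DefaultDispatcher-worker-* thread,
// because IO and Default are concurrency limits over the same scheduler, not separate pools.
fun main() = runBlocking {
    withContext(Dispatchers.Default) {
        println("Default: ${Thread.currentThread().name}")
    }
    withContext(Dispatchers.IO) {
        println("IO:      ${Thread.currentThread().name}")
    }
}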
Dispatchers.Main, on the other hand, is entirely separate. On Android, it's backed by the main thread's Looper and Handler. Every dispatch to Main posts a message to the queue and waits for the Looper to process it, which brings us to an important distinction.
When you use Dispatchers.Main, every resume goes through Handler.post(). Even if the coroutine is already executing on the main thread, it still posts to the message queue and waits for the next Looper cycle. That's one extra dispatch, and on a busy main thread it can mean waiting behind input events, layout passes, and view invalidations.
Dispatchers.Main.immediate checks whether the coroutine is already on the main thread. If it is, it resumes immediately in the current execution context without posting to the queue. If not, it falls back to the same Handler.post() behavior. This skips one full dispatch cycle, saving roughly 50-100μs per dispatch depending on message queue pressure.
class TransactionViewModel(
    private val repository: TransactionRepository,
) : ViewModel() {

    private val _uiState = MutableStateFlow<UiState>(UiState.Loading)
    val uiState: StateFlow<UiState> = _uiState.asStateFlow()

    // viewModelScope uses Dispatchers.Main.immediate by default
    fun loadTransactions() {
        viewModelScope.launch {
            // Already on Main.immediate, so this runs with no extra dispatch
            _uiState.value = UiState.Loading
            val transactions = withContext(Dispatchers.IO) {
                repository.fetchTransactions()
            }
            // Resumes on Main.immediate: runs inline if we're already on the main thread
            _uiState.value = UiState.Success(transactions)
        }
    }
}
This is why viewModelScope defaults to SupervisorJob() + Dispatchers.Main.immediate rather than plain Dispatchers.Main. Google made this choice deliberately: in animation code, one extra frame of delay between a state change and the UI update can cause visible stutter. If your coroutine updates a MutableStateFlow that drives a Compose recomposition, Main.immediate means the recomposition is triggered in the same frame rather than being pushed to the next one.
But Main.immediate isn't universally better. If you have deeply recursive suspend calls that all resolve immediately (no actual suspension), Main.immediate keeps stacking frames without ever yielding. With regular Main, each step goes through the message queue, which effectively unwinds the stack. In extreme cases (think recursive tree traversal where each node is a suspend call), Main.immediate can overflow the stack. If you suspect this is happening, yield() forces a dispatch point and breaks the recursion.
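Here's roughly what that escape hatch looks like; the Node type and traverse function below are hypothetical, just to illustrate the shape:
import kotlinx.coroutines.yield

// Hypothetical tree type, only to illustrate the shape of the problem.
class Node(val children: List<Node> = emptyList())

// On Main.immediate, without the yield() every child call would resume inline and the
// call stack would keep growing. yield() suspends and re-dispatches through the queue,
// so the stack unwinds before the next level of recursion runs.
suspend fun traverse(node: Node) {
    yield()
    for (child in node.children) {
        traverse(child)
    }
}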
Dispatchers.Unconfined is the dispatcher that doesn't dispatch. When a coroutine starts on Unconfined, it executes immediately in the caller's thread: no queue, no scheduling. But here's the part that trips people up: after the first suspension point, the coroutine resumes on whatever thread the suspending function happened to complete on. You have zero control over which thread that is.
This means if you launch on Dispatchers.Unconfined and call a suspend function that internally completes on an IO thread, your code after the suspension is now running on that IO thread. If the suspend function completes on a callback thread from a native library, you're running there. The thread affinity is completely unpredictable after any suspension.
fun demonstrateUnconfined() {
// Starts on the calling thread (e.g., main)
CoroutineScope(Dispatchers.Unconfined).launch {
println("Before suspend: ${Thread.currentThread().name}") // main
delay(100) // suspends here
// Resumes on whatever thread the delay timer completed on
println("After suspend: ${Thread.currentThread().name}") // kotlinx.coroutines.DefaultExecutor
}
}
So when is Unconfined actually useful? Mostly in testing and event-handling pipelines where you want zero dispatch overhead and you don't care about thread identity. UnconfinedTestDispatcher in kotlinx-coroutines-test is built on this concept: it lets coroutines run eagerly so your tests don't need to manually advance time for every launch. In production code, I'd reach for CoroutineStart.UNDISPATCHED on a specific launch instead of Unconfined on the whole scope, because UNDISPATCHED gives you the same "start immediately" behavior for the initial execution while still dispatching normally after suspension. It's the scoped version of the same optimization without the thread-safety landmine.
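A rough sketch of the difference, assuming a scope whose dispatcher is Default:
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.CoroutineStart
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch

// Sketch: the body starts executing right away in the caller's frame, but after the
// first suspension it resumes on the scope's own dispatcher, unlike Unconfined,
// which resumes wherever the suspending function happened to complete.
fun startEagerly(scope: CoroutineScope) {
    scope.launch(start = CoroutineStart.UNDISPATCHED) {
        println("Before suspend: ${Thread.currentThread().name}") // caller's thread
        delay(100)
        println("After suspend:  ${Thread.currentThread().name}") // scope's dispatcher
    }
}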
The default parallelism limit for Dispatchers.IO is 64 threads (or the core count, on the rare machine with more than 64 cores). That number is based on the assumption that IO-dispatched work is blocking: waiting on network sockets, disk reads, database queries. While a thread is blocked, it's not using CPU, so you can have many more threads than cores. The number 64 is a practical default: high enough to keep concurrent network requests in flight, low enough to avoid excessive thread creation overhead.
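If you ever genuinely need a higher ceiling (rare), the limit comes from the kotlinx.coroutines.io.parallelism system property, which is read once when Dispatchers.IO is first initialized, so it has to be set very early. A sketch:
// Must run before anything touches Dispatchers.IO (e.g. at the very top of
// Application.onCreate()), because the property is read once at initialization.
fun raiseIoParallelism() {
    System.setProperty("kotlinx.coroutines.io.parallelism", "128")
}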
The real problem developers run into isn't the 64-thread limit; it's putting the wrong kind of work on IO. I've seen codebases where JSON deserialization, image decoding, and even sorting large lists all happen on Dispatchers.IO because they were "part of the data loading pipeline." Each of those is CPU-bound. When they're running alongside actual blocking calls, you're using elastic threads for work that should run on the fixed core pool, and CPU-bound work runs slower because you get more context switching and cache thrashing than you would on a pool sized to your core count.
class TransactionRepository(
private val api: TransactionApi,
private val parser: TransactionParser,
) {
suspend fun fetchTransactions(): List<Transaction> {
        // Network call: genuinely blocking IO, belongs on IO
val rawJson = withContext(Dispatchers.IO) {
api.getRawTransactions()
}
        // Parsing: CPU-bound work, belongs on Default
val transactions = withContext(Dispatchers.Default) {
parser.parseTransactions(rawJson)
}
return transactions
}
}
In our production app, moving JSON parsing from IO to Default reduced P95 parse times by about 40% on mid-range devices. But how do you know when your IO pool is actually saturated? The clearest signal is a thread dump. If you capture one during a slow operation (via Android Studio's debugger or Thread.getAllStackTraces()) and see most of your DefaultDispatcher-worker-* threads in BLOCKED or WAITING state on IO operations, you've hit saturation. In Perfetto traces, look for gaps between coroutine task slices on the thread track: long gaps mean threads are busy elsewhere and your work is queued. Another symptom: operations that should take milliseconds suddenly take seconds, but only under load. That's thread starvation - every IO slot is occupied and new work is waiting in the CoroutineScheduler's global queue.
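Here's a quick-and-dirty sketch of the Thread.getAllStackTraces() approach; it's a debugging aid, not production code, and idle workers also park in WAITING, so read the top stack frames rather than trusting the state alone:
// Debugging aid: dumps the state and top frame of every dispatcher worker thread.
// Many workers blocked or waiting inside IO calls (sockets, SQLite, file streams)
// during a slow operation is the saturation signal described above.
fun dumpDispatcherWorkers() {
    Thread.getAllStackTraces()
        .filterKeys { it.name.startsWith("DefaultDispatcher-worker") }
        .forEach { (thread, stack) ->
            println("${thread.name} [${thread.state}] ${stack.firstOrNull() ?: "<no frame>"}")
        }
}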
Dispatchers.Default is capped at the number of CPU cores, with a minimum of 2. On a modern phone with 8 cores, that's 8 threads. This sizing is intentional: for CPU-bound work, adding more threads than cores doesn't make things faster. It makes things slower because of context switching overhead. Each context switch costs roughly 5-15μs on most ARM processors, and each swap can flush the CPU cache, meaning the new thread reloads data from main memory.
The CoroutineScheduler maintains a global queue and per-thread local queues. When a thread finishes its task, it first checks its local queue (fast, no contention), then tries to steal from another thread's queue (moderate cost), and finally falls back to the global queue (requires synchronization). CPU-bound work benefits from this work-stealing design because related tasks tend to stay on the same thread, preserving cache locality. But if you mix CPU and IO work by dispatching everything to the same dispatcher, blocking IO tasks interrupt the work-stealing pattern and you lose that locality benefit.
withContext is the standard way to switch dispatchers mid-coroutine, but not every withContext call actually switches threads. When you call withContext with the same dispatcher the coroutine is already running on, the coroutines library takes a fast path: it skips the dispatch entirely and just runs the block inline. No thread switch, no queue, no scheduling overhead. This is why withContext(Dispatchers.Default) { withContext(Dispatchers.Default) { ... } } doesn't cost you two context switches. The inner call is essentially a no-op from a threading perspective.
When the dispatchers are different, withContext suspends the current coroutine, dispatches the block to the target dispatcher's queue, and then dispatches back to the original dispatcher when the block completes. That's two dispatches for one withContext call: one there, one back. At 50-100μs per dispatch on a Pixel 7, a single withContext that actually changes threads costs you 100-200μs round-trip. Consider a screen that loads 10 items from a paginated API, where each item goes through IO fetch → Default parse → Main render. That's 30 dispatcher switches per page load. At 100μs each, you're spending 3ms just on dispatching overhead. On a 16ms frame budget, that's nearly 20% spent on thread coordination.
This is why I don't recommend wrapping every single function in withContext. If you have a chain of operations that all belong on the same dispatcher, keep them in one block. The overhead of unnecessary context switches is small individually but adds up in hot paths.
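As a sketch of what I mean (fetchPageBlocking, parse, and dedupe below are hypothetical helpers, stubbed only so the example compiles):
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext

// Hypothetical helpers, stubbed so the sketch compiles.
data class Item(val id: String)
fun fetchPageBlocking(): String = TODO("blocking network read")
fun parse(raw: String): List<Item> = TODO("CPU-bound parsing")
fun dedupe(items: List<Item>): List<Item> = items.distinctBy { it.id }

// Chatty: when called from Main, every step pays its own round trip,
// even though the last two steps belong on the same dispatcher.
suspend fun loadPageChatty(): List<Item> {
    val raw = withContext(Dispatchers.IO) { fetchPageBlocking() }
    val parsed = withContext(Dispatchers.Default) { parse(raw) }
    return withContext(Dispatchers.Default) { dedupe(parsed) }
}

// Grouped: one IO hop for the blocking read, one Default block for all the CPU-bound work.
suspend fun loadPage(): List<Item> {
    val raw = withContext(Dispatchers.IO) { fetchPageBlocking() }
    return withContext(Dispatchers.Default) { dedupe(parse(raw)) }
}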
Before limitedParallelism, developers created custom dispatchers with Executors.newFixedThreadPool(n).asCoroutineDispatcher(). This created entirely separate thread pools; those threads couldn't be shared with anything else. limitedParallelism solves this by creating a view over the parent dispatcher, not a new thread pool. It limits how many coroutines from this view can run concurrently, but the actual threads come from the parent pool.
class AppDispatchers {
// Limits database operations to 4 concurrent coroutines
// but uses threads from the IO pool
val databaseDispatcher = Dispatchers.IO.limitedParallelism(4)
// Limits file write operations to 2 concurrent coroutines
val fileWriteDispatcher = Dispatchers.IO.limitedParallelism(2)
// For heavy computation that shouldn't starve other Default work
val imageProcessingDispatcher = Dispatchers.Default.limitedParallelism(2)
}
The databaseDispatcher limits database concurrency to 4, which protects SQLite from too many concurrent writers (SQLite serializes writes anyway, so more threads just means more lock contention). The imageProcessingDispatcher limits CPU-intensive image work to 2 threads so it doesn't monopolize the Default pool and starve other computational work.
But limitedParallelism isn't free. It adds a coordination layer: a semaphore-like mechanism that tracks how many coroutines are currently active in the view. Each dispatch checks this counter, and if the limit is reached, the coroutine is queued until a slot opens. In most Android apps this overhead is negligible, but IMO it's good to know you're trading a small amount of dispatch latency for better resource control. For limitedParallelism(1) as a single-writer pattern, this is essentially a coroutine-based mutex; it works well, though a real Mutex might be more readable for that specific use case.
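For completeness, here's a hedged sketch of both single-writer shapes side by side; the file-append use case and names are made up for illustration:
import java.io.File
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.withContext

// Option 1: a single-lane view over IO. At most one coroutine runs at a time,
// on whichever IO worker happens to be free.
private val singleWriter = Dispatchers.IO.limitedParallelism(1)

suspend fun appendViaDispatcher(file: File, line: String) = withContext(singleWriter) {
    file.appendText(line + "\n")
}

// Option 2: the same "one writer at a time" guarantee with an explicit Mutex,
// which arguably states the intent more directly.
private val writeMutex = Mutex()

suspend fun appendViaMutex(file: File, line: String) = withContext(Dispatchers.IO) {
    writeMutex.withLock { file.appendText(line + "\n") }
}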
After debugging enough dispatcher-related issues, here's how I think about choosing dispatchers in practice:
Network calls, database queries, file reads/writes → Dispatchers.IO. These operations block the thread while waiting for external resources. That's exactly what the elastic pool is designed for.
JSON parsing, list sorting, image processing, encryption → Dispatchers.Default. These are CPU-bound. The core-sized pool gives them better performance than the oversized IO pool.
UI state updates, triggering recompositions → Dispatchers.Main.immediate (which viewModelScope already provides). Use plain Dispatchers.Main only if you specifically want to defer execution to the next message queue cycle.
Rate-limiting a specific subsystem → limitedParallelism on the appropriate parent dispatcher. Database writes? Dispatchers.IO.limitedParallelism(4). Heavy computation that shouldn't starve the rest? Dispatchers.Default.limitedParallelism(2).
Don't create standalone thread pools via Executors.newFixedThreadPool().asCoroutineDispatcher() unless you have a very specific reason. Prefer limitedParallelism to keep threads shared and utilization high.
One last thing: always inject your dispatchers. Hardcoding Dispatchers.IO throughout your codebase makes testing painful because you can't swap in a TestDispatcher. Wrapping dispatchers in an injectable class means your tests can run on UnconfinedTestDispatcher or StandardTestDispatcher, giving you deterministic control over coroutine execution without flaky timing issues.
class AppCoroutineDispatchers(
val main: CoroutineDispatcher = Dispatchers.Main.immediate,
val io: CoroutineDispatcher = Dispatchers.IO,
val default: CoroutineDispatcher = Dispatchers.Default,
val database: CoroutineDispatcher = Dispatchers.IO.limitedParallelism(4),
)
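And a sketch of how the wrapper gets consumed and swapped in a test; GreetingLoader is a made-up class, and the test function assumes kotlinx-coroutines-test on the classpath:
import kotlinx.coroutines.test.StandardTestDispatcher
import kotlinx.coroutines.test.runTest
import kotlinx.coroutines.withContext

// Hypothetical consumer: depends on the wrapper, never on Dispatchers.* directly.
class GreetingLoader(private val dispatchers: AppCoroutineDispatchers) {
    suspend fun load(): String = withContext(dispatchers.io) { "hello" }
}

// In a test, every slot points at the same TestDispatcher, so execution is
// fully controlled by the test scheduler: no real threads, no flaky timing.
fun greetingLoaderRunsDeterministically() = runTest {
    val testDispatcher = StandardTestDispatcher(testScheduler)
    val loader = GreetingLoader(
        AppCoroutineDispatchers(
            main = testDispatcher,
            io = testDispatcher,
            default = testDispatcher,
            database = testDispatcher,
        )
    )
    println(loader.load()) // runs entirely on the test dispatcher
}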
Dispatchers are one of those things that seem simple until they aren't. They work fine with defaults for most code. But the moment your app hits real-world scale (hundreds of concurrent operations, mixed CPU and IO workloads, tight frame budgets), understanding what's happening underneath the API becomes the difference between an app that feels smooth and one that freezes on your users' devices.
Thanks for reading through all of this :), Happy Coding!