27 March 2026
Sequential suspend calls are the default behavior in coroutines, and most of the time that's exactly what you want. Call getUser(), wait for the result, then call getPosts(user.id): clean, readable, no surprises. But I've seen too many Android screens where the loading spinner hangs for 3-4 seconds because three independent API calls run one after the other, each waiting for the previous to complete before starting. The user profile takes a second. The posts feed takes a second. The notification count takes a second. Three seconds of sequential work that could finish in one.
Parallel decomposition is the fix, and Kotlin gives you excellent tools for it: async, coroutineScope, supervisorScope, and Semaphore. But parallel work in coroutines isn't just "use async everywhere." The structured concurrency model changes how you think about failure, cancellation, and scope. Getting the parallelism right means understanding which scope builder to use, what happens when one of your parallel tasks blows up, and how to limit concurrency when you're processing a list of 500 items.
By default, suspend functions run sequentially. This is a feature, not a limitation. When you write two suspend calls one after the other, the second doesn't start until the first completes. The code reads like synchronous code, which is the entire point of coroutines.
// Sequential: takes ~2 seconds
suspend fun loadProfile(userId: String): ProfileScreen {
    val user = userRepository.getUser(userId)          // ~1 second
    val posts = postRepository.getRecentPosts(userId)  // ~1 second
    return ProfileScreen(user, posts)
}
If getUser and getRecentPosts are independent (neither needs the other's result), you're burning a full second of wall-clock time for nothing. The fix is async, which launches each call as a concurrent coroutine and returns a Deferred you can await later.
// Parallel: takes ~1 second
suspend fun loadProfile(userId: String): ProfileScreen =
    coroutineScope {
        val userDeferred = async { userRepository.getUser(userId) }
        val postsDeferred = async { postRepository.getRecentPosts(userId) }
        ProfileScreen(userDeferred.await(), postsDeferred.await())
    }
The time savings are real and measurable. On a profile screen I worked on, switching from sequential to parallel dropped the load time from 2.4 seconds to 1.1 seconds. That's the difference between a user perceiving the screen as slow versus instant. But I only made the change after confirming the two calls had zero data dependencies. If getRecentPosts needed the user's account type to determine which posts to fetch, the sequential version would be the correct one, and the parallel version would be a race condition.
One subtle point: the await() calls at the bottom don't need to be in any particular order. Both async blocks start immediately when they're declared. The await calls just retrieve the results: if the result is already available, await returns instantly. If not, it suspends until the result arrives. The parallelism comes from the async launch, not from the await call.
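A minimal, runnable sketch of this timing behavior, using delay as a stand-in for the network calls (timedParallel is a name I made up for the sketch):

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Both async blocks start as soon as they are declared; the awaits only
// collect results, so the total time is roughly 1 second, not 2.
suspend fun timedParallel(): Long = measureTimeMillis {
    coroutineScope {
        val user = async { delay(1_000); "user" }    // starts immediately
        val posts = async { delay(1_000); "posts" }  // also starts immediately
        posts.await()  // await order is irrelevant
        user.await()
    }
}

fun main() = runBlocking {
    println("took ${timedParallel()} ms") // roughly 1000, not 2000
}
```

Swapping the two await calls changes nothing: the work was already running.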
The coroutineScope function is the backbone of structured parallel decomposition. It creates a child scope that doesn't complete until every coroutine inside it finishes. More importantly, if any child fails, coroutineScope cancels all other children and rethrows the exception. This is the "all-or-nothing" guarantee: either every parallel task succeeds, or the whole operation fails cleanly.
class CheckoutRepository(
    private val inventoryApi: InventoryApi,
    private val pricingApi: PricingApi,
    private val shippingApi: ShippingApi,
    private val dispatchers: DispatcherProvider
) {
    suspend fun prepareCheckout(
        cartItems: List<CartItem>,
        address: Address
    ): CheckoutSummary = withContext(dispatchers.io) {
        coroutineScope {
            val inventory = async {
                inventoryApi.checkAvailability(cartItems)
            }
            val pricing = async {
                pricingApi.calculateTotal(cartItems)
            }
            val shipping = async {
                shippingApi.estimateDelivery(cartItems, address)
            }
            CheckoutSummary(
                availability = inventory.await(),
                total = pricing.await(),
                delivery = shipping.await()
            )
        }
    }
}
If the pricing API throws an exception, coroutineScope cancels the inventory and shipping calls that are still in-flight. The caller gets a single exception, not a mix of partial results and errors. This is exactly what you want for a checkout flow: you don't want to show the user a delivery estimate if you couldn't confirm the items are in stock.
Here's the insight that took me a while to internalize: coroutineScope doesn't create a new coroutine. It creates a new scope within the current coroutine. The current coroutine suspends at the coroutineScope boundary until everything inside completes. This means it inherits the parent's context (including the dispatcher), so the async blocks inside run on whatever dispatcher the parent is using. If you need a specific dispatcher for the parallel work, wrap coroutineScope with withContext like in the example above.
Internally, coroutineScope creates a new Job as a child of the current coroutine's Job. Every async inside it creates child Jobs of that scope Job. When one child fails, the scope Job cancels all siblings and rethrows. This tree-based cancellation is what makes structured concurrency "structured": cleanup is automatic.
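That fail-fast behavior is easy to see in a self-contained sketch, with delay standing in for real work (demoFailFast is a hypothetical name):

```kotlin
import kotlinx.coroutines.*

// When the fast child throws, coroutineScope cancels the slow sibling and
// rethrows a single exception to the caller.
suspend fun demoFailFast(): String = try {
    coroutineScope {
        val slow = async {
            delay(5_000)               // cancelled long before it completes
            "slow result"
        }
        val failing = async<String> {
            delay(100)
            error("pricing API down")  // fails first
        }
        failing.await() + slow.await()
    }
} catch (e: IllegalStateException) {
    "caught: ${e.message}"             // only one exception surfaces
}
```

The whole call returns in about 100ms, not 5 seconds: the slow sibling's cancellation is part of the scope's contract, not something you wire up yourself.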
coroutineScope is great when all tasks must succeed together. But sometimes you have parallel work where one failure shouldn't cancel the others. A dashboard screen loading a user profile, a feed, and notifications: if the feed API is down, you still want to show the profile and notification count. This is where supervisorScope comes in.
class DashboardViewModel(
    private val userRepo: UserRepository,
    private val feedRepo: FeedRepository,
    private val statsRepo: StatsRepository
) : ViewModel() {
    fun loadDashboard() {
        viewModelScope.launch {
            supervisorScope {
                val profile = async {
                    runCatching { userRepo.getProfile() }
                }
                val feed = async {
                    runCatching { feedRepo.getLatestFeed() }
                }
                val stats = async {
                    runCatching { statsRepo.getWeeklyStats() }
                }
                // fold maps each Result to a Success or Error UI state
                _profileState.value = profile.await().fold(
                    onSuccess = { ProfileState.Success(it) },
                    onFailure = { ProfileState.Error(it) }
                )
                _feedState.value = feed.await().fold(
                    onSuccess = { FeedState.Success(it) },
                    onFailure = { FeedState.Error(it) }
                )
                _statsState.value = stats.await().fold(
                    onSuccess = { StatsState.Success(it) },
                    onFailure = { StatsState.Error(it) }
                )
            }
        }
    }
}
With supervisorScope, if feedRepo.getLatestFeed() throws, the profile and stats coroutines keep running. Each widget loads independently. The user sees a profile section, an error message where the feed would be, and a stats section: much better UX than a blank screen with a generic error.
The tradeoff is that you must handle every child's failure individually. With coroutineScope, a single try-catch around the whole block handles everything. With supervisorScope, you need runCatching or individual try-catch blocks around each await(). More resilience, more error-handling code. I use supervisorScope specifically for UI sections that are visually and logically independent. For backend transactions where partial success is dangerous, I stick with coroutineScope.
There's a common mistake I want to flag: don't use SupervisorJob() as a context element when you mean supervisorScope. I've seen code like coroutineScope { launch(SupervisorJob()) { ... } }, which actually breaks structured concurrency because SupervisorJob() creates a new root job that isn't a child of the scope's job. The coroutine won't be cancelled when the scope is cancelled, which defeats the purpose. If you need supervisor behavior, use supervisorScope { }: it creates the correct parent-child relationship while isolating failures.
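Here is a runnable sketch of that leak (the variable names are mine):

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicBoolean

// Passing SupervisorJob() re-parents the coroutine to a brand-new root job,
// so cancelling the enclosing scope no longer reaches it.
fun main() = runBlocking {
    val orphanRan = AtomicBoolean(false)
    val scope = CoroutineScope(Job())
    scope.launch(SupervisorJob()) {   // BROKEN: not a child of scope's Job
        delay(200)
        orphanRan.set(true)           // still executes after scope.cancel()
    }
    scope.cancel()                    // should have stopped all of scope's work
    delay(500)
    println("orphan ran: ${orphanRan.get()}") // prints "orphan ran: true"
}
```

A well-behaved child would have been cancelled at scope.cancel() and never set the flag.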
Processing a list of items in parallel is one of the most common patterns, and the most common source of resource-exhaustion bugs. The naive approach, list.map { async { process(it) } }.awaitAll(), launches every item concurrently. For a list of 10 items, that's fine. For 500 items hitting an API, you just DDoS'd your own backend.
// Dangerous for large lists: launches ALL items concurrently
suspend fun fetchAllArticles(ids: List<String>): List<Article> =
    coroutineScope {
        ids.map { id ->
            async { articleApi.fetchArticle(id) }
        }.awaitAll()
    }
For small lists with a known size, this pattern is perfectly fine and I use it regularly. But when the list size is dynamic or potentially large, you need to limit concurrency with a Semaphore.
suspend fun fetchAllArticles(
    ids: List<String>,
    maxConcurrency: Int = 5
): List<Article> = coroutineScope {
    val semaphore = Semaphore(maxConcurrency)
    ids.map { id ->
        async {
            semaphore.withPermit {
                articleApi.fetchArticle(id)
            }
        }
    }.awaitAll()
}
Semaphore(5) means at most 5 coroutines execute the API call at the same time. The other coroutines are launched but suspend at semaphore.withPermit until a permit becomes available. This gives you parallel processing with backpressure: you get the speed benefits without overwhelming the server or running out of connections.
I typically set maxConcurrency between 3 and 10 for network calls. For CPU-bound work on Dispatchers.Default, the dispatcher itself limits parallelism to the number of CPU cores, so a semaphore is less critical. But for IO-bound work on Dispatchers.IO, which defaults to a pool of 64 threads, a semaphore prevents you from opening 500 simultaneous HTTP connections.
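Since kotlinx.coroutines 1.6 there is also limitedParallelism, a dispatcher view that caps concurrent execution. A hedged sketch (fetchAllLimited is my own name, and the string result stands in for a real API call):

```kotlin
import kotlinx.coroutines.*

// A dispatcher-level alternative to Semaphore: at most 5 coroutines execute
// on this view of Dispatchers.IO at any moment.
@OptIn(ExperimentalCoroutinesApi::class) // required on older kotlinx.coroutines versions
suspend fun fetchAllLimited(ids: List<String>): List<String> = coroutineScope {
    val limitedIo = Dispatchers.IO.limitedParallelism(5)
    ids.map { id ->
        async(limitedIo) {
            // stand-in for articleApi.fetchArticle(id)
            "article-$id"
        }
    }.awaitAll()
}
```

The difference from Semaphore matters with suspending clients: limitedParallelism bounds threads actively executing, but a coroutine that suspends mid-request releases its slot, so a fully suspending HTTP client can still have more than 5 requests in flight. Semaphore.withPermit bounds in-flight operations regardless of suspension, which is usually what you want when rate-limiting a backend.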
One thing awaitAll() gives you that manual await() calls don't: if any single item fails, awaitAll() immediately throws, and because we're inside coroutineScope, all other in-flight items get cancelled. If you need partial results, where some items succeed and some fail, combine supervisorScope with runCatching inside each async:
suspend fun fetchArticlesSafely(
    ids: List<String>,
    maxConcurrency: Int = 5
): List<Result<Article>> = supervisorScope {
    val semaphore = Semaphore(maxConcurrency)
    ids.map { id ->
        async {
            semaphore.withPermit {
                runCatching { articleApi.fetchArticle(id) }
            }
        }
    }.awaitAll()
}
Now each item returns a Result<Article>, and failures in individual items don't cancel the rest. The caller decides how to handle partial success: retry the failures, show what succeeded, or treat any failure as a total failure.
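A small stdlib-only sketch of how a caller might split those results (summarize is a hypothetical helper, shown over strings rather than Article for self-containment):

```kotlin
// Splits a list of Results into the successful values and a failure count.
fun <T> summarize(results: List<Result<T>>): Pair<List<T>, Int> {
    val succeeded = results.mapNotNull { it.getOrNull() }
    val failed = results.count { it.isFailure }
    return succeeded to failed
}

fun main() {
    val results = listOf(
        Result.success("kotlin-article"),
        Result.failure(RuntimeException("timeout")),
        Result.success("coroutines-article")
    )
    val (ok, failures) = summarize(results)
    println("loaded ${ok.size}, failed $failures") // loaded 2, failed 1
}
```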
Sometimes you want parallel work with an overall deadline. Wrapping coroutineScope with withTimeout gives you this: if the deadline passes, all children are cancelled.
suspend fun loadDashboardFast(userId: String): DashboardData =
    withTimeout(3_000L) {
        coroutineScope {
            val profile = async { userRepo.getProfile(userId) }
            val orders = async { orderRepo.getRecent(userId) }
            DashboardData(profile.await(), orders.await())
        }
    }
If either API call is slow and the combined time exceeds 3 seconds, both get cancelled. For a fallback approach, swap withTimeout for withTimeoutOrNull and provide cached data when the deadline passes.
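A minimal runnable sketch of that fallback variant, with delay standing in for a slow load and a string for the cached value (withDeadlineOrCached is my own helper name):

```kotlin
import kotlinx.coroutines.*

// Returns the freshly loaded value if it beats the deadline, else the cached one.
suspend fun <T> withDeadlineOrCached(
    timeoutMs: Long,
    cached: () -> T,
    load: suspend CoroutineScope.() -> T
): T = withTimeoutOrNull(timeoutMs) { coroutineScope(load) } ?: cached()

fun main() = runBlocking {
    val value = withDeadlineOrCached(300L, cached = { "cached" }) {
        val slow = async { delay(1_000); "fresh" }  // misses the 300ms deadline
        slow.await()
    }
    println(value) // "cached": the slow load was cancelled at the deadline
}
```

One caveat: this only works cleanly when the cached value is non-null, since withTimeoutOrNull uses null as its timeout signal.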
When you have multiple ways to get the same data and want whichever responds first, use select with async:
suspend fun getConfig(): AppConfig = coroutineScope {
    val remote = async { configApi.fetchRemote() }
    val local = async { configDao.getCached() }
    select {
        remote.onAwait { it }
        local.onAwait { it }
    }.also {
        coroutineContext.cancelChildren()
    }
}
select suspends until one of the clauses completes, then returns that result. cancelChildren() cleans up the loser. This is useful for config loading where the local cache is fast but might be stale, and the remote source is authoritative but slow. Whichever arrives first wins.
For screens where you want to show data as it arrives rather than waiting for everything:
class SearchViewModel(
    private val webSearch: WebSearchRepository,
    private val localSearch: LocalSearchRepository,
    private val imageSearch: ImageSearchRepository
) : ViewModel() {
    fun search(query: String) {
        viewModelScope.launch {
            supervisorScope {
                launch {
                    val results = webSearch.search(query)
                    _webResults.value = results
                }
                launch {
                    val results = localSearch.search(query)
                    _localResults.value = results
                }
                launch {
                    val results = imageSearch.search(query)
                    _imageResults.value = results
                }
            }
        }
    }
}
Each launch updates its own state independently. The UI renders results as they arrive: local results might appear in 100ms, web results in 800ms, images in 1.5 seconds. I use launch instead of async here because we don't need a combined return value; each coroutine pushes its result directly to a StateFlow.
Question 1: You have three independent API calls inside coroutineScope using async. The second call throws an exception after the first has already completed. What happens?
Explanation:
coroutineScope enforces all-or-nothing semantics. When any child fails, it cancels all remaining children (the third call) and rethrows the exception. Even though the first call completed successfully, coroutineScope doesn't return partial results: the entire operation fails. Use supervisorScope if you need independent failure handling.
Question 2: What's wrong with this code?
val results = ids.map { id ->
    GlobalScope.async { api.fetch(id) }
}.awaitAll()
- GlobalScope.async is slower than regular async
- awaitAll() doesn't work with GlobalScope
- GlobalScope runs on the wrong dispatcher
- The coroutines have no parent Job and outlive the calling scope

Explanation:
GlobalScope creates coroutines with no parent Job. They aren't cancelled when the calling scope is cancelled, which brings back the fire-and-forget leaks that structured concurrency was designed to prevent. Use coroutineScope { ids.map { async { ... } }.awaitAll() } to keep everything inside a structured scope.
Build an ImageProcessor that takes a list of image URLs and processes them in parallel with a concurrency limit. Requirements:
- Accept a List<String> of URLs and a maxConcurrency parameter (default 4)
- Run at most maxConcurrency simultaneous operations
- Use supervisorScope so that one failed download doesn't cancel the others
- Return a List<ProcessedResult> where each result is either Success(bitmap) or Failure(url, error)

The solution should use Semaphore for concurrency limiting and withTimeout for per-item deadlines, all within a structured scope.
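If you want a starting point, here is one possible skeleton; Bitmap and the result types are placeholders, and the body is deliberately left as the exercise:

```kotlin
import kotlinx.coroutines.*

class Bitmap // placeholder for the real image type

sealed interface ProcessedResult {
    data class Success(val bitmap: Bitmap) : ProcessedResult
    data class Failure(val url: String, val error: Throwable) : ProcessedResult
}

class ImageProcessor(
    private val maxConcurrency: Int = 4,
    private val perItemTimeoutMs: Long = 10_000L
) {
    // Exercise: implement with supervisorScope + Semaphore(maxConcurrency)
    // + withTimeout(perItemTimeoutMs) around each download.
    suspend fun processAll(urls: List<String>): List<ProcessedResult> =
        TODO("see the requirements above")
}
```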
Thanks for reading!