
Android Performance Mastery

Optimize every layer β€” startup time, memory management, rendering performance, Compose optimization, battery efficiency, APK size, network tuning, threading, micro-optimizations, and profiling with Android Studio tools.

πŸ“Š Intermediate to Expert πŸ“¦ 10 Modules πŸ“– 55 Lessons ⏱️ 12 weeks
Performance Β· Android Β· Optimization
What You'll Learn
βœ“ Reduce app startup time with lazy initialization, Baseline Profiles, and App Startup library
βœ“ Profile and fix memory leaks using LeakCanary, Memory Profiler, and heap analysis
βœ“ Optimize rendering β€” eliminate jank, overdraw, and layout bottlenecks in Views and Compose
βœ“ Master Jetpack Compose performance β€” stability, recomposition skipping, and lazy layouts
βœ“ Shrink APK size with R8, resource optimization, and dynamic delivery
βœ“ Improve battery efficiency with WorkManager, job scheduling, and wake lock management
βœ“ Optimize network calls with caching, compression, and connection pooling
βœ“ Use Android Studio Profiler, Perfetto, systrace, and Macrobenchmark for data-driven optimization
βœ“ Apply micro-optimizations β€” data structures, inline functions, object pooling, and allocation reduction
βœ“ Tune database performance with Room indexing, transactions, and query optimization
Course Curriculum
10 modules Β· 55 lessons Β· 12 weeks

Module 1: Performance Fundamentals and Profiling Mindset

Performance isn’t about micro-optimizations or cargo-culting best practices from blog posts. It’s about understanding where time and resources are actually spent and making data-driven decisions. The fastest code is code you never write, and the most impactful optimization is the one that addresses the real bottleneck β€” not the one you guessed was slow. This module builds the mental framework you need before touching a single line of code.

Lesson 1.1: The Performance Budget

Every app has a performance budget whether you define one or not. Users define it with their patience. Google’s research famously found that 53% of mobile site visits are abandoned when loading takes longer than 3 seconds, and app users are no more forgiving. The budget isn’t arbitrary β€” it’s rooted in human perception thresholds that have been studied for decades.

The core budgets break down into measurable categories. App startup (cold start) should complete in under 500ms to feel instant β€” anything over 2 seconds feels broken and triggers abandonment. Frame rendering gets exactly 16.67ms per frame at 60fps, or 11.11ms at 90fps. Miss that window and users see jank β€” stuttering animations that destroy the perception of quality. Touch response latency under 100ms feels instantaneous, while anything over 300ms feels laggy and unresponsive. Network requests should deliver first meaningful content within 1-2 seconds on a good connection.

APK size is a budget most developers ignore, but it directly impacts acquisition. Google’s own data shows that every 6MB increase in APK size reduces install conversion by approximately 1%. For an app with millions of potential installs, that translates to tens of thousands of lost users. Beyond initial download, larger APKs consume more storage, take longer to update, and use more bandwidth β€” hitting users in emerging markets the hardest.
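To make that concrete, here is a back-of-the-envelope sketch of the install math, using the roughly 1% per 6MB figure above (the numbers are illustrative):

// Rough estimate of installs lost to APK growth,
// based on the ~1% conversion drop per 6MB cited above (illustrative)
fun estimatedLostInstalls(apkGrowthMb: Double, potentialInstalls: Long): Long {
    val conversionDropPercent = apkGrowthMb / 6.0
    return (potentialInstalls * conversionDropPercent / 100.0).toLong()
}

// 18MB of bloat against 2 million potential installs:
// estimatedLostInstalls(18.0, 2_000_000) == 60_000 lost users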

// Define your performance budgets as constants
object PerformanceBudget {
    const val COLD_START_MAX_MS = 500L
    const val WARM_START_MAX_MS = 200L
    const val FRAME_BUDGET_60FPS_MS = 16.67
    const val FRAME_BUDGET_90FPS_MS = 11.11
    const val TOUCH_RESPONSE_MAX_MS = 100L
    const val NETWORK_FIRST_CONTENT_MAX_MS = 2000L
    const val APK_SIZE_MAX_MB = 15

    fun isFrameDropped(renderTimeMs: Double, targetFps: Int = 60): Boolean {
        val budget = 1000.0 / targetFps
        return renderTimeMs > budget
    }
}

Setting explicit budgets changes how your team works. Instead of vague goals like β€œmake it faster,” you get testable criteria: β€œcold start must stay under 500ms on a Pixel 4a.” You can automate these checks in CI, catch regressions before they reach users, and make performance a first-class feature rather than an afterthought.
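As one example of such a CI check, a plain JUnit test can assert the APK size budget on every build. This is a sketch: the artifact path is hypothetical and depends on your build setup.

import org.junit.Assert.assertTrue
import org.junit.Test
import java.io.File

class ApkSizeBudgetTest {
    @Test
    fun apkStaysWithinBudget() {
        // Hypothetical path: point this at your release artifact in CI
        val apk = File("app/build/outputs/apk/release/app-release.apk")
        val sizeMb = apk.length() / (1024.0 * 1024.0)
        assertTrue(
            "APK is %.1fMB, budget is ${PerformanceBudget.APK_SIZE_MAX_MB}MB".format(sizeMb),
            sizeMb <= PerformanceBudget.APK_SIZE_MAX_MB
        )
    }
}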

Key takeaway: Define explicit, measurable performance budgets for startup time, frame rendering, touch response, network latency, and APK size. Measure against these budgets continuously β€” automated performance tests catch regressions before users feel them.

Lesson 1.2: Android Studio Profiler β€” Your Primary Weapon

The Android Studio Profiler is the single most important tool in your performance toolkit. It provides real-time visibility into CPU usage, memory allocation, network activity, and energy consumption β€” all synchronized on a single timeline. You access it through View β†’ Tool Windows β†’ Profiler, or by clicking the Profile button (not Run) to launch your app with profiling enabled.

The profiler operates in two modes. Standard profiling attaches to a running process and provides basic metrics with minimal overhead. Advanced profiling enables detailed tracking of network requests, event timelines, and custom events, but adds more overhead. For release builds, you need to mark your app as profileable in the manifest to enable profiling without debug overhead β€” this is critical because debug builds run 3-10x slower than release builds, making profiling results misleading.

// In AndroidManifest.xml β€” make release builds profileable
// <application>
//     <profileable android:shell="true"
//         tools:targetApi="29" />
// </application>

// Custom trace sections for profiling specific code paths
import android.os.Trace

fun loadUserProfile(userId: String): UserProfile {
    Trace.beginSection("loadUserProfile")
    try {
        Trace.beginSection("fetchFromCache")
        val cached = cache.get(userId)
        Trace.endSection() // fetchFromCache

        if (cached != null) return cached

        Trace.beginSection("fetchFromNetwork")
        val profile = api.fetchProfile(userId)
        Trace.endSection() // fetchFromNetwork

        Trace.beginSection("saveToCache")
        cache.put(userId, profile)
        Trace.endSection() // saveToCache

        return profile
    } finally {
        Trace.endSection() // loadUserProfile
    }
}

The CPU Profiler deserves special attention because it reveals thread behavior through color-coded timelines. Green means the thread is actively executing on the CPU. Yellow means the thread is runnable but waiting β€” for I/O completion, lock acquisition, or CPU availability. Grey indicates the thread is sleeping or inactive. Red signals a blocked or deadlocked thread. When you see your main thread spending significant time in yellow or red states, you’ve found a performance problem β€” something is blocking the UI thread that shouldn’t be.

The profiler supports two tracing mechanisms: System Trace captures low-level system events across all processes, showing you the full picture of what the device is doing. Method Trace records detailed call stacks with execution times, letting you drill into exactly which methods consume the most time. System Trace has lower overhead and is better for identifying general bottlenecks. Method Trace has higher overhead but gives you the precision to find the exact line of code causing problems.

Key takeaway: Always profile with release (or profileable) builds β€” debug builds are 3-10x slower and produce misleading results. Use System Trace for broad performance analysis and Method Trace for drilling into specific bottlenecks.

Lesson 1.3: Perfetto and System Tracing

Perfetto is the successor to systrace and provides the most detailed system-wide performance analysis available on Android. Unlike the Android Studio Profiler which focuses on your app, Perfetto captures everything happening on the device β€” CPU scheduling, GPU rendering, kernel events, binder transactions, and your app’s trace points β€” all on a synchronized timeline. This matters because performance problems often cross process boundaries.

You capture Perfetto traces through the command line or through Android Studio’s System Trace feature. The trace files can be analyzed in the Perfetto UI (ui.perfetto.dev) which provides a powerful web-based visualization. For automated capture, you can use the perfetto command-line tool on the device or the Record Trace feature built into developer options.
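For reference, a typical command-line capture looks like this (the categories and duration are examples; adjust them to what you are investigating):

// Record a 10-second system trace with common categories:
// adb shell perfetto -o /data/misc/perfetto-traces/trace.perfetto-trace \
//     -t 10s sched freq idle am wm gfx view
//
// Pull the trace and open it at ui.perfetto.dev:
// adb pull /data/misc/perfetto-traces/trace.perfetto-trace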

// Adding custom trace points that appear in Perfetto
import androidx.tracing.Trace as AndroidXTrace

// Simple inline tracing
inline fun <T> trace(label: String, block: () -> T): T {
    AndroidXTrace.beginSection(label)
    try {
        return block()
    } finally {
        AndroidXTrace.endSection()
    }
}

// Usage β€” these sections appear in Perfetto timeline
fun initializeApp() {
    trace("App.initDagger") {
        daggerComponent = DaggerAppComponent.create()
    }

    trace("App.initDatabase") {
        database = Room.databaseBuilder(
            context,
            AppDatabase::class.java,
            "app.db"
        ).build()
    }

    trace("App.initNetworking") {
        retrofit = Retrofit.Builder()
            .baseUrl(BASE_URL)
            .client(okHttpClient)
            .build()
    }
}

When analyzing Perfetto traces, focus on the main thread’s frame timeline. Each frame is represented as a slice β€” green frames completed within budget, red frames missed the deadline. Click on a red frame to see exactly what work was happening during that frame. You can trace through the entire rendering pipeline: Choreographer#doFrame β†’ traversal β†’ measure β†’ layout β†’ draw β†’ syncAndDraw. Any slice that takes too long identifies exactly where the bottleneck is.

Perfetto also reveals scheduling issues that are invisible in app-level profiling. If your main thread is in a runnable state (waiting for CPU) for several milliseconds, the bottleneck isn’t your code β€” it’s CPU contention from other processes. If you see frequent context switches, your thread is being preempted. These system-level insights are impossible to get from the Android Studio Profiler alone.

Key takeaway: Perfetto provides system-wide visibility that app-level profilers cannot match. Use custom trace sections liberally in your code β€” they have negligible overhead in release builds and provide invaluable data when you need to diagnose performance issues.

Lesson 1.4: Macrobenchmark and Microbenchmark Libraries

Android’s Jetpack Benchmark libraries let you automate performance measurement and catch regressions in CI. There are two libraries serving different purposes. Macrobenchmark measures app-level operations β€” startup time, scroll performance, animation smoothness β€” by driving your real app through instrumentation tests. Microbenchmark measures code-level performance β€” function execution time, allocation counts β€” in an isolated test environment.

Macrobenchmark is the more impactful of the two. It launches your actual app (not a test harness), performs real user interactions, and measures real-world metrics like time-to-initial-display, time-to-full-display, and frame timing during scrolls. Because it runs your release build, the measurements reflect what users actually experience.

// Macrobenchmark β€” measure cold startup time
@RunWith(AndroidJUnit4::class)
class StartupBenchmark {

    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun startupColdCompilation() {
        benchmarkRule.measureRepeated(
            packageName = "com.example.myapp",
            metrics = listOf(StartupTimingMetric()),
            compilationMode = CompilationMode.None(), // worst case
            iterations = 5,
            startupMode = StartupMode.COLD,
            setupBlock = {
                pressHome()
                dropKernelPageCache()
            }
        ) {
            startActivityAndWait()
        }
    }

    @Test
    fun scrollPerformance() {
        benchmarkRule.measureRepeated(
            packageName = "com.example.myapp",
            metrics = listOf(FrameTimingMetric()),
            compilationMode = CompilationMode.Partial(),
            iterations = 5,
            startupMode = StartupMode.WARM
        ) {
            startActivityAndWait()
            val list = device.findObject(By.res("feed_list"))
            list.setGestureMargin(device.displayWidth / 5)
            list.fling(Direction.DOWN)
            list.fling(Direction.DOWN)
            list.fling(Direction.UP)
        }
    }
}

// Microbenchmark β€” measure function execution time
@RunWith(AndroidJUnit4::class)
class JsonParsingBenchmark {

    @get:Rule
    val benchmarkRule = BenchmarkRule()

    @Test
    fun benchmarkMoshiParsing() {
        val json = loadTestJson()
        val adapter = moshi.adapter(UserResponse::class.java)

        benchmarkRule.measureRepeated {
            val result = adapter.fromJson(json)
        }
    }

    @Test
    fun benchmarkKotlinxSerializationParsing() {
        val json = loadTestJson()

        benchmarkRule.measureRepeated {
            val result = Json.decodeFromString<UserResponse>(json)
        }
    }
}

Integrate Macrobenchmark into your CI pipeline to catch performance regressions automatically. Each benchmark run produces metrics that you can compare against previous runs. If cold startup time increases by more than 50ms between commits, your CI fails and the team investigates before the regression reaches production. This approach transforms performance from a reactive firefight into a proactive quality gate.
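A minimal sketch of such a gate is shown below. It assumes the benchmarkData.json layout that Macrobenchmark writes to the test output directory and the org.json library on the CI side; verify the exact structure against your library version.

import java.io.File
import org.json.JSONObject

// Fail CI if median cold start regresses beyond a tolerance
// against a stored baseline value
fun checkStartupRegression(
    resultFile: File, // benchmarkData.json from the benchmark run
    baselineMedianMs: Double,
    toleranceMs: Double = 50.0
) {
    val benchmarks = JSONObject(resultFile.readText()).getJSONArray("benchmarks")
    for (i in 0 until benchmarks.length()) {
        val benchmark = benchmarks.getJSONObject(i)
        if (benchmark.getString("name") != "startupColdCompilation") continue
        val medianMs = benchmark.getJSONObject("metrics")
            .getJSONObject("timeToInitialDisplayMs")
            .getDouble("median")
        check(medianMs <= baselineMedianMs + toleranceMs) {
            "Cold start regressed: ${medianMs}ms vs baseline ${baselineMedianMs}ms"
        }
    }
}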

The key difference between the two libraries is scope and overhead. Macrobenchmark runs your full app on a real device with minimal instrumentation overhead β€” the measurements are realistic. Microbenchmark runs isolated code in a test harness with JIT warmup and garbage collection pauses factored out β€” the measurements are precise but don’t reflect real-world conditions. Use Macrobenchmark for user-facing metrics and Microbenchmark for comparing implementation alternatives.

Key takeaway: Macrobenchmark measures what users experience β€” startup time, scroll smoothness, animation quality. Microbenchmark measures isolated code performance. Integrate both into CI to catch regressions automatically before they reach production.

Lesson 1.5: Baseline Profiles and Startup Profiles

Baseline Profiles are one of the highest-impact performance optimizations available on Android. They improve code execution speed by approximately 30% from first launch by telling the Android Runtime (ART) which code paths to pre-compile using Ahead-of-Time (AOT) compilation. Without Baseline Profiles, ART interprets your code on first run and gradually JIT-compiles hot paths over time β€” meaning your app is slowest when users first install it, exactly when first impressions matter most.

A Baseline Profile is a text file listing methods and classes that should be pre-compiled. You generate it by running your app through critical user journeys β€” startup, navigation, scrolling, common actions β€” and recording which code paths execute. The profile is shipped with your APK to the Play Store. During installation, ART reads the profile and AOT-compiles the listed methods, so they execute at full speed from the very first launch.
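The profile format itself is human-readable. Each line is a class or method rule, prefixed with flags for Hot, Startup, and Post-startup (the class and method names below are illustrative):

// baseline-prof.txt (excerpt)
// H = hot method, S = executed during startup, P = executed post-startup
//
// HSPLcom/example/myapp/MainActivity;->onCreate(Landroid/os/Bundle;)V
// HSLcom/example/myapp/feed/FeedRepository;->getCachedFeed()Ljava/util/List;
// Lcom/example/myapp/feed/FeedScreenKt;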

// BaselineProfileGenerator.kt β€” generates baseline profiles
@RunWith(AndroidJUnit4::class)
class BaselineProfileGenerator {

    @get:Rule
    val rule = BaselineProfileRule()

    @Test
    fun generateBaselineProfile() {
        rule.collect(
            packageName = "com.example.myapp",
            maxIterations = 5,
            stableIterations = 3
        ) {
            // Cold start
            pressHome()
            startActivityAndWait()

            // Navigate to key screens
            device.findObject(By.text("Feed")).click()
            device.waitForIdle()

            // Scroll the main list
            val list = device.findObject(By.res("feed_list"))
            list.fling(Direction.DOWN)
            list.fling(Direction.DOWN)

            // Navigate to detail
            device.findObject(By.res("item_card")).click()
            device.waitForIdle()

            // Navigate to search
            device.findObject(By.res("search_button")).click()
            device.waitForIdle()
            device.findObject(By.res("search_input")).text = "test"
            device.waitForIdle()
        }
    }
}

Startup Profiles are closely related but serve a different purpose. While Baseline Profiles optimize runtime execution speed, Startup Profiles optimize the DEX file layout to improve startup time. They tell the build system which classes are loaded during startup, so those classes can be grouped together in the DEX file. This reduces the number of page faults during startup because classes that load together are stored together on disk.

Both profiles work together for maximum impact. A well-configured app ships both a Baseline Profile (for runtime speed) and a Startup Profile (for DEX layout optimization). Libraries can also ship their own Baseline Profiles β€” AndroidX libraries already include them. When your app depends on libraries with Baseline Profiles, the profiles are merged during build time, so the entire dependency graph benefits from pre-compilation.

// build.gradle.kts β€” configure baseline profiles
plugins {
    id("com.android.application")
    id("androidx.baselineprofile")
}

android {
    // ...
}

// Profile-guided optimization settings β€” this extension is added by the
// androidx.baselineprofile plugin and lives at the top level, not inside android {}
baselineProfile {
    automaticGenerationDuringBuild = true
    dexLayoutOptimization = true // Startup Profile
}

dependencies {
    baselineProfile(project(":baselineprofile"))
}

Key takeaway: Baseline Profiles deliver a 30% speed improvement from the very first launch by pre-compiling hot code paths with AOT compilation. Combine them with Startup Profiles for DEX layout optimization. Generate profiles from real user journeys and ship them with every release.

Quiz: Performance Fundamentals

What is the frame budget at 60fps?

  • ❌ 33.33ms
  • βœ… 16.67ms
  • ❌ 8.33ms
  • ❌ 11.11ms

At 60 frames per second, each frame has exactly 1000ms / 60 = 16.67ms to complete all work including layout, drawing, and GPU rendering. Missing this budget causes visible jank.

What does a yellow thread state indicate in the CPU Profiler?

  • ❌ The thread is actively executing code
  • ❌ The thread is sleeping
  • βœ… The thread is runnable but waiting for CPU, I/O, or locks
  • ❌ The thread is deadlocked

Yellow indicates a runnable state β€” the thread has work to do but is waiting for resources. This could mean I/O blocking, lock contention, or CPU scheduling delays. Excessive yellow on the main thread signals a performance problem.

What do Baseline Profiles optimize?

  • ❌ Build time
  • ❌ APK size
  • βœ… Code execution speed from first launch using AOT compilation
  • ❌ Network request latency

Baseline Profiles ship pre-compiled method lists with the APK. ART uses these to AOT-compile critical code paths during installation, delivering approximately 30% faster execution from the very first app launch.

Which tracing mechanism has lower overhead?

  • βœ… System Trace
  • ❌ Method Trace
  • ❌ Both have identical overhead
  • ❌ Neither β€” tracing always has high overhead

System Trace captures system-level events with minimal overhead, making it suitable for production-like profiling. Method Trace records every method entry and exit, adding significant overhead but providing more detailed information.

Why should you profile with release builds?

  • ❌ Debug builds don’t support profiling
  • ❌ Release builds have more features
  • βœ… Debug builds run 3-10x slower, producing misleading results
  • ❌ The profiler only works with signed APKs

Debug builds disable optimizations, enable additional runtime checks, and run interpreted bytecode. This makes them 3-10x slower than release builds, meaning performance bottlenecks you find in debug may not exist in release, and real bottlenecks may be masked.

Coding Challenge: Custom Performance Tracker

Build a lightweight performance tracker that measures and logs method execution times, detects dropped frames, and reports violations of your performance budget.

// Challenge: Implement a PerformanceTracker that:
// 1. Tracks method execution time with custom trace sections
// 2. Detects when execution exceeds a specified budget
// 3. Logs warnings for budget violations
// 4. Supports nested tracking
// 5. Reports statistics (min, max, average) for tracked sections

Solution

import android.os.Trace
import android.util.Log
import java.util.Collections
import java.util.concurrent.ConcurrentHashMap

class PerformanceTracker private constructor() {

    private val metrics = ConcurrentHashMap<String, MutableList<Long>>()
    private val activeTraces = ThreadLocal<ArrayDeque<TraceEntry>>()

    data class TraceEntry(
        val label: String,
        val startTimeNs: Long,
        val budgetMs: Long
    )

    data class Stats(
        val label: String,
        val count: Int,
        val minMs: Double,
        val maxMs: Double,
        val avgMs: Double,
        val p95Ms: Double
    )

    fun beginSection(label: String, budgetMs: Long = Long.MAX_VALUE) {
        Trace.beginSection(label)
        val stack = activeTraces.get() ?: ArrayDeque<TraceEntry>().also {
            activeTraces.set(it)
        }
        stack.addLast(TraceEntry(label, System.nanoTime(), budgetMs))
    }

    fun endSection(): Long {
        val stack = activeTraces.get()
            ?: throw IllegalStateException("No active trace section")
        val entry = stack.removeLast()
        Trace.endSection()

        val durationNs = System.nanoTime() - entry.startTimeNs
        val durationMs = durationNs / 1_000_000L

        // Synchronized list: sections may end on different threads
        metrics.getOrPut(entry.label) {
            Collections.synchronizedList(mutableListOf())
        }.add(durationMs)

        if (durationMs > entry.budgetMs) {
            Log.w(
                TAG,
                "PERF VIOLATION: ${entry.label} took ${durationMs}ms " +
                    "(budget: ${entry.budgetMs}ms, exceeded by ${durationMs - entry.budgetMs}ms)"
            )
        }

        return durationMs
    }

    inline fun <T> track(
        label: String,
        budgetMs: Long = Long.MAX_VALUE,
        block: () -> T
    ): T {
        beginSection(label, budgetMs)
        try {
            return block()
        } finally {
            endSection()
        }
    }

    fun getStats(label: String): Stats? {
        val durations = metrics[label] ?: return null
        // Snapshot under the list's lock so sorting sees a consistent view
        val sorted = synchronized(durations) { durations.toList() }.sorted()
        return Stats(
            label = label,
            count = sorted.size,
            minMs = sorted.first().toDouble(),
            maxMs = sorted.last().toDouble(),
            avgMs = sorted.average(),
            p95Ms = sorted[(sorted.size * 0.95).toInt().coerceAtMost(sorted.lastIndex)].toDouble()
        )
    }

    fun reportAll(): String = buildString {
        appendLine("=== Performance Report ===")
        metrics.keys.sorted().forEach { label ->
            getStats(label)?.let { stats ->
                appendLine(
                    "${stats.label}: count=${stats.count}, " +
                        "min=${stats.minMs}ms, max=${stats.maxMs}ms, " +
                        "avg=${"%.2f".format(stats.avgMs)}ms, " +
                        "p95=${stats.p95Ms}ms"
                )
            }
        }
    }

    fun clear() = metrics.clear()

    companion object {
        private const val TAG = "PerfTracker"
        val instance by lazy { PerformanceTracker() }
    }
}

Module 2: App Startup Optimization

App startup is the first impression your app makes. Users form quality judgments within the first second of interaction, and a slow startup poisons every subsequent experience. Android measures three types of startup: cold start (process not running, slowest), warm start (process exists but Activity needs recreation), and hot start (Activity exists, just brought to foreground). This module focuses on cold start because it’s the worst case and the one that benefits most from optimization.

Lesson 2.1: Understanding Cold Start Internals

A cold start is the most expensive operation your app performs. The system must create a new process, initialize the Application object, create the launch Activity, inflate the layout, and render the first frame. Understanding each phase is essential because different optimizations target different phases β€” and optimizing the wrong phase wastes effort.

The cold start sequence begins when the user taps your app icon. The system forks the Zygote process (a pre-warmed process with common framework classes already loaded), which creates your app’s process. Then the system calls Application.onCreate(), where most apps initialize their dependency injection framework, networking libraries, analytics SDKs, and crash reporting tools. This is typically the longest phase of startup and the one you have the most control over.

After Application.onCreate() completes, the system creates your launch Activity, calls onCreate(), onStart(), and onResume(), inflates the layout, measures and lays out the view hierarchy, and draws the first frame. The time from process creation to the first frame being visible is reported as β€œTime to Initial Display” (TTID). The time until the app is fully interactive β€” all data loaded, all views populated β€” is β€œTime to Full Display” (TTFD).

// Reporting Time to Full Display manually
class MainActivity : ComponentActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // TTID is reported automatically at first frame; TTFD is reported
        // only when we explicitly call reportFullyDrawn() below
        setContentView(R.layout.activity_main)

        lifecycleScope.launch {
            val userData = loadUserData()
            val feedItems = loadFeedItems()

            displayContent(userData, feedItems)

            // Signal that the app is fully drawn and interactive
            reportFullyDrawn()
        }
    }
}

You can measure cold start time using adb commands. The am start-activity -W command launches your app, waits for the launch to complete, and reports the launch state (COLD, WARM, or HOT) plus two timings: TotalTime (from intent launch to first frame) and WaitTime (TotalTime plus any time the system spent switching processes). Run this multiple times and average the results β€” individual runs vary due to system load, disk caching, and thermal throttling.

// Measure cold start from the command line:
// adb shell am force-stop com.example.myapp
// adb shell am start-activity -W -n com.example.myapp/.MainActivity
//
// Output:
// Status: ok
// LaunchState: COLD
// Activity: com.example.myapp/.MainActivity
// TotalTime: 487
// WaitTime: 489

Key takeaway: Cold start involves process creation, Application initialization, Activity creation, layout inflation, and first frame rendering. Measure TTID and TTFD separately β€” TTID tells you when users see content, TTFD tells you when they can interact with it.

Lesson 2.2: Lazy Initialization and Deferred Loading

The single most effective startup optimization is doing less work during startup. Every library initialization, database open, and network client creation that happens in Application.onCreate() directly delays the first frame. The solution is lazy initialization β€” defer work until it’s actually needed.

Kotlin’s lazy delegate is purpose-built for this pattern. A lazy property isn’t initialized until it’s first accessed. The initialization code runs exactly once, the result is cached, and subsequent accesses return the cached value. By default, lazy uses LazyThreadSafetyMode.SYNCHRONIZED, which adds locking overhead. If you know the property will only be accessed from one thread, use LazyThreadSafetyMode.NONE to avoid the synchronization cost.

class MyApplication : Application() {

    // ❌ Eager initialization β€” all of this runs during startup
    // val database = Room.databaseBuilder(this, AppDb::class.java, "app.db").build()
    // val retrofit = Retrofit.Builder().baseUrl(URL).build()
    // val analytics = Analytics.init(this)

    // βœ… Lazy initialization β€” runs only when first accessed
    val database by lazy {
        Room.databaseBuilder(this, AppDatabase::class.java, "app.db")
            .build()
    }

    val retrofit by lazy {
        Retrofit.Builder()
            .baseUrl(BASE_URL)
            .client(okHttpClient)
            .addConverterFactory(MoshiConverterFactory.create())
            .build()
    }

    val analytics by lazy(LazyThreadSafetyMode.NONE) {
        Analytics.Builder(this, ANALYTICS_KEY)
            .trackApplicationLifecycleEvents()
            .build()
    }

    override fun onCreate() {
        super.onCreate()
        // Only initialize what's absolutely required for first frame
        initCrashReporting() // Must be first to catch startup crashes
        initStrictMode()     // Debug-only, zero cost in release
    }
}

For dependency injection frameworks like Hilt or Dagger, the component creation itself can be expensive because it instantiates all eagerly-bound dependencies. Use @Inject lateinit var and provider injection (Provider<T> or Lazy<T> from Dagger) to defer object creation. A Provider<T> returns a new instance each time you call get() for unscoped bindings (scoped bindings return the cached instance), while Lazy<T> creates the instance on first get() and caches it β€” similar to Kotlin’s lazy but integrated with the DI graph.
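In code, that looks like the following sketch (Analytics and ReportGenerator are stand-ins for your own injected types):

import dagger.Lazy
import javax.inject.Inject
import javax.inject.Provider

class ReportsPresenter @Inject constructor(
    // Created on first get(), then cached: defers expensive construction
    private val analytics: Lazy<Analytics>,
    // New instance per get() for unscoped bindings
    private val reportGenerator: Provider<ReportGenerator>
) {
    fun onUserAction() {
        // Analytics is constructed here, not at injection time
        analytics.get().track("user_action")
    }

    fun exportReport() {
        val generator = reportGenerator.get() // fresh instance each call
        generator.generate()
    }
}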

Content providers are a hidden startup cost. Every ContentProvider declared in your manifest (including those from libraries) has its onCreate() called before Application.onCreate(). Some libraries use ContentProviders for automatic initialization β€” Firebase, WorkManager, and others. The App Startup library solves this by consolidating multiple initializations into a single ContentProvider, reducing the overhead.

// App Startup library β€” consolidate ContentProvider initializations
class WorkManagerInitializer : Initializer<WorkManager> {
    override fun create(context: Context): WorkManager {
        val config = Configuration.Builder()
            .setMinimumLoggingLevel(Log.INFO)
            .build()
        WorkManager.initialize(context, config)
        return WorkManager.getInstance(context)
    }

    override fun dependencies(): List<Class<out Initializer<*>>> = emptyList()
}

class AnalyticsInitializer : Initializer<Analytics> {
    override fun create(context: Context): Analytics {
        return Analytics.Builder(context, BuildConfig.ANALYTICS_KEY)
            .build()
    }

    // Analytics depends on WorkManager being initialized first
    override fun dependencies(): List<Class<out Initializer<*>>> =
        listOf(WorkManagerInitializer::class.java)
}
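Initializers are registered through App Startup’s single InitializationProvider entry in the manifest. Initializers that are reachable via dependencies() need no meta-data entry of their own (manifest shown as comments, following the snippet style above):

// AndroidManifest.xml
// <provider
//     android:name="androidx.startup.InitializationProvider"
//     android:authorities="${applicationId}.androidx-startup"
//     android:exported="false"
//     tools:node="merge">
//     <meta-data
//         android:name="com.example.myapp.AnalyticsInitializer"
//         android:value="androidx.startup" />
//     <!-- WorkManagerInitializer is pulled in via dependencies(),
//          so it needs no meta-data entry of its own -->
// </provider>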

Key takeaway: Move every initialization that isn’t required for the first frame to lazy or deferred loading. Use Kotlin’s lazy delegate, Dagger’s Lazy<T> and Provider<T>, and the App Startup library to consolidate ContentProvider initializations.

Lesson 2.3: Splash Screen Optimization

Android 12 introduced a standardized splash screen API, replacing the old pattern of a dedicated SplashActivity with a theme-driven approach. The new API shows a splash screen automatically during cold and warm starts, giving you a branded loading experience while your app initializes. Understanding how to work with this API β€” not against it β€” is key to a smooth startup experience.

The splash screen uses your app’s theme to display an icon and background color while the app loads. It animates away once the first frame is drawn. You can customize the icon, background color, icon background, and the exit animation. The key optimization is controlling when the splash screen dismisses β€” you want it to stay visible until your content is ready, not until the framework decides to remove it.

class MainActivity : ComponentActivity() {

    private var isReady = false

    override fun onCreate(savedInstanceState: Bundle?) {
        val splashScreen = installSplashScreen()
        super.onCreate(savedInstanceState)

        // Keep the splash screen visible until data is loaded
        splashScreen.setKeepOnScreenCondition { !isReady }

        // Customize the exit animation
        splashScreen.setOnExitAnimationListener { splashScreenView ->
            val fadeOut = ObjectAnimator.ofFloat(
                splashScreenView.view,
                View.ALPHA,
                1f, 0f
            ).apply {
                duration = 300L
                interpolator = DecelerateInterpolator()
                doOnEnd { splashScreenView.remove() }
            }
            fadeOut.start()
        }

        setContent {
            val viewModel: MainViewModel = viewModel()
            val uiState by viewModel.uiState.collectAsStateWithLifecycle()

            LaunchedEffect(uiState) {
                if (uiState is UiState.Ready) {
                    isReady = true
                }
            }

            MyAppTheme {
                MainScreen(uiState)
            }
        }
    }
}

The splash screen introduces a tradeoff. Keeping it visible too long frustrates users who want to see content. Removing it too early shows an empty or loading screen that feels broken. The optimal approach is to preload the minimum data needed for a meaningful first screen β€” user profile, cached feed items, feature flags β€” and dismiss the splash screen once that data is available. Background-load everything else after the splash screen is gone.

A common mistake is performing expensive operations synchronously before calling setContentView() or setContent(). This blocks the first frame and extends the splash screen duration unnecessarily. Instead, start async loading immediately in onCreate(), show a lightweight skeleton UI as your first frame, and populate the real content when data arrives. Users perceive a skeleton screen as faster than a splash screen because it signals that content is loading.

@HiltViewModel
class MainViewModel @Inject constructor(
    private val userRepository: UserRepository,
    private val feedRepository: FeedRepository
) : ViewModel() {

    private val _uiState = MutableStateFlow<UiState>(UiState.Loading)
    val uiState: StateFlow<UiState> = _uiState.asStateFlow()

    init {
        viewModelScope.launch {
            try {
                // Load cached data first for fast display
                val cachedUser = userRepository.getCachedUser()
                val cachedFeed = feedRepository.getCachedFeed()

                if (cachedUser != null && cachedFeed.isNotEmpty()) {
                    _uiState.value = UiState.Ready(cachedUser, cachedFeed)
                }

                // Then refresh from network in background
                val freshUser = userRepository.refreshUser()
                val freshFeed = feedRepository.refreshFeed()
                _uiState.value = UiState.Ready(freshUser, freshFeed)

            } catch (e: Exception) {
                _uiState.value = UiState.Error(e.message ?: "Unknown error")
            }
        }
    }
}

Key takeaway: Use the Android 12 splash screen API with setKeepOnScreenCondition to hold the splash screen until your minimum viable content is loaded. Show cached data first for instant display, then refresh from the network in the background.

Lesson 2.4: Window and Theme Optimization

Before your first Activity even starts inflating layouts, the system draws a preview window using your Activity’s theme. This is the β€œstarting window” that appears immediately when the user taps your app icon. Optimizing this window creates the perception of instant launch even when your actual initialization takes hundreds of milliseconds.

The starting window uses your theme’s windowBackground attribute. If your theme has a white or colored background, users see that color instantly. If you set windowBackground to a drawable that matches your app’s first screen layout β€” a toolbar area at the top, content area below β€” the transition from starting window to actual content feels seamless. This is the technique that makes apps feel like they launch in zero time.

// res/values/themes.xml
// <style name="Theme.MyApp.Starting" parent="Theme.Material3.DayNight">
//     <item name="android:windowBackground">@drawable/starting_window</item>
//     <item name="android:windowDrawsSystemBarBackgrounds">true</item>
//     <item name="android:statusBarColor">@color/primary</item>
// </style>
//
// res/drawable/starting_window.xml
// <layer-list>
//     <item android:drawable="@color/surface" />
//     <item android:gravity="center"
//           android:width="48dp"
//           android:height="48dp"
//           android:drawable="@drawable/ic_app_logo" />
// </layer-list>

// In Activity β€” switch to the real theme before setContentView
class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        // Switch from starting window theme to actual app theme
        setTheme(R.style.Theme_MyApp)
        super.onCreate(savedInstanceState)
        setContent { MyApp() }
    }
}

Layout inflation is another significant startup cost. Complex XML layouts with deep view hierarchies take longer to inflate because each view must be instantiated through reflection (LayoutInflater resolves view classes by name and invokes their constructors reflectively, caching constructors after the first lookup), configured with attributes, and added to the view tree. Compose eliminates this cost entirely because composable functions are regular Kotlin functions β€” no reflection, no XML parsing.

If you’re stuck with XML layouts, reduce inflation cost by flattening your hierarchy. Replace nested LinearLayout chains with ConstraintLayout. Use ViewStub for views that aren’t immediately visible β€” a ViewStub costs almost nothing to inflate because it’s a zero-size view that inflates its actual layout only when made visible. Merge tags eliminate redundant ViewGroup layers when including layouts.

// ViewStub β€” defer inflation of non-essential UI
class ProfileActivity : AppCompatActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_profile)

        // Main profile info loads immediately
        displayBasicProfile(user)

        // Error view only inflates if needed
        if (hasError) {
            val errorStub = findViewById<ViewStub>(R.id.error_stub)
            val errorView = errorStub.inflate()
            errorView.findViewById<TextView>(R.id.error_message).text = errorMsg
        }
    }
}
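The matching ViewStub declaration in activity_profile.xml would look roughly like this (IDs and layout names follow the code above and are illustrative):

// res/layout/activity_profile.xml (excerpt)
// <ViewStub
//     android:id="@+id/error_stub"
//     android:layout="@layout/view_error"
//     android:layout_width="match_parent"
//     android:layout_height="wrap_content" />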

Key takeaway: Optimize your theme’s windowBackground to create the illusion of instant launch. Flatten view hierarchies and use ViewStub for deferred inflation. Compose eliminates XML inflation overhead entirely β€” consider migrating critical startup screens first.

Lesson 2.5: Multi-Process and Background Initialization

Some apps run multiple processes β€” a main process for the UI and separate processes for services, ContentProviders, or push notification handling. Each process creation triggers Application.onCreate(), which means all your initialization code runs again in each process. This wastes memory and CPU time because the background process doesn’t need your UI frameworks, and the UI process doesn’t need your sync service dependencies.

The fix is process-aware initialization. Check the current process name in Application.onCreate() and only initialize what that specific process needs. The main process gets the full initialization. Background processes get only the minimum required for their function.

class MyApplication : Application() {

    override fun onCreate() {
        super.onCreate()

        when (getProcessName()) {
            packageName -> initMainProcess()
            "$packageName:sync" -> initSyncProcess()
            "$packageName:notification" -> initNotificationProcess()
            else -> initMinimal()
        }
    }

    private fun initMainProcess() {
        // Full initialization for UI process
        initCrashReporting()
        initDependencyInjection()
        initAnalytics()
        initImageLoading()
    }

    private fun initSyncProcess() {
        // Minimal initialization for sync worker
        initCrashReporting()
        initDatabase()
        initNetworking()
    }

    private fun initNotificationProcess() {
        // Notification handling only
        initCrashReporting()
        initNotificationChannels()
    }

    private fun initMinimal() {
        initCrashReporting()
    }

    private fun getProcessName(): String {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.P) {
            return Application.getProcessName()
        }
        val pid = android.os.Process.myPid()
        val manager = getSystemService(ACTIVITY_SERVICE) as ActivityManager
        return manager.runningAppProcesses
            ?.firstOrNull { it.pid == pid }
            ?.processName ?: packageName
    }
}

For the main process itself, you can parallelize independent initialization tasks using coroutines. Instead of initializing database, then networking, then analytics sequentially (total time = sum of all), launch them concurrently and wait for all to complete (total time = longest individual task). This can cut startup time dramatically when you have multiple independent initializations each taking 50-200ms.

class MyApplication : Application() {

    override fun onCreate() {
        super.onCreate()

        val startupScope = CoroutineScope(
            Dispatchers.Default + SupervisorJob()
        )

        // Launch independent initializations in parallel
        startupScope.launch {
            val dbJob = async { initDatabase() }
            val networkJob = async { initNetworking() }
            val analyticsJob = async { initAnalytics() }
            val imageJob = async { initImageLoading() }

            // Wait for critical path only
            dbJob.await()
            networkJob.await()

            // Non-critical can complete whenever
            // analyticsJob and imageJob continue in background
        }
    }
}

Key takeaway: Process-aware initialization prevents wasted work in multi-process apps. Parallelize independent initializations with coroutines to reduce total startup time from the sum of all tasks to the duration of the longest single task.

Lesson 2.6: Measuring and Monitoring Startup in Production

Lab measurements (profiler, benchmarks) tell you how your app performs in controlled conditions. Production measurements tell you how it performs for real users on real devices with real network conditions. The gap between lab and production can be enormous β€” a Pixel 7 Pro on Wi-Fi is nothing like a Samsung Galaxy A03 on a spotty 3G connection. You need both types of measurement.

Android provides the reportFullyDrawn() API to mark when your app is truly ready for interaction. The system measures TTID automatically (time to first frame), but TTFD (time to full display) requires your explicit signal. Call reportFullyDrawn() when your primary content is loaded and the user can interact with the app. This metric appears in Android Vitals on the Play Console.

class MainActivity : ComponentActivity() {

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        // Track startup phases with custom trace points
        val startupTracer = StartupTracer()

        setContent {
            val viewModel: MainViewModel = viewModel()
            val uiState by viewModel.uiState.collectAsStateWithLifecycle()

            LaunchedEffect(uiState) {
                when (uiState) {
                    is UiState.Ready -> {
                        startupTracer.markFullyDrawn()
                        reportFullyDrawn()
                    }
                    else -> {}
                }
            }

            MyApp(uiState)
        }
    }
}

class StartupTracer {
    private val processStartTime = getProcessStartTime()
    private val phases = mutableMapOf<String, Long>()

    fun markPhase(name: String) {
        phases[name] = SystemClock.elapsedRealtime()
    }

    fun markFullyDrawn() {
        val totalTime = SystemClock.elapsedRealtime() - processStartTime
        Log.i(TAG, "Startup completed in ${totalTime}ms")
        phases.forEach { (name, time) ->
            Log.i(TAG, "  Phase $name: ${time - processStartTime}ms")
        }
        // Report to your analytics backend
        Analytics.trackStartup(totalTime, phases)
    }

    private fun getProcessStartTime(): Long {
        return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.N) {
            // Elapsed-realtime timestamp at which this process was forked
            android.os.Process.getStartElapsedRealtime()
        } else {
            // Best effort on older devices: approximate with "now"
            SystemClock.elapsedRealtime()
        }
    }

    companion object {
        private const val TAG = "StartupTracer"
    }
}

Android Vitals on the Google Play Console provides aggregated startup metrics across your entire user base, broken down by device model, Android version, and country. The metrics include cold start time, warm start time, and the percentage of starts that exceed the β€œexcessive” threshold (5 seconds for cold start, 2 seconds for warm start). If your app exceeds these thresholds for a significant percentage of users, your Play Store listing may be demoted.

Set up automated alerting on startup metrics. If P95 cold start time increases by more than 100ms between releases, trigger an investigation. Often the culprit is a new library initialization, a new ContentProvider from a dependency, or a database migration that runs on the main thread during startup.

Key takeaway: Measure startup in production using reportFullyDrawn(), Android Vitals, and custom analytics. Lab measurements on high-end devices don’t represent your user base. Monitor P50, P90, and P95 startup times and alert on regressions.

Quiz: App Startup Optimization

What is the difference between TTID and TTFD?

  • ❌ TTID is faster than TTFD because it measures network time
  • βœ… TTID measures time to first frame, TTFD measures time until the app is fully interactive
  • ❌ TTID measures warm start, TTFD measures cold start
  • ❌ They measure the same thing on different devices

TTID (Time to Initial Display) is measured automatically and represents when the first frame is drawn. TTFD (Time to Full Display) requires calling reportFullyDrawn() and represents when your content is loaded and the user can interact with the app.

What thread safety mode should you use for lazy when the property is only accessed from one thread?

  • ❌ SYNCHRONIZED
  • βœ… NONE
  • ❌ PUBLICATION
  • ❌ Thread safety mode doesn’t matter

LazyThreadSafetyMode.NONE skips synchronization overhead. The default SYNCHRONIZED mode adds locking, which is unnecessary when you know the property will only be accessed from a single thread (like the main thread).

What is the benefit of the App Startup library?

  • ❌ It makes your app start faster by skipping initialization
  • βœ… It consolidates multiple ContentProvider initializations into a single ContentProvider
  • ❌ It removes the splash screen
  • ❌ It pre-loads activities

Many libraries use ContentProviders for automatic initialization, each adding startup overhead. App Startup consolidates these into a single ContentProvider with explicit dependency ordering, reducing the number of ContentProvider instantiations.

Why is parallel initialization effective for startup?

  • ❌ It uses less memory
  • ❌ It makes each initialization faster
  • βœ… Total time equals the longest task instead of the sum of all tasks
  • ❌ It skips unnecessary initializations

When independent initializations run concurrently, the total time is limited by the slowest task rather than the sum of all tasks. Three 100ms initializations running in parallel complete in 100ms, not 300ms.

Coding Challenge: Startup Optimizer

Create a startup initialization framework that supports lazy loading, parallel execution, dependency ordering, and timing measurement.

// Challenge: Build a StartupInitializer that:
// 1. Registers initialization tasks with dependencies
// 2. Runs independent tasks in parallel
// 3. Respects dependency ordering
// 4. Measures and reports timing for each task
// 5. Supports lazy (on-demand) vs eager initialization

Solution

import android.os.SystemClock
import android.util.Log
import java.util.concurrent.ConcurrentHashMap
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Deferred
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.async

class StartupInitializer(
    private val scope: CoroutineScope = CoroutineScope(Dispatchers.Default + SupervisorJob())
) {
    private val tasks = mutableMapOf<String, InitTask>()
    private val results = ConcurrentHashMap<String, Deferred<Any?>>()
    private val timings = ConcurrentHashMap<String, Long>()

    data class InitTask(
        val name: String,
        val dependencies: List<String> = emptyList(),
        val isLazy: Boolean = false,
        val block: suspend () -> Any?
    )

    fun register(
        name: String,
        dependencies: List<String> = emptyList(),
        lazy: Boolean = false,
        block: suspend () -> Any?
    ) {
        tasks[name] = InitTask(name, dependencies, lazy, block)
    }

    suspend fun initialize(): Map<String, Long> {
        val eagerTasks = tasks.filter { !it.value.isLazy }

        eagerTasks.forEach { (name, _) ->
            ensureStarted(name)
        }

        // Wait for all eager tasks
        eagerTasks.forEach { (name, _) ->
            results[name]?.await()
        }

        return timings.toMap()
    }

    private fun ensureStarted(name: String): Deferred<Any?> {
        return results.getOrPut(name) {
            val task = tasks[name]
                ?: throw IllegalArgumentException("Unknown task: $name")

            scope.async {
                // Wait for dependencies first. Note: no cycle detection,
                // so a dependency cycle would deadlock here.
                task.dependencies.forEach { dep ->
                    ensureStarted(dep).await()
                }

                val startTime = SystemClock.elapsedRealtime()
                val result = task.block()
                timings[name] = SystemClock.elapsedRealtime() - startTime
                Log.d("Startup", "$name completed in ${timings[name]}ms")
                result
            }
        }
    }

    @Suppress("UNCHECKED_CAST")
    suspend fun <T> get(name: String): T {
        val deferred = ensureStarted(name)
        return deferred.await() as T
    }

    fun reportTimings(): String = buildString {
        appendLine("=== Startup Timing Report ===")
        timings.entries.sortedBy { it.value }.forEach { (name, time) ->
            appendLine("  $name: ${time}ms")
        }
        appendLine("  Total wall time: ${timings.values.maxOrNull() ?: 0}ms")
        appendLine("  Total CPU time: ${timings.values.sum()}ms")
    }
}

// Usage
class MyApplication : Application() {
    val initializer = StartupInitializer()

    override fun onCreate() {
        super.onCreate()

        initializer.register("crashReporting") {
            CrashReporter.init(this@MyApplication)
        }

        initializer.register("database") {
            Room.databaseBuilder(this@MyApplication, AppDb::class.java, "app.db").build()
        }

        initializer.register("networking", dependencies = listOf("crashReporting")) {
            OkHttpClient.Builder().build()
        }

        initializer.register("analytics", lazy = true) {
            Analytics.init(this@MyApplication)
        }

        CoroutineScope(Dispatchers.Main.immediate).launch {
            initializer.initialize()
            Log.i("Startup", initializer.reportTimings())
        }
    }
}

Module 3: Memory Management

Memory is the silent killer of Android apps. Unlike startup time where users notice immediately, memory problems accumulate invisibly until the app crashes with an OutOfMemoryError, starts stuttering from excessive garbage collection, or gets killed by the system’s low-memory killer. Understanding how Android manages memory β€” and how your code interacts with the garbage collector β€” is essential for building apps that stay stable under sustained use.

Lesson 3.1: Android Memory Model and Garbage Collection

Every Android app runs in its own instance of the Android Runtime (ART; Dalvik on devices before Android 5.0) with a limited heap. The heap size varies by device β€” low-end devices may allow 64MB, flagships may allow 512MB or more. When your app’s memory usage approaches the heap limit, the garbage collector runs more frequently, causing pauses that manifest as dropped frames and UI jank. If you exceed the limit, the system throws OutOfMemoryError and your app crashes.

ART uses a generational garbage collector that divides the heap into regions. Young generation holds recently allocated objects β€” most objects die young, so collecting this region is fast. Old generation holds objects that survived multiple GC cycles β€” collecting this region is more expensive. The GC must pause all threads (stop-the-world) to safely identify live objects, though modern ART minimizes these pauses through concurrent collection.
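You can query the device’s per-app heap limit at runtime and size caches accordingly instead of hard-coding values tuned for a flagship (a small sketch):

import android.app.ActivityManager
import android.content.Context

fun computeImageCacheSizeBytes(context: Context): Int {
    val am = context.getSystemService(Context.ACTIVITY_SERVICE) as ActivityManager
    // memoryClass is this device's per-app heap limit in MB
    val heapLimitMb = am.memoryClass
    // Be extra conservative on Android Go / low-RAM devices
    val fraction = if (am.isLowRamDevice) 16 else 8
    return heapLimitMb * 1024 * 1024 / fraction
}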

// Monitor memory usage in your app
fun logMemoryState() {
    val runtime = Runtime.getRuntime()
    val maxMemory = runtime.maxMemory() / 1024 / 1024 // MB
    val totalMemory = runtime.totalMemory() / 1024 / 1024
    val freeMemory = runtime.freeMemory() / 1024 / 1024
    val usedMemory = totalMemory - freeMemory

    Log.d("Memory", buildString {
        appendLine("Max heap: ${maxMemory}MB")
        appendLine("Allocated: ${totalMemory}MB")
        appendLine("Used: ${usedMemory}MB")
        appendLine("Free: ${freeMemory}MB")
        appendLine("Usage: ${usedMemory * 100 / maxMemory}%")
    })
}

// React to system memory pressure
class MyApplication : Application() {
    override fun onTrimMemory(level: Int) {
        super.onTrimMemory(level)
        when (level) {
            TRIM_MEMORY_RUNNING_LOW -> {
                // App is running but system is low on memory
                imageCache.trimToSize(imageCache.size() / 2)
            }
            TRIM_MEMORY_RUNNING_CRITICAL -> {
                // System is critically low β€” release everything possible
                imageCache.evictAll()
                dataCache.clear()
            }
            TRIM_MEMORY_UI_HIDDEN -> {
                // App went to background β€” release UI resources
                imageCache.trimToSize(imageCache.size() / 4)
            }
        }
    }
}

GC pauses directly impact performance. Each GC cycle takes 2-50ms depending on heap size and the amount of live data. If GC runs during a frame’s 16.67ms budget, you drop that frame. Frequent allocations in hot paths β€” scroll handlers, animation callbacks, draw methods β€” trigger frequent GC cycles. The solution is reducing allocations, not avoiding GC. You can’t disable the garbage collector, but you can minimize its workload.
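The classic hot path is a custom view’s onDraw(), which runs on every frame. A minimal sketch of the reuse pattern:

import android.content.Context
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import android.graphics.RectF
import android.view.View

class BarChartView(context: Context) : View(context) {

    // Allocated once as fields; allocating these inside onDraw()
    // would create garbage on every frame
    private val barPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply { color = Color.BLUE }
    private val barRect = RectF()

    var values: FloatArray = FloatArray(0)

    override fun onDraw(canvas: Canvas) {
        if (values.isEmpty()) return
        val barWidth = width.toFloat() / values.size
        for (i in values.indices) {
            // set() mutates the existing RectF instead of allocating a new one
            barRect.set(i * barWidth, height - values[i], (i + 1) * barWidth, height.toFloat())
            canvas.drawRect(barRect, barPaint)
        }
    }
}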

The Memory Profiler in Android Studio shows real-time heap usage, allocation tracking, and GC events. The jagged sawtooth pattern in the memory graph is normal β€” memory grows as objects are allocated, then drops when GC collects unused objects. Warning signs include: the sawtooth peaks getting progressively higher (memory leak), GC events happening every few seconds (excessive allocation), or memory usage approaching the heap limit.

Key takeaway: ART uses generational garbage collection with stop-the-world pauses. Reduce GC pressure by minimizing allocations in hot paths β€” scroll handlers, animation callbacks, and draw methods. Monitor memory with the Memory Profiler and respond to onTrimMemory() callbacks.

Lesson 3.2: Memory Leaks β€” Detection and Prevention

A memory leak occurs when objects that are no longer needed continue to be referenced, preventing the garbage collector from reclaiming their memory. The most common Android memory leak is holding a reference to an Activity after it’s been destroyed. Since Activities reference their entire view hierarchy, a leaked Activity can retain megabytes of memory.

The classic leak patterns are well-known. Static references to Activities or Views survive configuration changes and process lifecycle. Inner classes (including anonymous classes) hold an implicit reference to their outer class β€” if the outer class is an Activity and the inner class outlives it (posted as a Runnable, stored in a singleton), the Activity leaks. Handler references are particularly dangerous because messages in the queue hold a reference to the Handler, which holds a reference to the Activity.

// ❌ Memory leak β€” anonymous Runnable holds reference to Activity
class LeakyActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        // This Runnable holds an implicit reference to LeakyActivity
        // If the Activity is destroyed before 10 seconds, it leaks
        Handler(Looper.getMainLooper()).postDelayed({
            updateUI() // 'this' reference to Activity
        }, 10_000)
    }
}

// βœ… Fixed β€” use lifecycleScope which cancels with Activity
class SafeActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

        lifecycleScope.launch {
            delay(10_000)
            updateUI() // Automatically cancelled if Activity is destroyed
        }
    }
}

// ❌ Memory leak β€” singleton holds Activity reference
object ImageLoader {
    private var context: Context? = null // Holds Activity context!

    fun init(context: Context) {
        this.context = context // If Activity context, it leaks
    }
}

// βœ… Fixed β€” use Application context for singletons
object ImageLoader {
    private var context: Context? = null

    fun init(context: Context) {
        this.context = context.applicationContext // Never leaks
    }
}

LeakCanary is the standard tool for detecting memory leaks in development. It automatically monitors Activities, Fragments, Views, ViewModels, and Services for leaks. When it detects a leak, it captures a heap dump, analyzes the reference chain, and shows you exactly which object is holding the reference that prevents garbage collection. Install it as a debug-only dependency so it never ships to production.

// build.gradle.kts
dependencies {
    debugImplementation("com.squareup.leakcanary:leakcanary-android:2.14")
}

// LeakCanary starts automatically β€” no code needed
// It watches:
// - Activities after onDestroy()
// - Fragments after onDestroyView() and onDestroy()
// - ViewModels after onCleared()
// - Views after they're removed from the window

// Custom object watching β€” for objects you know should be GC'd
class SessionManager {
    private var session: Session? = null // Session: your app's session type

    fun logout() {
        val currentSession = session ?: return
        session = null

        // Tell LeakCanary to watch this object
        AppWatcher.objectWatcher.expectWeaklyReachable(
            currentSession,
            "Session should be GC'd after logout"
        )
    }
}

Beyond LeakCanary, the Memory Profiler’s heap dump analysis shows all objects on the heap organized by class. Look for objects with unexpectedly high retained sizes β€” the retained size includes all objects that would be GC’d if this object were collected. A single Activity with a 50MB retained size means 50MB of memory can’t be freed because of that one reference.

Key takeaway: The most common Android memory leak is holding a reference to an Activity after destruction. Use lifecycleScope instead of Handlers for delayed work. Use Application context in singletons. Install LeakCanary in debug builds to catch leaks automatically.

Lesson 3.3: Bitmap and Image Memory

Bitmaps are the largest memory consumers in most Android apps. A single 1080x1920 image in ARGB_8888 format (the default) consumes 1080 Γ— 1920 Γ— 4 bytes = 8.3MB of memory. Load a handful of these β€” a photo gallery, a social media feed β€” and you’ve consumed a significant portion of your heap. Understanding bitmap memory is essential for any image-heavy app.

The first optimization is loading bitmaps at the size you need, not the size they were saved at. If you’re displaying an image in a 200x200dp ImageView, loading a 4000x3000 photo at full resolution wastes over 95% of the memory. Use BitmapFactory.Options with inSampleSize to downsample during decode. inSampleSize should be a power of 2 β€” the decoder rounds any other value down to the nearest power of 2 β€” and it tells the decoder to skip pixels, making decoding faster and the decoded bitmap smaller.

// Calculate the optimal sample size for a target view size
fun decodeSampledBitmap(
    resources: Resources,
    resId: Int,
    reqWidth: Int,
    reqHeight: Int
): Bitmap {
    // First, decode bounds only (no memory allocation)
    val options = BitmapFactory.Options().apply {
        inJustDecodeBounds = true
    }
    BitmapFactory.decodeResource(resources, resId, options)

    // Calculate inSampleSize
    options.apply {
        inSampleSize = calculateInSampleSize(this, reqWidth, reqHeight)
        inJustDecodeBounds = false
        inPreferredConfig = Bitmap.Config.RGB_565 // Half memory if no alpha
    }

    return BitmapFactory.decodeResource(resources, resId, options)
}

fun calculateInSampleSize(
    options: BitmapFactory.Options,
    reqWidth: Int,
    reqHeight: Int
): Int {
    val (height, width) = options.outHeight to options.outWidth
    var inSampleSize = 1

    if (height > reqHeight || width > reqWidth) {
        val halfHeight = height / 2
        val halfWidth = width / 2

        while (halfHeight / inSampleSize >= reqHeight &&
               halfWidth / inSampleSize >= reqWidth) {
            inSampleSize *= 2
        }
    }

    return inSampleSize
}

Bitmap configurations control the bytes per pixel. ARGB_8888 uses 4 bytes per pixel and supports full transparency. RGB_565 uses 2 bytes per pixel (half the memory) but has no alpha channel and reduced color depth. For photos and images without transparency, RGB_565 is often sufficient and halves memory usage. HARDWARE bitmaps (API 26+) store pixel data in graphics memory instead of the app’s Java heap, completely eliminating heap pressure for bitmaps that are only rendered and never read pixel-by-pixel.
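
As a sketch of opting into the HARDWARE config (decodeForDisplay is an illustrative helper, not a framework API):

// Decode a render-only image as a HARDWARE bitmap on API 26+,
// falling back to the default config on older devices
fun decodeForDisplay(path: String): Bitmap? {
    val options = BitmapFactory.Options()
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
        // Pixels live in graphics memory; the bitmap is immutable and
        // can't be read back, so use it for display only
        options.inPreferredConfig = Bitmap.Config.HARDWARE
    }
    return BitmapFactory.decodeFile(path, options)
}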

Image loading libraries like Coil, Glide, and Picasso handle all of these optimizations automatically. They calculate the target view size, downsample the image, cache decoded bitmaps in memory and on disk, and recycle unused bitmaps. Never load bitmaps manually unless you have a specific reason β€” the libraries handle edge cases (orientation EXIF data, animated GIFs, progressive loading) that are easy to get wrong.

// Coil β€” modern Kotlin-first image loading
// In Compose
@Composable
fun UserAvatar(imageUrl: String) {
    AsyncImage(
        model = ImageRequest.Builder(LocalContext.current)
            .data(imageUrl)
            .crossfade(true)
            .size(Size(200, 200)) // Request exact size
            .memoryCachePolicy(CachePolicy.ENABLED)
            .diskCachePolicy(CachePolicy.ENABLED)
            .build(),
        contentDescription = "User avatar",
        modifier = Modifier
            .size(48.dp)
            .clip(CircleShape),
        contentScale = ContentScale.Crop
    )
}

// Coil with custom memory cache size
class MyApplication : Application(), ImageLoaderFactory {
    override fun newImageLoader(): ImageLoader {
        return ImageLoader.Builder(this)
            .memoryCache {
                MemoryCache.Builder(this)
                    .maxSizePercent(0.20) // 20% of available memory
                    .build()
            }
            .diskCache {
                DiskCache.Builder()
                    .directory(cacheDir.resolve("image_cache"))
                    .maxSizeBytes(100L * 1024 * 1024) // 100MB
                    .build()
            }
            .build()
    }
}

Key takeaway: Always load bitmaps at the size you need, never at full resolution. Use RGB_565 for images without transparency and HARDWARE bitmaps for render-only images. Use Coil or Glide β€” they handle downsampling, caching, and memory management automatically.

Lesson 3.4: Object Allocation and Pooling

Every object allocation has a cost. The allocation itself is fast β€” ART uses a bump pointer allocator in the young generation, which is essentially incrementing a pointer. But every allocation creates work for the garbage collector later. In hot paths that execute thousands of times per second β€” scroll callbacks, touch event handlers, custom drawing code β€” even cheap allocations add up to significant GC pressure.

The key insight is that allocations in tight loops are the enemy, not allocations in general. Creating a UserProfile object when navigating to a profile screen is fine. Creating a Rect object inside onDraw() that executes 60 times per second is a problem β€” that’s 60 Rect allocations per second, 3,600 per minute, each becoming garbage immediately.

// ❌ Allocations inside onDraw β€” creates GC pressure
class CustomView(context: Context) : View(context) {
    override fun onDraw(canvas: Canvas) {
        val paint = Paint()  // New allocation every frame
        val rect = RectF()   // New allocation every frame
        val path = Path()    // New allocation every frame

        paint.color = Color.BLUE
        rect.set(0f, 0f, width.toFloat(), height.toFloat())
        canvas.drawRect(rect, paint)
    }
}

// βœ… Pre-allocate and reuse objects
class CustomView(context: Context) : View(context) {
    private val paint = Paint().apply { color = Color.BLUE }
    private val rect = RectF()
    private val path = Path()

    override fun onDraw(canvas: Canvas) {
        rect.set(0f, 0f, width.toFloat(), height.toFloat())
        path.reset() // Reuse by resetting
        canvas.drawRect(rect, paint)
    }
}

Object pooling takes reuse further by maintaining a pool of pre-created objects. Instead of allocating and garbage collecting objects, you borrow from the pool and return when done. Android itself uses this pattern extensively β€” Message.obtain() pulls from a pool of Message objects, MotionEvent objects are pooled, and ViewHolder in RecyclerView is a form of object pooling.

// Generic object pool implementation
class ObjectPool<T>(
    private val maxSize: Int = 20,
    private val factory: () -> T,
    private val reset: (T) -> Unit = {}
) {
    private val pool = ArrayDeque<T>(maxSize)

    fun obtain(): T {
        return synchronized(pool) {
            pool.removeLastOrNull()
        } ?: factory()
    }

    fun recycle(obj: T) {
        reset(obj)
        synchronized(pool) {
            if (pool.size < maxSize) {
                pool.addLast(obj)
            }
        }
    }

    inline fun <R> use(block: (T) -> R): R {
        val obj = obtain()
        try {
            return block(obj)
        } finally {
            recycle(obj)
        }
    }
}

// Usage β€” pool of Rect objects for drawing
data class Bar(val left: Float, val top: Float, val right: Float, val bottom: Float)

class ChartView(context: Context) : View(context) {
    var chartData: List<Bar> = emptyList()
    private val barPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply { color = Color.BLUE }

    private val rectPool = ObjectPool(
        maxSize = 50,
        factory = { RectF() },
        reset = { it.setEmpty() }
    )

    override fun onDraw(canvas: Canvas) {
        for (bar in chartData) {
            rectPool.use { rect ->
                rect.set(bar.left, bar.top, bar.right, bar.bottom)
                canvas.drawRect(rect, barPaint)
            }
        }
    }
}

Kotlin-specific allocation traps include lambda captures (each lambda that captures variables creates an anonymous class instance), string templates in loops (each creates a new String), and iterator objects from for loops over non-array collections. Use forEach on arrays (which compiles to an indexed loop), buildString for string concatenation, and inline functions to eliminate lambda allocation overhead.
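
A short sketch of these traps and their fixes (buildReport is a hypothetical helper):

// ❌ "report += line" in a loop allocates a new String per iteration
// βœ… buildString reuses a single StringBuilder under the hood
fun buildReport(values: IntArray): String = buildString {
    // forEach on an array is an inline function, so it compiles to an
    // indexed loop with no iterator allocation
    values.forEach { value ->
        append("value=").append(value).append('\n')
    }
}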

Key takeaway: Pre-allocate objects used in hot paths like onDraw(), scroll callbacks, and animation updates. Use object pooling for frequently created and discarded objects. Watch for hidden allocations from lambdas, string templates, and iterators in tight loops.

Lesson 3.5: WeakReferences, SoftReferences, and Caching

Java and Kotlin provide reference types beyond the default strong reference that give you control over how the garbage collector treats your objects. Understanding these reference types is essential for building memory-efficient caches and avoiding leaks.

A strong reference (the default) prevents the referenced object from being garbage collected. As long as any strong reference chain exists from a GC root to an object, that object stays in memory. A WeakReference does not prevent garbage collection β€” the referenced object can be collected at any time during any GC cycle. A SoftReference is similar but the GC only collects soft-referenced objects when memory is actually needed β€” it tries to keep them around as long as possible.

// WeakReference β€” useful for breaking reference cycles and caches
// that shouldn't prevent GC
class EventBus {
    // Listeners are held weakly β€” if the subscriber is GC'd,
    // the reference becomes null automatically
    private val listeners = mutableListOf<WeakReference<EventListener>>()

    fun subscribe(listener: EventListener) {
        listeners.add(WeakReference(listener))
    }

    fun publish(event: Event) {
        val iterator = listeners.iterator()
        while (iterator.hasNext()) {
            val ref = iterator.next()
            val listener = ref.get()
            if (listener != null) {
                listener.onEvent(event)
            } else {
                iterator.remove() // Clean up dead references
            }
        }
    }
}

// SoftReference β€” useful for memory-sensitive caches
class BitmapCache(private val maxEntries: Int = 100) {
    private val cache = LinkedHashMap<String, SoftReference<Bitmap>>(
        maxEntries, 0.75f, true // accessOrder = true for LRU
    )

    fun put(key: String, bitmap: Bitmap) {
        cache[key] = SoftReference(bitmap)
        if (cache.size > maxEntries) {
            val eldest = cache.entries.first()
            cache.remove(eldest.key)
        }
    }

    fun get(key: String): Bitmap? {
        val ref = cache[key] ?: return null
        val bitmap = ref.get()
        if (bitmap == null) {
            cache.remove(key) // Referent was GC'd
        }
        return bitmap
    }
}

For most caching needs, use LruCache instead of building your own with SoftReferences. LruCache provides a fixed-size cache with least-recently-used eviction. You specify the maximum size (in bytes, count, or any unit), and the cache automatically evicts the least-recently-accessed entries when the limit is reached. This gives you predictable memory usage without relying on GC behavior.

// LruCache β€” predictable memory-bounded cache
class ThumbnailCache : LruCache<String, Bitmap>(calculateMaxSize()) {

    override fun sizeOf(key: String, bitmap: Bitmap): Int {
        return bitmap.byteCount / 1024 // Size in KB, matching the KB-based max size
    }

    override fun entryRemoved(
        evicted: Boolean,
        key: String,
        oldValue: Bitmap,
        newValue: Bitmap?
    ) {
        // Optionally move evicted entries to a disk cache
        // ('diskCache' is assumed to be defined elsewhere)
        if (evicted) {
            diskCache.put(key, oldValue)
        }
    }

    companion object {
        private fun calculateMaxSize(): Int {
            val maxMemoryKb = Runtime.getRuntime().maxMemory() / 1024
            return (maxMemoryKb / 8).toInt() // 1/8th of the heap, in KB
        }
    }
}

The general rule is: use WeakReference when you want to reference an object without preventing its collection (breaking reference cycles, optional caches). Use SoftReference when you want the GC to keep the object as long as possible but allow collection under memory pressure (large caches). Use LruCache when you need predictable, bounded memory usage with automatic eviction.

Key takeaway: Use WeakReference to avoid preventing garbage collection of objects you don’t own. Use LruCache for predictable memory-bounded caching β€” it’s more reliable than SoftReference because eviction behavior is deterministic rather than dependent on GC implementation.

Lesson 3.6: Memory Profiler Deep Dive

The Memory Profiler in Android Studio provides three levels of analysis: real-time monitoring, allocation tracking, and heap dump analysis. Each level provides progressively more detail at the cost of higher overhead. Knowing when to use each level saves debugging time.

Real-time monitoring shows the memory graph β€” total heap usage over time with GC events marked. Look for the sawtooth pattern. Normal behavior shows regular peaks and valleys as objects are allocated and collected. Red flags include peaks that grow over time (memory leak), very frequent GC events (excessive allocation), or sudden large jumps (loading large resources).

Allocation tracking records every object allocation with its call stack. Enable it for a short period (a few seconds) while reproducing the performance issue β€” scrolling a list, navigating between screens. Then examine the allocations sorted by count or size. Look for objects created in unexpected quantities β€” thousands of Rect objects during a scroll, hundreds of String objects during a data load. The call stacks show you exactly which code is creating them.

// Using Debug class for programmatic heap analysis
fun analyzeHeapHealth(): HeapReport {
    // Force GC before analysis to get accurate numbers
    Runtime.getRuntime().gc()
    Thread.sleep(100) // Give GC time to complete

    val runtime = Runtime.getRuntime()
    val nativeHeap = Debug.getNativeHeapSize()
    val nativeAllocated = Debug.getNativeHeapAllocatedSize()

    return HeapReport(
        javaHeapMax = runtime.maxMemory(),
        javaHeapAllocated = runtime.totalMemory(),
        javaHeapUsed = runtime.totalMemory() - runtime.freeMemory(),
        nativeHeapTotal = nativeHeap,
        nativeHeapUsed = nativeAllocated,
        gcCount = Debug.getRuntimeStat("art.gc.gc-count")?.toLongOrNull() ?: 0,
        gcTime = Debug.getRuntimeStat("art.gc.gc-time")?.toLongOrNull() ?: 0
    )
}

data class HeapReport(
    val javaHeapMax: Long,
    val javaHeapAllocated: Long,
    val javaHeapUsed: Long,
    val nativeHeapTotal: Long,
    val nativeHeapUsed: Long,
    val gcCount: Long,
    val gcTime: Long
) {
    val javaUsagePercent: Int
        get() = ((javaHeapUsed * 100) / javaHeapMax).toInt()

    val isUnderPressure: Boolean
        get() = javaUsagePercent > 80

    override fun toString(): String = buildString {
        appendLine("Java Heap: ${javaHeapUsed / 1024 / 1024}MB / ${javaHeapMax / 1024 / 1024}MB (${javaUsagePercent}%)")
        appendLine("Native Heap: ${nativeHeapUsed / 1024 / 1024}MB / ${nativeHeapTotal / 1024 / 1024}MB")
        appendLine("GC Count: $gcCount, Total GC Time: ${gcTime}ms")
        if (isUnderPressure) appendLine("⚠️ Memory pressure detected!")
    }
}

Heap dump analysis is the most powerful tool for finding memory leaks. Capture a heap dump, then examine objects by class. Sort by retained size to find the objects holding the most memory. Click on an instance to see its reference chain β€” the path of references from GC roots to the object. If you see an Activity instance after you’ve navigated away from it, trace the reference chain to find what’s holding it. The chain always reveals the leak source: a static field, a registered listener, a running coroutine, or a cached reference.

When analyzing heap dumps for leaks, compare two dumps: one before performing an action (opening a screen) and one after (closing it and forcing GC). Any objects that exist in the second dump but shouldn’t β€” Activities you navigated away from, Fragments you popped, ViewModels that should have been cleared β€” are potential leaks. The reference chain explains why.
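
In debug builds you can also capture comparable dumps programmatically (captureHeapDump is our helper name; Debug.dumpHprofData is the framework call):

// Capture an .hprof before and after a user flow, then open both
// in the Memory Profiler and diff the object counts
fun captureHeapDump(context: Context, label: String) {
    val file = File(context.filesDir, "heap_$label.hprof")
    Debug.dumpHprofData(file.absolutePath)
}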

Key takeaway: Use real-time monitoring for ongoing health checks, allocation tracking for finding excessive allocations in hot paths, and heap dumps for diagnosing memory leaks. Compare heap dumps before and after user actions to identify leaked objects and trace their reference chains.

Quiz: Memory Management

What is the default Bitmap configuration and how many bytes per pixel does it use?

  • ❌ RGB_565 β€” 2 bytes per pixel
  • βœ… ARGB_8888 β€” 4 bytes per pixel
  • ❌ ARGB_4444 β€” 2 bytes per pixel
  • ❌ HARDWARE β€” 0 bytes per pixel

The default Bitmap.Config.ARGB_8888 uses 4 bytes per pixel β€” one byte each for alpha, red, green, and blue channels. A 1080x1920 image in this format consumes about 8.3MB of heap memory.

What is the most common source of memory leaks in Android?

  • ❌ Native memory allocations
  • ❌ Large bitmap loading
  • βœ… Holding references to Activities after they are destroyed
  • ❌ Using too many Fragments

Activities reference their entire view hierarchy, which can be megabytes of memory. When a reference to a destroyed Activity is held by a singleton, static field, or long-lived coroutine, the entire Activity and its views cannot be garbage collected.

When does the GC collect SoftReference objects?

  • ❌ Immediately, same as WeakReference
  • ❌ Never β€” SoftReferences are permanent
  • βœ… Only when the system is running low on memory
  • ❌ After a fixed timeout period

SoftReferences are collected only when the JVM needs memory. The GC tries to keep soft-referenced objects as long as possible, making them suitable for caches where you want objects to survive as long as memory allows.

Why should you avoid allocating objects inside onDraw()?

  • ❌ It causes compile errors
  • ❌ Canvas doesn’t support object creation
  • βœ… onDraw() runs 60+ times per second, creating excessive GC pressure
  • ❌ Objects created in onDraw() can’t be garbage collected

onDraw() is called every frame β€” 60 or more times per second. Allocating objects in this method creates thousands of short-lived objects per minute, increasing GC frequency and causing jank from GC pauses.

Coding Challenge: Memory-Efficient Image Cache

Build a two-level image cache (memory + disk) with configurable size limits, LRU eviction, and memory pressure handling.

// Challenge: Implement a TwoLevelImageCache that:
// 1. Stores bitmaps in an LruCache (memory) with configurable size
// 2. Falls back to disk cache on memory miss
// 3. Responds to system memory pressure by evicting entries
// 4. Tracks hit/miss statistics
// 5. Supports async loading with coroutines

Solution

class TwoLevelImageCache(
    context: Context,
    memoryCacheSizeBytes: Int = calculateDefaultMemoryCacheSize(),
    private val diskCacheDir: File = File(context.cacheDir, "images"),
    private val diskCacheMaxBytes: Long = 50L * 1024 * 1024
) {
    private var memoryHits = 0L
    private var diskHits = 0L
    private var misses = 0L

    private val memoryCache = object : LruCache<String, Bitmap>(memoryCacheSizeBytes) {
        override fun sizeOf(key: String, value: Bitmap): Int = value.byteCount
    }

    init {
        diskCacheDir.mkdirs()
    }

    fun get(key: String): Bitmap? {
        // Level 1: Memory cache
        memoryCache.get(key)?.let {
            memoryHits++
            return it
        }

        // Level 2: Disk cache
        val diskFile = File(diskCacheDir, key.hashCode().toString())
        if (diskFile.exists()) {
            val bitmap = BitmapFactory.decodeFile(diskFile.absolutePath)
            if (bitmap != null) {
                diskHits++
                memoryCache.put(key, bitmap) // Promote to memory
                return bitmap
            }
        }

        misses++
        return null
    }

    suspend fun getOrLoad(
        key: String,
        loader: suspend () -> Bitmap
    ): Bitmap = withContext(Dispatchers.IO) {
        get(key) ?: loader().also { bitmap -> put(key, bitmap) }
    }

    fun put(key: String, bitmap: Bitmap) {
        memoryCache.put(key, bitmap)

        // Write to disk asynchronously (an injected scope would be
        // preferable to an ad-hoc one in production code)
        CoroutineScope(Dispatchers.IO).launch {
            val diskFile = File(diskCacheDir, key.hashCode().toString())
            diskFile.outputStream().use { out ->
                // PNG is lossless; the quality argument is ignored
                bitmap.compress(Bitmap.CompressFormat.PNG, 100, out)
            }
            trimDiskCache()
        }
    }

    fun onTrimMemory(level: Int) {
        // Branches ordered from most to least severe so each level
        // triggers exactly one action
        when {
            level >= ComponentCallbacks2.TRIM_MEMORY_MODERATE -> {
                memoryCache.evictAll()
            }
            level >= ComponentCallbacks2.TRIM_MEMORY_BACKGROUND -> {
                memoryCache.trimToSize(memoryCache.maxSize() / 2)
            }
            level >= ComponentCallbacks2.TRIM_MEMORY_RUNNING_LOW -> {
                memoryCache.trimToSize(memoryCache.maxSize() / 4)
            }
        }
    }

    private fun trimDiskCache() {
        val files = diskCacheDir.listFiles() ?: return
        val totalSize = files.sumOf { it.length() }
        if (totalSize > diskCacheMaxBytes) {
            files.sortedBy { it.lastModified() }
                .take(files.size / 4)
                .forEach { it.delete() }
        }
    }

    fun getStats(): String = buildString {
        val total = memoryHits + diskHits + misses
        appendLine("Cache Stats (total: $total)")
        appendLine("  Memory hits: $memoryHits (${percent(memoryHits, total)}%)")
        appendLine("  Disk hits: $diskHits (${percent(diskHits, total)}%)")
        appendLine("  Misses: $misses (${percent(misses, total)}%)")
        appendLine("  Memory size: ${memoryCache.size() / 1024}KB / ${memoryCache.maxSize() / 1024}KB")
    }

    private fun percent(value: Long, total: Long): Long =
        if (total > 0) value * 100 / total else 0

    companion object {
        private fun calculateDefaultMemoryCacheSize(): Int {
            // 1/8th of the max heap, in bytes; sizeOf() returns
            // Bitmap.byteCount in bytes, so the units must match
            return (Runtime.getRuntime().maxMemory() / 8).toInt()
        }
    }
}

Module 4: Rendering Performance

Rendering is where performance meets perception. Users don’t measure startup time with a stopwatch β€” but they feel every dropped frame during a scroll, every stutter in an animation, every lag when transitioning between screens. Android targets 60fps (16.67ms per frame) on most devices and 90-120fps on modern flagships, giving you even less time per frame. This module teaches you to understand the rendering pipeline and eliminate every source of jank.

Lesson 4.1: The Android Rendering Pipeline

Understanding the rendering pipeline is essential because different types of jank require different fixes. The pipeline has distinct phases, and each phase has a time budget. If any single phase exceeds its budget, the frame drops.

The pipeline begins when Choreographer signals that a new frame should be produced (VSYNC signal). First, input events are processed β€” touch events, keyboard events. Then animations are updated. Then the traversal phase runs: measure() calculates each View’s size, layout() positions each View within its parent, and draw() renders each View to a display list (a set of GPU commands). Finally, the RenderThread sends the display list to the GPU for actual pixel rendering.

// Monitoring frame timing with FrameMetrics (API 24+)
class PerformanceMonitorActivity : AppCompatActivity() {

    private val frameMetricsListener = Window.OnFrameMetricsAvailableListener {
        _, frameMetrics, _ ->
        val totalDurationMs = frameMetrics.getMetric(
            FrameMetrics.TOTAL_DURATION
        ) / 1_000_000.0 // Convert nanos to ms

        val layoutMeasureDuration = frameMetrics.getMetric(
            FrameMetrics.LAYOUT_MEASURE_DURATION
        ) / 1_000_000.0

        val drawDuration = frameMetrics.getMetric(
            FrameMetrics.DRAW_DURATION
        ) / 1_000_000.0

        val syncDuration = frameMetrics.getMetric(
            FrameMetrics.SYNC_DURATION
        ) / 1_000_000.0

        if (totalDurationMs > 16.67) {
            Log.w("FrameMetrics", buildString {
                append("Slow frame: ${totalDurationMs.format(2)}ms")
                append(" | Layout: ${layoutMeasureDuration.format(2)}ms")
                append(" | Draw: ${drawDuration.format(2)}ms")
                append(" | Sync: ${syncDuration.format(2)}ms")
            })
        }
    }

    override fun onResume() {
        super.onResume()
        window.addOnFrameMetricsAvailableListener(
            frameMetricsListener,
            Handler(Looper.getMainLooper())
        )
    }

    override fun onPause() {
        super.onPause()
        window.removeOnFrameMetricsAvailableListener(frameMetricsListener)
    }

    private fun Double.format(digits: Int) = "%.${digits}f".format(this)
}

The most common rendering bottleneck is the layout/measure phase. Deep view hierarchies cause exponential measurement cost because some ViewGroups (like RelativeLayout) measure their children multiple times. Each level of nesting multiplies the number of measure passes. A RelativeLayout inside a RelativeLayout inside a LinearLayout with weights can trigger 8+ measure passes for leaf views. This is why flat layouts using ConstraintLayout perform dramatically better.

Overdraw is the second major rendering cost. Overdraw occurs when the same pixel is drawn multiple times in a single frame. Drawing a white background, then a card with a white background on top of it, then text on the card means some pixels are drawn three times. Each overdraw wastes GPU fillrate. Enable Developer Options β†’ Debug GPU Overdraw to visualize overdraw β€” blue means 1x overdraw (acceptable), green means 2x, light red means 3x, and dark red means 4x+ (problematic).

// ❌ Unnecessary overdraw β€” window background + layout background
// <FrameLayout android:background="@color/white">
//     <CardView android:background="@color/white">
//         <TextView android:text="Hello" />
//     </CardView>
// </FrameLayout>

// βœ… Remove redundant backgrounds
// Set window background to match content and remove child backgrounds
// In theme: <item name="android:windowBackground">@color/white</item>
// <FrameLayout> <!-- No background needed, inherits window -->
//     <CardView> <!-- CardView has its own elevation drawing -->
//         <TextView android:text="Hello" />
//     </CardView>
// </FrameLayout>

// Programmatically remove window background when not needed
class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        // Remove window background if your layout covers full screen
        window.setBackgroundDrawable(null)
    }
}

Key takeaway: The rendering pipeline processes input, animations, measure, layout, draw, and GPU sync β€” each on a 16.67ms budget. Flatten view hierarchies to reduce measure passes and eliminate overdraw by removing redundant backgrounds. Monitor with FrameMetrics API and GPU overdraw visualization.

Lesson 4.2: Layout Optimization

Layout performance is determined by two factors: hierarchy depth and measurement complexity. Every View in your hierarchy must be measured and laid out every frame (if anything changes). The cost grows with depth because parent ViewGroups often measure children multiple times, and each level multiplies that cost.

ConstraintLayout was designed specifically to solve the nested layout problem. It achieves flat hierarchies by allowing you to express complex relationships between views using constraints instead of nesting. A layout that requires 4 levels of LinearLayout nesting can often be expressed as a single ConstraintLayout. The measurement algorithm is O(n) in the number of constraints, not exponential like nested weighted LinearLayout.

// ❌ Deep nesting β€” exponential measure cost
// <LinearLayout orientation="vertical">
//     <LinearLayout orientation="horizontal">
//         <ImageView />
//         <LinearLayout orientation="vertical">
//             <TextView /> <!-- Name -->
//             <LinearLayout orientation="horizontal">
//                 <TextView /> <!-- Date -->
//                 <TextView /> <!-- Time -->
//             </LinearLayout>
//         </LinearLayout>
//     </LinearLayout>
// </LinearLayout>

// βœ… Flat with ConstraintLayout β€” linear measure cost
// <ConstraintLayout>
//     <ImageView android:id="@+id/image"
//         app:layout_constraintStart_toStartOf="parent"
//         app:layout_constraintTop_toTopOf="parent" />
//     <TextView android:id="@+id/name"
//         app:layout_constraintStart_toEndOf="@id/image"
//         app:layout_constraintTop_toTopOf="parent" />
//     <TextView android:id="@+id/date"
//         app:layout_constraintStart_toEndOf="@id/image"
//         app:layout_constraintTop_toBottomOf="@id/name" />
//     <TextView android:id="@+id/time"
//         app:layout_constraintStart_toEndOf="@id/date"
//         app:layout_constraintTop_toBottomOf="@id/name" />
// </ConstraintLayout>

// Use Layout Inspector to analyze hierarchy in real-time
// Android Studio β†’ Layout Inspector β†’ Select running process
// Examine: view count, hierarchy depth, measure/layout time per view

ViewStub is an invisible, zero-sized View that lazily inflates a layout when you need it. Use it for error states, empty states, detail sections, and any UI that isn’t visible on initial display. The ViewStub itself costs almost nothing β€” it has no drawing code, no measurement cost, and minimal memory footprint. When you call inflate() or set its visibility to VISIBLE, it inflates the specified layout and replaces itself in the parent.

merge tags eliminate redundant ViewGroup layers when including layouts. Without merge, including a layout file adds the root ViewGroup of the included file as an extra layer. With merge, the children are added directly to the including parent, eliminating one level of nesting.
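
For illustration, a hypothetical profile_header layout in the same commented-XML style as above (file and view names are placeholders):

// ❌ Included file with its own root: <include> adds an extra layer
// <FrameLayout>
//     <ImageView android:id="@+id/avatar" />
//     <TextView android:id="@+id/username" />
// </FrameLayout>

// βœ… Same file with merge: children attach directly to the ViewGroup
// that contains the <include>, removing one nesting level
// <merge xmlns:android="http://schemas.android.com/apk/res/android">
//     <ImageView android:id="@+id/avatar" />
//     <TextView android:id="@+id/username" />
// </merge>
//
// <include layout="@layout/profile_header" />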

// ViewStub usage for deferred layouts
class ProductDetailActivity : AppCompatActivity() {

    private var reviewsInflated = false
    // Scroll offset (px) where the reviews section begins; tune per layout
    private val reviewsThreshold = 1200

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_product_detail)

        // Product info loads immediately β€” this is the critical path
        // ('product' is assumed to be loaded before this point)
        displayProductInfo(product)

        // Reviews section inflates only when user scrolls to it
        val scrollView = findViewById<NestedScrollView>(R.id.scroll_view)
        scrollView.setOnScrollChangeListener { _, _, scrollY, _, _ ->
            if (!reviewsInflated && scrollY > reviewsThreshold) {
                val stub = findViewById<ViewStub>(R.id.reviews_stub)
                val reviewsView = stub.inflate()
                populateReviews(reviewsView)
                reviewsInflated = true
            }
        }
    }
}

For performance-critical custom layouts, consider writing a custom ViewGroup. The default onMeasure() and onLayout() implementations in standard ViewGroups are general-purpose. A custom ViewGroup that knows its exact layout rules can measure and position children in a single pass with minimal allocation.
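
As a sketch, a minimal single-pass vertical stack (SimpleColumn is an illustrative class; it ignores padding and margins for brevity):

class SimpleColumn(context: Context, attrs: AttributeSet? = null) :
    ViewGroup(context, attrs) {

    override fun onMeasure(widthMeasureSpec: Int, heightMeasureSpec: Int) {
        var totalHeight = 0
        var maxWidth = 0
        for (i in 0 until childCount) {
            val child = getChildAt(i)
            // Exactly one measure call per child: no double taxation
            measureChild(child, widthMeasureSpec, heightMeasureSpec)
            totalHeight += child.measuredHeight
            maxWidth = maxOf(maxWidth, child.measuredWidth)
        }
        setMeasuredDimension(
            resolveSize(maxWidth, widthMeasureSpec),
            resolveSize(totalHeight, heightMeasureSpec)
        )
    }

    override fun onLayout(changed: Boolean, l: Int, t: Int, r: Int, b: Int) {
        var top = 0
        for (i in 0 until childCount) {
            val child = getChildAt(i)
            child.layout(0, top, child.measuredWidth, top + child.measuredHeight)
            top += child.measuredHeight
        }
    }
}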

Key takeaway: Use ConstraintLayout for flat hierarchies with O(n) measurement. Use ViewStub to defer inflation of non-essential UI. Eliminate redundant nesting with merge tags. For performance-critical layouts, consider custom ViewGroups that minimize measure passes.

Lesson 4.3: RecyclerView Performance

RecyclerView is the cornerstone of list performance on Android. Unlike ListView which creates a View for every visible item plus a buffer, RecyclerView creates only enough ViewHolders to fill the screen plus a small cache, then recycles them as the user scrolls. But getting maximum performance from RecyclerView requires understanding its internal caching and binding mechanism.

RecyclerView uses a four-level caching system. The attached scrap holds ViewHolders detached during the current layout pass for items still in the adapter’s data set β€” they’re reattached immediately without rebinding. The view cache (default size 2) holds recently scrolled-off ViewHolders by position β€” if the user scrolls back, they’re returned without rebinding. A ViewCacheExtension is an optional, developer-supplied cache that sits between the view cache and the pool. The RecycledViewPool holds ViewHolders by view type β€” they require rebinding but avoid inflation. Increasing cache sizes helps for patterns like scroll-bounce, but uses more memory.

// RecyclerView performance configuration
class FeedFragment : Fragment() {

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        val recyclerView = view.findViewById<RecyclerView>(R.id.feed_list)

        recyclerView.apply {
            // Increase cache size for smoother reverse scrolling
            setItemViewCacheSize(10)

            // Share view pool across multiple RecyclerViews
            recycledViewPool = sharedViewPool

            // RecyclerView's own size won't change with adapter contents,
            // so it can skip re-measuring itself on data changes
            setHasFixedSize(true)

            // For nested RecyclerViews: how many inner items the inner
            // LayoutManager prefetches while the outer list scrolls
            (layoutManager as LinearLayoutManager).apply {
                initialPrefetchItemCount = 4
            }
        }
    }

    companion object {
        private const val VIEW_TYPE_CARD = 0
        private const val VIEW_TYPE_HEADER = 1

        // Share across RecyclerViews that use the same view types
        val sharedViewPool = RecyclerView.RecycledViewPool().apply {
            setMaxRecycledViews(VIEW_TYPE_CARD, 15)
            setMaxRecycledViews(VIEW_TYPE_HEADER, 5)
        }
    }
}

The biggest RecyclerView performance mistake is doing expensive work in onBindViewHolder(). This method is called on the main thread during scrolling. If binding takes more than 1-2ms per item, you’ll drop frames. Pre-compute everything possible in the background: format dates, build display strings, calculate layout dimensions. The ViewHolder should just assign pre-computed values to Views.

// ❌ Expensive operations in onBindViewHolder
class SlowAdapter : RecyclerView.Adapter<SlowAdapter.ViewHolder>() {
    override fun onBindViewHolder(holder: ViewHolder, position: Int) {
        val item = items[position]

        // ❌ Date formatting on main thread
        val dateFormat = SimpleDateFormat("MMM dd, yyyy", Locale.getDefault())
        holder.dateText.text = dateFormat.format(item.date)

        // ❌ Image loading synchronously
        val bitmap = BitmapFactory.decodeFile(item.imagePath)
        holder.image.setImageBitmap(bitmap)

        // ❌ Complex text processing
        holder.description.text = HtmlCompat.fromHtml(
            item.htmlDescription, HtmlCompat.FROM_HTML_MODE_COMPACT
        )
    }
}

// βœ… Minimal work in onBindViewHolder
class FastAdapter : ListAdapter<UiItem, FastAdapter.ViewHolder>(UiItemDiff) {

    private companion object {
        const val THUMBNAIL_SIZE = 200 // decode size in px
    }

    override fun onBindViewHolder(holder: ViewHolder, position: Int) {
        val item = getItem(position)

        // Pre-computed values β€” just assign
        holder.dateText.text = item.formattedDate
        holder.description.text = item.processedDescription

        // Async image loading
        Coil.imageLoader(holder.itemView.context)
            .enqueue(
                ImageRequest.Builder(holder.itemView.context)
                    .data(item.imageUrl)
                    .target(holder.image)
                    .size(THUMBNAIL_SIZE)
                    .build()
            )
    }
}

// Pre-compute in the mapping layer, off the main thread.
// DateTimeFormatter is thread-safe: create it once, not per item
private val uiDateFormatter = DateTimeFormatter.ofPattern("MMM dd, yyyy")

fun Post.toUiItem(): UiItem {
    return UiItem(
        id = id,
        formattedDate = uiDateFormatter.format(createdAt),
        processedDescription = HtmlCompat.fromHtml(
            htmlDescription, HtmlCompat.FROM_HTML_MODE_COMPACT
        ),
        imageUrl = thumbnailUrl
    )
}

DiffUtil is essential for efficient list updates. Instead of calling notifyDataSetChanged() (which rebinds every visible item), DiffUtil calculates the minimum set of changes between the old and new lists and dispatches only the necessary insert, remove, and change operations. ListAdapter wraps DiffUtil with background computation β€” always use it instead of raw RecyclerView.Adapter.
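
A minimal sketch of the UiItemDiff callback assumed by FastAdapter above (UiItem is presumed to be a data class with a stable id):

object UiItemDiff : DiffUtil.ItemCallback<UiItem>() {
    // Same entity? Compared by stable identifier
    override fun areItemsTheSame(oldItem: UiItem, newItem: UiItem): Boolean =
        oldItem.id == newItem.id

    // Same displayed content? Data class structural equality
    override fun areContentsTheSame(oldItem: UiItem, newItem: UiItem): Boolean =
        oldItem == newItem
}

// submitList diffs old vs new on a background thread and dispatches
// only the minimal insert/remove/change operations
adapter.submitList(newItems)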

Key takeaway: Configure RecyclerView cache sizes for your scrolling pattern. Do minimal work in onBindViewHolder() β€” pre-compute everything in the background. Always use ListAdapter with DiffUtil for efficient list updates instead of notifyDataSetChanged().

Lesson 4.4: Custom Drawing Performance

Custom Views with onDraw() override give you complete control over rendering but also complete responsibility for performance. Every inefficiency in onDraw() multiplies by 60 frames per second. The rules are strict: no allocations, no complex calculations, no I/O, and minimal method calls.

The most critical rule is pre-allocating all drawing objects β€” Paint, Path, RectF, Matrix β€” as class fields and reusing them in every onDraw() call. The Android Lint tool specifically warns about allocations in onDraw() because this is such a common source of jank. Reset mutable objects (like Path.reset()) instead of creating new ones.

class ProgressRingView(
    context: Context,
    attrs: AttributeSet? = null
) : View(context, attrs) {

    // Pre-allocate all drawing objects
    private val backgroundPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
        style = Paint.Style.STROKE
        strokeWidth = 12f.dpToPx()
        color = Color.parseColor("#E0E0E0")
        strokeCap = Paint.Cap.ROUND
    }

    private val progressPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
        style = Paint.Style.STROKE
        strokeWidth = 12f.dpToPx()
        color = Color.parseColor("#2196F3")
        strokeCap = Paint.Cap.ROUND
    }

    private val textPaint = Paint(Paint.ANTI_ALIAS_FLAG).apply {
        textSize = 32f.dpToPx()
        textAlign = Paint.Align.CENTER
        color = Color.BLACK
    }

    private val arcRect = RectF()
    private val textBounds = Rect()

    var progress: Float = 0f
        set(value) {
            field = value.coerceIn(0f, 1f)
            invalidate() // Only invalidate when value changes
        }

    override fun onSizeChanged(w: Int, h: Int, oldw: Int, oldh: Int) {
        val padding = progressPaint.strokeWidth / 2
        arcRect.set(padding, padding, w - padding, h - padding)
    }

    override fun onDraw(canvas: Canvas) {
        // Background ring
        canvas.drawArc(arcRect, 0f, 360f, false, backgroundPaint)

        // Progress arc
        val sweepAngle = progress * 360f
        canvas.drawArc(arcRect, -90f, sweepAngle, false, progressPaint)

        // Percentage text
        val text = "${(progress * 100).toInt()}%"
        textPaint.getTextBounds(text, 0, text.length, textBounds)
        canvas.drawText(
            text,
            arcRect.centerX(),
            arcRect.centerY() + textBounds.height() / 2f,
            textPaint
        )
    }

    private fun Float.dpToPx(): Float =
        this * resources.displayMetrics.density
}

Hardware acceleration is enabled by default on API 14+ and offloads drawing operations to the GPU. This makes most drawing operations much faster, but some Canvas operations aren’t supported in hardware-accelerated mode. If you use unsupported operations, the system falls back to software rendering for that View β€” much slower. Check the documentation for unsupported operations and avoid them, or set setLayerType(LAYER_TYPE_SOFTWARE, null) only on the specific View that needs it.
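
If a custom View depends on one of those unsupported operations, scope the fallback to that one View rather than the whole window (legacyChartView is a placeholder name):

// Software-render only the affected View; everything else stays
// hardware accelerated
legacyChartView.setLayerType(View.LAYER_TYPE_SOFTWARE, null)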

For complex animations, use hardware layers. A hardware layer renders the View to an off-screen texture on the GPU. During animation, only the texture is transformed (translated, rotated, scaled, alpha changed) without re-rendering the View’s content. This is dramatically faster for Views with complex content. Set the layer before animation starts and remove it after animation ends β€” keeping a hardware layer active permanently wastes GPU memory.

// Using hardware layers for smooth animation
fun animateCardExit(cardView: View) {
    // Enable hardware layer before animation
    cardView.setLayerType(View.LAYER_TYPE_HARDWARE, null)

    cardView.animate()
        .translationX(cardView.width.toFloat())
        .alpha(0f)
        .setDuration(300)
        .setInterpolator(AccelerateInterpolator())
        .withEndAction {
            // Remove hardware layer after animation
            cardView.setLayerType(View.LAYER_TYPE_NONE, null)
            cardView.visibility = View.GONE
        }
        .start()
}

Key takeaway: Pre-allocate all drawing objects as class fields β€” never allocate inside onDraw(). Use hardware layers during animations to avoid re-rendering complex Views every frame. Call invalidate() only when the visual state actually changes.

Lesson 4.5: Animation Performance

Smooth animation requires hitting every frame deadline without exception. A single dropped frame during an animation is far more noticeable than during static content display. The key to smooth animation is ensuring that the animation system drives updates efficiently and that per-frame work stays within budget.

ValueAnimator and ObjectAnimator are the foundation of the Android animation system. They use Choreographer to synchronize updates with VSYNC, ensuring animations are driven by the display refresh rate rather than arbitrary timers. Never use Timer, Thread.sleep(), or postDelayed() for animations β€” they aren’t synchronized with VSYNC and produce inconsistent frame timing.

// βœ… Smooth animation with ObjectAnimator
fun pulseAnimation(view: View) {
    val scaleX = PropertyValuesHolder.ofFloat(View.SCALE_X, 1f, 1.1f, 1f)
    val scaleY = PropertyValuesHolder.ofFloat(View.SCALE_Y, 1f, 1.1f, 1f)

    ObjectAnimator.ofPropertyValuesHolder(view, scaleX, scaleY).apply {
        duration = 600
        interpolator = OvershootInterpolator()
        repeatCount = ValueAnimator.INFINITE
        repeatMode = ValueAnimator.RESTART
        start()
    }
}

// Assumed helper for the dp values below; Resources.getSystem() gives
// system density β€” prefer a View's own resources when one is in scope
private fun Float.dpToPx(): Float =
    this * Resources.getSystem().displayMetrics.density

// βœ… Complex animation with AnimatorSet
fun enterAnimation(
    title: View,
    subtitle: View,
    content: View,
    button: View
) {
    val views = listOf(title, subtitle, content, button)
    views.forEach { it.alpha = 0f; it.translationY = 50f.dpToPx() }

    val animators = views.mapIndexed { index, view ->
        AnimatorSet().apply {
            playTogether(
                ObjectAnimator.ofFloat(view, View.ALPHA, 0f, 1f),
                ObjectAnimator.ofFloat(view, View.TRANSLATION_Y, 50f.dpToPx(), 0f)
            )
            startDelay = index * 100L
            duration = 400
            interpolator = DecelerateInterpolator(2f)
        }
    }

    AnimatorSet().apply {
        playTogether(animators)
        start()
    }
}

For RecyclerView item animations, ItemAnimator handles insert, remove, move, and change animations. The default DefaultItemAnimator provides fade and translate animations. For custom animations, extend SimpleItemAnimator and implement the animation for each operation. The critical performance rule is keeping animation duration short (200-300ms) and using simple transformations (alpha, translation, scale) that can be handled entirely by the GPU through hardware layers.
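
For example, tightening the default animator into that band (the durations here are illustrative):

recyclerView.itemAnimator = DefaultItemAnimator().apply {
    addDuration = 200
    removeDuration = 200
    moveDuration = 250
    changeDuration = 200
}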

Avoid animating properties that trigger layout passes. Animating width, height, padding, or margin forces a layout pass every frame β€” measuring and positioning all children at 60fps. Instead, animate translationX, translationY, scaleX, scaleY, rotation, and alpha β€” these are render-node properties that only affect the GPU transform, not the layout.

// ❌ Animating layout size β€” forces a measure/layout pass every frame
ValueAnimator.ofInt(startWidth, endWidth).apply {
    duration = 300
    addUpdateListener { animator ->
        // Writing layoutParams calls requestLayout() on every frame
        // (updateLayoutParams is a core-ktx extension)
        view.updateLayoutParams { width = animator.animatedValue as Int }
    }
    start()
}

// βœ… Animating render properties β€” GPU only, no layout pass
ObjectAnimator.ofFloat(view, View.SCALE_X, 1f, 0.5f).apply {
    duration = 300
    start()
}

// βœ… If you must animate size, use scaleX/scaleY with pivotX/pivotY
fun collapseAnimation(view: View) {
    view.pivotY = 0f // Scale from top
    ObjectAnimator.ofFloat(view, View.SCALE_Y, 1f, 0f).apply {
        duration = 300
        interpolator = FastOutSlowInInterpolator()
        doOnEnd { view.visibility = View.GONE }
        start()
    }
}

Key takeaway: Animate only render-node properties (translation, scale, rotation, alpha) β€” never layout properties (width, height, margin, padding). Use hardware layers for complex Views during animation. Keep animation durations short (200-300ms) for responsive feel.

Lesson 4.6: GPU Rendering and Profile GPU Rendering

Profile GPU Rendering is a developer option that shows a real-time bar chart of frame rendering times. Each bar represents one frame, and each color segment represents a phase of the rendering pipeline. The green horizontal line represents the 16.67ms budget β€” bars extending above this line indicate dropped frames.

The stacked segments represent the pipeline phases in order: input handling, animation, measure and layout, draw, sync and upload, command issue, and swap buffers. The exact segment colors vary by Android version, so check the on-screen legend in Developer Options. The method is always the same: identify which segment is tallest, and you know exactly which phase of the rendering pipeline to optimize.

// StrictMode helps detect rendering violations in development
class DebugApplication : Application() {
    override fun onCreate() {
        super.onCreate()

        if (BuildConfig.DEBUG) {
            StrictMode.setThreadPolicy(
                StrictMode.ThreadPolicy.Builder()
                    .detectDiskReads()
                    .detectDiskWrites()
                    .detectNetwork()
                    .detectCustomSlowCalls()
                    .penaltyLog()
                    .penaltyFlashScreen() // Red flash on violations
                    .build()
            )

            StrictMode.setVmPolicy(
                StrictMode.VmPolicy.Builder()
                    .detectLeakedSqlLiteObjects()
                    .detectLeakedClosableObjects()
                    .detectActivityLeaks()
                    .penaltyLog()
                    .build()
            )
        }
    }
}

// JankStats API β€” detect jank in production
class JankMonitorActivity : AppCompatActivity() {

    private lateinit var jankStats: JankStats

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        jankStats = JankStats.createAndTrack(window) { frameData ->
            if (frameData.isJank) {
                val durationMs = frameData.frameDurationUiNanos / 1_000_000.0
                Log.w("Jank", buildString {
                    append("Jank detected: ${durationMs}ms")
                    frameData.states.forEach { state ->
                        append(" | ${state.key}=${state.value}")
                    }
                })

                // Report to your analytics pipeline ('analytics' assumed)
                analytics.trackJank(durationMs, frameData.states)
            }
        }
    }

    override fun onResume() {
        super.onResume()
        jankStats.isTrackingEnabled = true
    }

    override fun onPause() {
        super.onPause()
        jankStats.isTrackingEnabled = false
    }
}

GPU overdraw visualization (Developer Options β†’ Debug GPU Overdraw) shows how many times each pixel is drawn in a frame. True color shows no overdraw. Blue shows 1x overdraw (drawn twice). Green shows 2x (drawn three times). Light red shows 3x. Dark red shows 4x or more. Your goal is to keep most of the screen at blue or less. Areas of red overdraw are opportunities to remove unnecessary backgrounds.

When profiling GPU rendering, run your app on a mid-range device, not a flagship. A Pixel 7 Pro has so much GPU power that overdraw and rendering inefficiencies are invisible. A Galaxy A series phone with a weaker GPU shows these problems clearly. Always test on the weakest device your users are likely to have β€” that’s where performance matters most.

Key takeaway: Profile GPU Rendering shows per-frame timing broken down by pipeline phase. Debug GPU Overdraw reveals redundant pixel drawing. Use JankStats API to detect and report jank in production. Test on mid-range devices where rendering bottlenecks are most visible.

Quiz: Rendering Performance

What happens when onDraw() takes more than 16.67ms?

  • ❌ The view is not rendered
  • ❌ The system crashes
  • βœ… The frame is dropped, causing visible jank
  • ❌ The system automatically optimizes it

At 60fps, each frame has a 16.67ms budget. If onDraw() or any other rendering phase exceeds this budget, the frame cannot be completed in time and is skipped, causing visible stuttering (jank) in the UI.

Why should you avoid animating width and height properties?

  • ❌ These properties cannot be animated
  • ❌ The animation API doesn’t support integer properties
  • βœ… Changing layout properties triggers a measure and layout pass every frame
  • ❌ Width and height animations are slower on GPU

Animating layout properties forces the entire view hierarchy to be remeasured and repositioned every frame. Use render properties (translationX/Y, scaleX/Y, alpha, rotation) instead β€” these only affect the GPU transform and skip the layout phase entirely.

What is the purpose of ViewStub?

  • ❌ It makes views invisible
  • βœ… It lazily inflates a layout only when made visible, saving initial inflation cost
  • ❌ It replaces ConstraintLayout for flat layouts
  • ❌ It prevents overdraw

ViewStub is a zero-size, invisible View that inflates its target layout only when inflate() is called or its visibility is set to VISIBLE. Before that, it has essentially zero cost β€” no drawing, no measurement, minimal memory.

What does the green line represent in Profile GPU Rendering?

  • ❌ The maximum GPU capability
  • βœ… The 16.67ms frame budget for 60fps
  • ❌ The average frame time
  • ❌ The GPU memory limit

The green line marks the 16.67ms threshold. Bars below this line represent frames rendered within budget. Bars above this line represent dropped frames where the rendering took too long.

Coding Challenge: JankDetector

Build a jank detection system that monitors frame rendering times, identifies jank patterns, and reports the most common causes.

// Challenge: Build a JankDetector that:
// 1. Monitors frame times using Choreographer.FrameCallback
// 2. Detects dropped frames (> 16.67ms between callbacks)
// 3. Tracks jank frequency and severity
// 4. Identifies jank patterns (e.g., jank during scrolling vs idle)
// 5. Reports top jank causes with timestamps

Solution

class JankDetector(
    private val targetFps: Int = 60,
    private val onJankDetected: (JankEvent) -> Unit = {}
) : Choreographer.FrameCallback {

    private val frameBudgetNs = 1_000_000_000L / targetFps
    private var previousFrameNs = 0L
    private var isRunning = false
    private val jankHistory = mutableListOf<JankEvent>()
    private var currentState = "idle"

    data class JankEvent(
        val timestamp: Long,
        val durationMs: Double,
        val droppedFrames: Int,
        val state: String
    )

    data class JankReport(
        val totalFrames: Long,
        val jankyFrames: Int,
        val jankPercentage: Double,
        val averageJankMs: Double,
        val worstJankMs: Double,
        val jankByState: Map<String, Int>
    )

    private var totalFrames = 0L

    fun start() {
        if (isRunning) return
        isRunning = true
        previousFrameNs = 0L
        Choreographer.getInstance().postFrameCallback(this)
    }

    fun stop() {
        isRunning = false
        Choreographer.getInstance().removeFrameCallback(this)
    }

    fun setState(state: String) {
        currentState = state
    }

    override fun doFrame(frameTimeNanos: Long) {
        if (!isRunning) return

        if (previousFrameNs > 0) {
            totalFrames++
            val frameDurationNs = frameTimeNanos - previousFrameNs
            val droppedFrames = ((frameDurationNs / frameBudgetNs) - 1)
                .coerceAtLeast(0).toInt()

            if (droppedFrames > 0) {
                val durationMs = frameDurationNs / 1_000_000.0
                val event = JankEvent(
                    timestamp = System.currentTimeMillis(),
                    durationMs = durationMs,
                    droppedFrames = droppedFrames,
                    state = currentState
                )
                jankHistory.add(event)
                onJankDetected(event)
            }
        }

        previousFrameNs = frameTimeNanos
        Choreographer.getInstance().postFrameCallback(this)
    }

    fun getReport(): JankReport {
        val jankyFrames = jankHistory.size
        return JankReport(
            totalFrames = totalFrames,
            jankyFrames = jankyFrames,
            jankPercentage = if (totalFrames > 0)
                jankyFrames * 100.0 / totalFrames else 0.0,
            averageJankMs = if (jankyFrames > 0)
                jankHistory.map { it.durationMs }.average() else 0.0,
            worstJankMs = jankHistory.maxOfOrNull { it.durationMs } ?: 0.0,
            jankByState = jankHistory.groupBy { it.state }
                .mapValues { it.value.size }
        )
    }

    fun clearHistory() {
        jankHistory.clear()
        totalFrames = 0
    }
}

Module 5: Jetpack Compose Performance

Compose fundamentally changes the performance landscape. Instead of inflating XML layouts and mutating Views, Compose uses a declarative model where UI is described as a function of state. This eliminates entire categories of performance problems (double-taxation in measure, unnecessary view inflation) while introducing new ones (unnecessary recomposition, unstable parameters). Understanding Compose’s execution model is essential for writing performant Compose code.

Lesson 5.1: Composition, Layout, and Drawing Phases

Compose’s rendering pipeline has three distinct phases, and understanding which phase your code runs in determines how you optimize it. Composition reads state and produces a tree of UI nodes. Layout measures and positions those nodes. Drawing renders pixels to the screen. Each phase can be skipped independently β€” if only drawing data changes, composition and layout are skipped entirely.

The key optimization principle is reading state in the latest possible phase. If you read a state value during composition, any change to that state triggers recomposition of the entire composable. If you read it during layout (using Modifier.layout or Layout), only layout re-runs. If you read it during drawing (using Modifier.drawBehind or Canvas), only drawing re-runs. Since drawing is the cheapest phase, deferring state reads to drawing maximizes performance.

// ❌ Reading state during composition β€” recomposes on every change
@Composable
fun AnimatedBox(offset: State<Float>) {
    // Reading offset.value here means this composable recomposes
    // every time offset changes (every animation frame)
    Box(
        modifier = Modifier
            .offset(x = offset.value.dp) // Read during composition
            .size(50.dp)
            .background(Color.Blue)
    )
}

// βœ… Deferring state read to layout phase β€” skips recomposition
@Composable
fun AnimatedBox(offset: State<Float>) {
    Box(
        modifier = Modifier
            .offset { // Lambda β€” read deferred to layout phase
                IntOffset(offset.value.roundToInt(), 0)
            }
            .size(50.dp)
            .background(Color.Blue)
    )
}

// βœ… Reading state in draw phase β€” cheapest option
@Composable
fun AnimatedCircle(color: State<Color>) {
    Canvas(modifier = Modifier.size(100.dp)) {
        // Color is read during draw phase only
        drawCircle(color = color.value)
    }
}

Recomposition is Compose’s mechanism for updating UI when state changes. When a State<T> value changes, Compose identifies every composable that reads that state and re-executes them. This is called recomposition. Compose is intelligent about what to recompose β€” it tracks state reads at a fine granularity and only recomposes the smallest necessary scope. But developers can accidentally widen the recomposition scope by reading state too early or too broadly.

The Compose compiler adds recomposition logic during compilation. It wraps composable functions with $composer.startRestartGroup() / $composer.endRestartGroup() calls that enable Compose to track which composables need re-execution. Composables that have not changed their inputs are skipped entirely β€” the compiler inserts equality checks on all parameters to enable this skipping.
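
Conceptually, the transform is shaped like the sketch below β€” a simplified illustration for intuition, not the literal generated code:

// Simplified sketch of the Compose compiler transform (illustrative only)
//
// fun Greeting(name: String, $composer: Composer, $changed: Int) {
//     $composer.startRestartGroup(key)
//     if ($changed != 0 || !$composer.skipping) {
//         Text("Hello $name")          // body runs only when inputs changed
//     } else {
//         $composer.skipToGroupEnd()   // inputs unchanged β€” skip entirely
//     }
//     $composer.endRestartGroup()?.updateScope { next, _ ->
//         Greeting(name, next, $changed)  // how Compose re-invokes this scope
//     }
// }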

// Layout Inspector in Android Studio shows recomposition counts
// Enable: Layout Inspector β†’ Toggle "Show Recomposition Counts"
// Each composable shows how many times it recomposed and skipped

// Use Composition Local to pass data without recomposition propagation
val LocalThemeColors = compositionLocalOf { defaultColors }

@Composable
fun ThemedApp(isDarkMode: Boolean) {
    val colors = if (isDarkMode) darkColors else lightColors

    CompositionLocalProvider(LocalThemeColors provides colors) {
        // Child composables read colors from CompositionLocal
        // They don't recompose when isDarkMode changes β€”
        // only composables that actually READ LocalThemeColors recompose
        AppContent()
    }
}

Key takeaway: Compose has three phases: composition, layout, and drawing. Read state in the latest possible phase to minimize work. Defer animation values to layout or draw phases using lambda-based modifiers like Modifier.offset { } and Modifier.drawBehind { }.

Lesson 5.2: Stability and Recomposition Skipping

Compose’s recomposition skipping is its most powerful performance feature β€” but it only works when the compiler can prove that a composable’s inputs haven’t changed. This proof depends on parameter stability. A stable type is one where Compose can use equality checks to determine if a value has changed. Unstable types force recomposition every time, even if the actual data is identical.

Primitives (Int, Float, Boolean) and String are always stable. Data classes where all properties are stable types are stable. The standard collection types (List, Map, Set from the Kotlin stdlib) are unstable because they are read-only interfaces that could be backed by mutable implementations. This is the single biggest performance trap in Compose β€” if any parameter of a composable is unstable, that composable can never be skipped.

// ❌ Unstable parameter β€” this composable can NEVER be skipped
@Composable
fun UserList(users: List<User>) { // List is unstable
    LazyColumn {
        items(users) { user ->
            UserRow(user)
        }
    }
}

// βœ… Stable β€” use ImmutableList from kotlinx.collections.immutable
@Composable
fun UserList(users: ImmutableList<User>) { // ImmutableList is stable
    LazyColumn {
        items(users) { user ->
            UserRow(user)
        }
    }
}

// Alternative: annotate with @Immutable
@Immutable
data class UserListState(
    val users: List<User>, // Now treated as stable
    val isLoading: Boolean,
    val error: String?
)

// Or use @Stable for types with identity-based equality
@Stable
class AnimationState {
    var progress by mutableFloatStateOf(0f)
    var isRunning by mutableStateOf(false)
}

You can check your composables’ stability using the Compose Compiler Reports. Add the compiler report flags to your build configuration and the compiler generates a text file listing each composable function, its parameters, and whether each parameter is stable, unstable, or runtime-determined. This report is invaluable for identifying unexpected instability.

// build.gradle.kts β€” enable Compose compiler reports
android {
    composeOptions {
        kotlinCompilerExtensionVersion = "1.5.10"
    }
}

// Add to the module-level build.gradle.kts:
// kotlinOptions {
//     freeCompilerArgs += listOf(
//         "-P", "plugin:androidx.compose.compiler.plugins.kotlin:reportsDestination=" +
//             project.buildDir.absolutePath + "/compose_metrics",
//         "-P", "plugin:androidx.compose.compiler.plugins.kotlin:metricsDestination=" +
//             project.buildDir.absolutePath + "/compose_metrics"
//     )
// }

// Example output from composables.txt:
// restartable skippable scheme("[androidx.compose.ui.UiComposable]")
// fun UserRow(
//   stable user: User      ← can be compared, enables skipping
//   stable onClick: Function0<Unit>  ← stable
// )
//
// restartable scheme("[androidx.compose.ui.UiComposable]")
// fun UserList(
//   unstable users: List<User>  ← UNSTABLE! prevents skipping
// )

Lambda stability is another subtle trap. Lambdas that capture mutable local variables or unstable references become unstable themselves. A common pattern is passing a lambda to a child composable β€” if the lambda is recreated on every recomposition (because it captures a changing value), the child composable can’t skip. Use remember to stabilize lambdas or reference stable methods.

// ❌ Lambda recreated every recomposition β€” child can't skip
@Composable
fun ParentScreen(viewModel: MainViewModel) {
    val items by viewModel.items.collectAsStateWithLifecycle()

    ItemList(
        items = items,
        onItemClick = { id -> viewModel.selectItem(id) } // New lambda every time
    )
}

// βœ… Stable method reference β€” child can skip
@Composable
fun ParentScreen(viewModel: MainViewModel) {
    val items by viewModel.items.collectAsStateWithLifecycle()

    ItemList(
        items = items,
        onItemClick = viewModel::selectItem // Stable reference
    )
}

Key takeaway: Compose can skip recomposition only when all parameters are stable. Use ImmutableList instead of List, annotate types with @Immutable or @Stable, and use stable lambda references. Generate Compose Compiler Reports to identify unstable parameters in your composables.

Lesson 5.3: LazyColumn and LazyRow Performance

LazyColumn and LazyRow are Compose’s answer to RecyclerView. They only compose and render items that are currently visible, plus a small buffer for smooth scrolling. But unlike RecyclerView where the framework handles most optimization, LazyColumn performance depends heavily on how you write your composable items.

The most critical optimization is providing stable keys for each item. Without keys, Compose identifies items by their position in the list. If an item is inserted at position 0, every subsequent item appears to have changed position, triggering recomposition of every visible item. With stable keys (using a unique ID), Compose tracks items by identity β€” an insertion causes only the new item to compose, while existing items are simply repositioned.

// ❌ No keys β€” insertion recomposes everything
@Composable
fun MessageList(messages: ImmutableList<Message>) {
    LazyColumn {
        items(messages) { message ->
            MessageRow(message) // All rows recompose on list change
        }
    }
}

// βœ… Stable keys β€” only changed items recompose
@Composable
fun MessageList(messages: ImmutableList<Message>) {
    LazyColumn {
        items(
            items = messages,
            key = { message -> message.id } // Unique stable key
        ) { message ->
            MessageRow(message) // Only new/changed rows compose
        }
    }
}

// βœ… Content types for heterogeneous lists β€” better recycling
@Composable
fun FeedList(feedItems: ImmutableList<FeedItem>) {
    LazyColumn {
        items(
            items = feedItems,
            key = { it.id },
            contentType = { item ->
                when (item) {
                    is FeedItem.Post -> "post"
                    is FeedItem.Ad -> "ad"
                    is FeedItem.Story -> "story"
                }
            }
        ) { item ->
            when (item) {
                is FeedItem.Post -> PostCard(item)
                is FeedItem.Ad -> AdCard(item)
                is FeedItem.Story -> StoryCard(item)
            }
        }
    }
}

The contentType parameter enables Compose to reuse composition nodes between items of the same type β€” similar to view type in RecyclerView. Without content types, Compose can reuse any item slot for any item, which may require completely rebuilding the composition tree. With content types, a post slot is only reused for other posts, making recomposition more efficient.

Avoid putting expensive operations inside lazy item composables. Each item’s composable runs during scrolling, and any expensive operation (formatting dates, parsing HTML, loading images synchronously) causes jank. Pre-compute everything in the ViewModel and pass display-ready data to composables. Image loading should use Coil’s AsyncImage which handles async loading and caching internally.

// ❌ Expensive computation inside lazy items
// Note: extensions on LazyListScope are list-builder DSL functions, not @Composable
fun LazyListScope.feedItems(items: ImmutableList<RawPost>) {
    items(items, key = { it.id }) { post ->
        // ❌ All of this runs during scrolling
        val formattedDate = remember(post.timestamp) {
            DateFormat.format("MMM dd, yyyy", post.timestamp).toString()
        }
        val htmlContent = remember(post.content) {
            HtmlCompat.fromHtml(post.content, HtmlCompat.FROM_HTML_MODE_COMPACT)
        }

        PostCard(
            title = post.title,
            date = formattedDate,
            content = htmlContent
        )
    }
}

// βœ… Pre-computed display data
data class PostUiModel(
    val id: String,
    val title: String,
    val formattedDate: String,
    val displayContent: AnnotatedString,
    val thumbnailUrl: String
)

fun LazyListScope.feedItems(items: ImmutableList<PostUiModel>) {
    items(items, key = { it.id }) { post ->
        PostCard(post) // Just renders pre-computed data
    }
}

Key takeaway: Always provide unique stable keys to items() in LazyColumn/LazyRow. Use contentType for heterogeneous lists. Pre-compute display data in the ViewModel β€” lazy item composables should only render, never compute.

Lesson 5.4: derivedStateOf and State Management

derivedStateOf is a critical optimization tool that creates a state value computed from other state values, but only triggers recomposition when the computed result actually changes. Without derivedStateOf, reading multiple state values means recomposing whenever any input changes. With derivedStateOf, recomposition happens only when the derived value itself is different.

The classic example is a β€œshow scroll-to-top button” that should appear when the user scrolls past the first few items. Without derivedStateOf, you’d read the scroll state directly, triggering recomposition on every single scroll pixel. With derivedStateOf, you compute a boolean β€œshould show button” that only changes when crossing the threshold β€” dramatically reducing recompositions from hundreds to just two (false β†’ true, true β†’ false).

// ❌ Recomposes on every scroll pixel
@Composable
fun ScrollingList(items: ImmutableList<Item>) {
    val listState = rememberLazyListState()

    // This reads scroll position β€” recomposes every frame during scroll
    val showButton = listState.firstVisibleItemIndex > 3

    Box {
        LazyColumn(state = listState) {
            items(items, key = { it.id }) { ItemRow(it) }
        }

        if (showButton) {
            ScrollToTopButton(listState)
        }
    }
}

// βœ… Only recomposes when the derived boolean changes
@Composable
fun ScrollingList(items: ImmutableList<Item>) {
    val listState = rememberLazyListState()

    // derivedStateOf β€” only triggers recomposition when result changes
    val showButton by remember {
        derivedStateOf { listState.firstVisibleItemIndex > 3 }
    }

    Box {
        LazyColumn(state = listState) {
            items(items, key = { it.id }) { ItemRow(it) }
        }

        if (showButton) {
            ScrollToTopButton(listState)
        }
    }
}

snapshotFlow converts Compose state into a Flow, enabling you to use Flow operators like distinctUntilChanged, debounce, and filter on state changes. This is powerful for cases where you need to react to state changes but don’t want to trigger recomposition β€” for example, logging scroll position, saving state to disk, or making API calls based on state changes.

// snapshotFlow β€” react to state changes without recomposition
@Composable
fun SearchScreen(viewModel: SearchViewModel) {
    var query by remember { mutableStateOf("") }

    // Debounced search β€” only searches after user stops typing
    LaunchedEffect(Unit) {
        snapshotFlow { query }
            .debounce(300)
            .distinctUntilChanged()
            .filter { it.length >= 2 }
            .collectLatest { searchQuery ->
                viewModel.search(searchQuery)
            }
    }

    SearchBar(
        query = query,
        onQueryChange = { query = it }
    )
}

// derivedStateOf for computed collections
@Composable
fun FilteredList(
    items: ImmutableList<Item>,
    searchQuery: String
) {
    // searchQuery is a plain parameter, not snapshot state, so it must be a
    // remember key β€” derivedStateOf only tracks reads of State objects
    val filteredItems by remember(items, searchQuery) {
        derivedStateOf {
            if (searchQuery.isBlank()) items
            else items.filter {
                it.name.contains(searchQuery, ignoreCase = true)
            }.toImmutableList()
        }
    }

    LazyColumn {
        items(filteredItems, key = { it.id }) { item ->
            ItemRow(item)
        }
    }
}

Use derivedStateOf when you’re computing a value from frequently-changing state and the computed value changes less frequently than its inputs. Do not use it for simple state transformations where the output changes just as often as the input β€” the overhead of derivedStateOf (tracking inputs, comparing outputs) isn’t worth it in that case.

Key takeaway: Use derivedStateOf when a computed value changes less frequently than its inputs β€” it eliminates unnecessary recompositions by only triggering when the derived result actually changes. Use snapshotFlow to bridge Compose state into Flow for debouncing, filtering, and side effects.

Lesson 5.5: remember and Computation Caching

remember is Compose’s mechanism for caching values across recompositions. Without remember, every value is recalculated on every recomposition β€” even expensive computations. With remember, the value is calculated once and reused until the composable leaves the composition or its keys change.

The critical distinction is between remember (caches across recompositions, resets when keys change) and rememberSaveable (survives configuration changes and process death). Use remember for computed values, formatting results, and parsed data. Use rememberSaveable for user input, scroll positions, and UI state that should survive rotation.

// ❌ Expensive computation runs every recomposition
@Composable
fun ChartView(data: ImmutableList<DataPoint>) {
    // This runs every time ChartView recomposes β€” even if data hasn't changed
    val processedData = data
        .filter { it.value > 0 }
        .groupBy { it.category }
        .mapValues { it.value.sumOf { point -> point.value } }
        .toSortedMap()

    Canvas(modifier = Modifier.fillMaxSize()) {
        drawChart(processedData)
    }
}

// βœ… Cached with remember β€” only recomputes when data changes
@Composable
fun ChartView(data: ImmutableList<DataPoint>) {
    val processedData = remember(data) {
        data
            .filter { it.value > 0 }
            .groupBy { it.category }
            .mapValues { it.value.sumOf { point -> point.value } }
            .toSortedMap()
    }

    Canvas(modifier = Modifier.fillMaxSize()) {
        drawChart(processedData)
    }
}

// rememberSaveable for state that survives configuration changes
@Composable
fun SettingsScreen() {
    var selectedTab by rememberSaveable { mutableIntStateOf(0) }
    var expandedSection by rememberSaveable { mutableStateOf<String?>(null) }
    var searchQuery by rememberSaveable { mutableStateOf("") }

    // These values survive screen rotation
    Column {
        TabRow(selectedTabIndex = selectedTab) {
            Tab(selected = selectedTab == 0, onClick = { selectedTab = 0 }) {
                Text("General")
            }
            Tab(selected = selectedTab == 1, onClick = { selectedTab = 1 }) {
                Text("Privacy")
            }
        }
    }
}

For complex objects that aren’t primitives or Parcelables, use rememberSaveable with a custom Saver. The Saver converts your object to and from a Bundle-compatible format for process death survival. The listSaver and mapSaver utilities simplify creating savers for common patterns.

// Custom Saver for complex state
data class FilterState(
    val categories: Set<String>,
    val minPrice: Float,
    val maxPrice: Float,
    val sortBy: SortOption
)

val FilterStateSaver = listSaver<FilterState, Any>(
    save = { state ->
        listOf(
            state.categories.toList(),
            state.minPrice,
            state.maxPrice,
            state.sortBy.name
        )
    },
    restore = { list ->
        @Suppress("UNCHECKED_CAST")
        FilterState(
            categories = (list[0] as List<String>).toSet(),
            minPrice = list[1] as Float,
            maxPrice = list[2] as Float,
            sortBy = SortOption.valueOf(list[3] as String)
        )
    }
)

@Composable
fun FilterScreen() {
    var filterState by rememberSaveable(stateSaver = FilterStateSaver) {
        mutableStateOf(FilterState.DEFAULT)
    }
    // filterState survives process death
}

Key takeaway: Use remember to cache expensive computations across recompositions β€” always provide keys that represent the computation’s inputs. Use rememberSaveable for UI state that should survive configuration changes. Never do expensive work directly in a composable body without caching.

Lesson 5.6: Compose Performance Testing

Testing Compose performance requires different tools than View-based testing. The Layout Inspector shows recomposition counts per composable β€” enable β€œShow Recomposition Counts” to see which composables recompose and which are skipped. High recomposition counts on composables that shouldn’t be changing indicate performance problems.

The Compose Compiler Metrics report (covered in Lesson 5.2) is your first line of defense. Run it as part of CI to catch stability regressions. If a previously-stable composable becomes unstable (because someone added an unstable parameter), your CI fails and the team investigates before the regression ships.
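
A minimal CI guard might look like the Gradle task below β€” a sketch that assumes the reportsDestination configured in Lesson 5.2 and the default report name pattern (module_variant-composables.txt); adjust both to your project:

// build.gradle.kts β€” fail the build when the report lists unstable parameters
tasks.register("checkComposeStability") {
    dependsOn("compileReleaseKotlin") // task name depends on your variants
    doLast {
        val report = layout.buildDirectory
            .file("compose_metrics/app_release-composables.txt")
            .get().asFile
        if (!report.exists()) return@doLast

        val unstable = report.readLines().filter { it.trim().startsWith("unstable") }
        if (unstable.isNotEmpty()) {
            throw GradleException(
                "${unstable.size} unstable composable parameter(s) found:\n" +
                    unstable.joinToString("\n")
            )
        }
    }
}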

// Compose benchmark testing with Macrobenchmark
@RunWith(AndroidJUnit4::class)
class ComposeScrollBenchmark {

    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun scrollFeed() {
        benchmarkRule.measureRepeated(
            packageName = "com.example.myapp",
            metrics = listOf(FrameTimingMetric()),
            compilationMode = CompilationMode.Partial(
                baselineProfileMode = BaselineProfileMode.Require
            ),
            iterations = 5,
            startupMode = StartupMode.WARM
        ) {
            startActivityAndWait()

            // Navigate to feed
            device.findObject(By.text("Feed")).click()
            device.waitForIdle()

            // Scroll test
            val feedList = device.findObject(By.res("feed_lazy_column"))
            repeat(5) {
                feedList.fling(Direction.DOWN)
                device.waitForIdle()
            }
        }
    }
}

// Custom composition tracker for debugging
private class CounterRef(var count: Int)

@Composable
fun TrackRecomposition(tag: String) {
    // Plain holder instead of snapshot state β€” writing to State during
    // composition would itself schedule recompositions
    val ref = remember { CounterRef(0) }

    SideEffect {
        ref.count++
        Log.d("Recomposition", "$tag: recomposed ${ref.count} times")
    }
}

// Usage β€” add to suspect composables during development
@Composable
fun SuspectComposable(data: SomeData) {
    TrackRecomposition("SuspectComposable")
    // ... composable content
}

For production monitoring, use the JankStats library with Compose. JankStats tracks frame times and reports jank, including the current composition state (which screen is visible, what interaction is happening). This data helps you correlate jank with specific Compose screens and interactions in your production app.
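
A minimal setup for an Activity hosting Compose content might look like this β€” AppContent and the "screen" metric key are placeholder names; putState is what attaches your current screen to each jank report:

// Track frames for the activity's window; report only janky ones
class MainActivity : ComponentActivity() {
    private lateinit var jankStats: JankStats

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        jankStats = JankStats.createAndTrack(window) { frameData ->
            if (frameData.isJank) {
                Log.w("JankStats", frameData.toString()) // includes attached state
            }
        }
        setContent { AppContent() }
    }
}

// Attach the current screen name so jank reports carry composition context
@Composable
fun TrackScreenJank(screenName: String) {
    val view = LocalView.current
    DisposableEffect(screenName) {
        val holder = PerformanceMetricsState.getHolderForHierarchy(view)
        holder.state?.putState("screen", screenName)
        onDispose { holder.state?.removeState("screen") }
    }
}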

Benchmarking specific composable functions in isolation requires the createComposeRule() test rule from the Compose testing library. You can measure initial composition time, recomposition time, and rendering time for individual composables. This is useful for comparing implementation alternatives β€” does the new card design recompose more than the old one?
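
A sketch of such an isolation test, assuming a hypothetical CardUnderTest composable β€” it counts compositions rather than wall-clock time, which is far less flaky on emulators (leave timing comparisons to Macrobenchmark):

class CardRecompositionTest {

    @get:Rule
    val composeRule = createComposeRule()

    @Test
    fun updatingTitle_recomposesOnce() {
        var compositions = 0
        val title = mutableStateOf("first")

        composeRule.setContent {
            compositions++ // runs on every (re)composition of this scope
            CardUnderTest(title = title.value)
        }

        composeRule.runOnIdle { title.value = "second" }
        composeRule.waitForIdle()

        assertEquals(2, compositions) // initial composition + one recomposition
    }
}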

Key takeaway: Use Layout Inspector recomposition counts for development debugging. Run Compose Compiler Metrics in CI to catch stability regressions. Use Macrobenchmark for end-to-end scroll and navigation performance. Track recomposition counts in suspect composables during development.

Quiz: Jetpack Compose Performance

Why is List from Kotlin stdlib considered unstable in Compose?

  • ❌ Lists can’t be compared for equality
  • ❌ Lists are too large to compare
  • βœ… List is an interface that could be backed by a mutable implementation
  • ❌ Compose doesn’t support collection types

Kotlin’s List interface doesn’t guarantee immutability β€” a MutableList is a valid List. Since Compose can’t prove the list contents haven’t changed between recompositions, it treats List as unstable, preventing recomposition skipping.

What does derivedStateOf prevent?

  • ❌ Memory leaks
  • ❌ State mutations
  • βœ… Unnecessary recompositions when a derived value hasn’t actually changed
  • ❌ Configuration change crashes

derivedStateOf wraps a computation and only triggers downstream recomposition when the computed result is different from the previous result. If the inputs change but the output is the same, recomposition is skipped.

Which modifier defers state reading to the layout phase?

  • ❌ Modifier.offset(x = offsetState.value.dp)
  • βœ… Modifier.offset { IntOffset(offsetState.value, 0) }
  • ❌ Modifier.padding(offsetState.value.dp)
  • ❌ Modifier.size(offsetState.value.dp)

The lambda-based Modifier.offset { } defers the state read to the layout phase. The non-lambda version Modifier.offset(x = ...) reads state during composition, triggering full recomposition on every change.

What is the purpose of contentType in LazyColumn items?

  • ❌ It sets the MIME type of the content
  • ❌ It determines the item size
  • βœ… It enables Compose to reuse composition nodes between items of the same type
  • ❌ It sets the accessibility content description

contentType tells Compose which items share the same composition structure. Compose can then efficiently reuse a β€œpost” slot for other posts rather than rebuilding the composition tree from scratch, similar to view types in RecyclerView.

Coding Challenge: Compose Performance Audit

Build a composable wrapper that tracks and reports recomposition counts, stability violations, and rendering time for child composables.

// Challenge: Create a PerformanceAudit composable that:
// 1. Tracks recomposition count of its content
// 2. Measures composition time in milliseconds
// 3. Logs when recomposition count exceeds a threshold
// 4. Provides a report composable showing audit data
// 5. Can be disabled in release builds with zero overhead

Solution

class ComposePerformanceAuditor {
    private val metrics = ConcurrentHashMap<String, AuditMetrics>()

    data class AuditMetrics(
        var recompositionCount: Int = 0,
        var totalCompositionTimeNs: Long = 0,
        var lastCompositionTimeNs: Long = 0,
        var skippedCount: Int = 0 // needs external instrumentation (e.g. compiler metrics) to populate
    ) {
        val averageCompositionTimeMs: Double
            get() = if (recompositionCount > 0)
                (totalCompositionTimeNs / recompositionCount) / 1_000_000.0
            else 0.0

        val skipRate: Double
            get() {
                val total = recompositionCount + skippedCount
                return if (total > 0) skippedCount * 100.0 / total else 0.0
            }
    }

    fun recordComposition(tag: String, durationNs: Long) {
        val metric = metrics.getOrPut(tag) { AuditMetrics() }
        metric.recompositionCount++
        metric.totalCompositionTimeNs += durationNs
        metric.lastCompositionTimeNs = durationNs
    }

    fun getMetrics(tag: String): AuditMetrics? = metrics[tag]

    fun generateReport(): String = buildString {
        appendLine("=== Compose Performance Audit ===")
        metrics.entries.sortedByDescending { it.value.recompositionCount }
            .forEach { (tag, metrics) ->
                appendLine("$tag:")
                appendLine("  Recompositions: ${metrics.recompositionCount}")
                appendLine("  Avg composition: ${"%.3f".format(metrics.averageCompositionTimeMs)}ms")
                appendLine("  Skip rate: ${"%.1f".format(metrics.skipRate)}%")
            }
    }

    fun clear() = metrics.clear()

    companion object {
        val instance = ComposePerformanceAuditor()
    }
}

@Composable
inline fun AuditComposition(
    tag: String,
    threshold: Int = 50,
    enabled: Boolean = BuildConfig.DEBUG,
    content: @Composable () -> Unit
) {
    if (!enabled) {
        content()
        return
    }

    val auditor = remember { ComposePerformanceAuditor.instance }
    // Captured fresh each composition β€” remember would freeze the first timestamp
    val startTime = System.nanoTime()

    SideEffect {
        val duration = System.nanoTime() - startTime
        auditor.recordComposition(tag, duration)

        val metrics = auditor.getMetrics(tag)
        if (metrics != null && metrics.recompositionCount > threshold) {
            Log.w(
                "ComposeAudit",
                "⚠️ $tag recomposed ${metrics.recompositionCount} times " +
                    "(threshold: $threshold, avg: ${"%.3f".format(metrics.averageCompositionTimeMs)}ms)"
            )
        }
    }

    content()
}

@Composable
fun AuditReport(modifier: Modifier = Modifier) {
    val auditor = remember { ComposePerformanceAuditor.instance }
    val report = remember { mutableStateOf(auditor.generateReport()) }

    LaunchedEffect(Unit) {
        while (true) {
            delay(2000)
            report.value = auditor.generateReport()
        }
    }

    Text(
        text = report.value,
        modifier = modifier,
        fontFamily = FontFamily.Monospace,
        fontSize = 10.sp
    )
}

Module 6: Network Optimization

Network operations are the most variable performance factor in mobile apps. Unlike CPU and memory which you can control, network latency depends on the user’s connection β€” ranging from 2ms on fiber to 2000ms on a spotty 3G connection in a subway. Your job is to minimize the impact of this variability through caching, compression, connection reuse, and smart request strategies. Every millisecond saved on network operations is a millisecond users spend interacting with your app instead of staring at a loading spinner.

Lesson 6.1: OkHttp and Connection Optimization

OkHttp is the HTTP client underlying virtually every Android networking library β€” Retrofit uses it, Ktor can use it, and even the system’s HttpURLConnection delegates to it on modern Android. Understanding OkHttp’s connection management is the foundation of network optimization.

Connection reuse is OkHttp’s most impactful optimization. Creating a new TCP connection requires a three-way handshake (1 round trip), and HTTPS adds a TLS handshake (1-2 additional round trips). On a 200ms latency connection, that’s 400-600ms before a single byte of application data is transferred. OkHttp maintains a connection pool that keeps idle connections alive for reuse. Subsequent requests to the same host reuse the existing connection, eliminating handshake overhead entirely.

// Optimized OkHttp client configuration
val okHttpClient = OkHttpClient.Builder()
    // Connection pool β€” reuse connections
    .connectionPool(ConnectionPool(
        maxIdleConnections = 10,
        keepAliveDuration = 5,
        timeUnit = TimeUnit.MINUTES
    ))

    // Timeouts β€” fail fast on bad connections
    .connectTimeout(15, TimeUnit.SECONDS)
    .readTimeout(30, TimeUnit.SECONDS)
    .writeTimeout(15, TimeUnit.SECONDS)

    // Enable HTTP/2 for multiplexed connections
    .protocols(listOf(Protocol.HTTP_2, Protocol.HTTP_1_1))

    // Compress request bodies (OkHttp already decompresses gzip responses transparently)
    .addInterceptor(GzipRequestInterceptor())

    // Cache responses for offline and faster repeat loads
    .cache(Cache(
        directory = File(context.cacheDir, "http_cache"),
        maxSize = 50L * 1024 * 1024 // 50MB
    ))

    // Modern TLS; keep CLEARTEXT only if you must support plain-HTTP endpoints
    .connectionSpecs(listOf(
        ConnectionSpec.MODERN_TLS,
        ConnectionSpec.CLEARTEXT
    ))

    .build()

// Gzip request interceptor for compressing request bodies
class GzipRequestInterceptor : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
        val originalRequest = chain.request()
        val body = originalRequest.body ?: return chain.proceed(originalRequest)

        val compressedRequest = originalRequest.newBuilder()
            .header("Content-Encoding", "gzip")
            .method(originalRequest.method, gzip(body))
            .build()

        return chain.proceed(compressedRequest)
    }

    private fun gzip(body: RequestBody): RequestBody {
        return object : RequestBody() {
            override fun contentType() = body.contentType()
            override fun contentLength() = -1L

            override fun writeTo(sink: BufferedSink) {
                val gzipSink = GzipSink(sink).buffer()
                body.writeTo(gzipSink)
                gzipSink.close()
            }
        }
    }
}

HTTP/2 multiplexing allows multiple requests to share a single TCP connection simultaneously. Instead of opening separate connections for each request (HTTP/1.1) or waiting for one request to complete before sending the next, HTTP/2 interleaves multiple request/response streams on the same connection. This is particularly impactful on mobile where connection establishment is expensive.

DNS resolution adds another round trip at the start of each new connection. OkHttp provides a Dns interface that you can implement for custom DNS resolution, including DNS-over-HTTPS for privacy and pre-resolving hostnames for faster connections. For apps that connect to a known set of servers, pre-warming connections during startup eliminates connection latency entirely for the first request.

// Pre-warm connections during app startup
class NetworkWarmup(private val client: OkHttpClient) {

    fun warmup(hosts: List<String>) {
        hosts.forEach { host ->
            val request = Request.Builder()
                .url("https://$host")
                .head() // Minimal request β€” just establish connection
                .build()

            client.newCall(request).enqueue(object : Callback {
                override fun onFailure(call: Call, e: IOException) {
                    // Connection warmup failed β€” non-critical
                }

                override fun onResponse(call: Call, response: Response) {
                    response.close()
                }
            })
        }
    }
}

// Usage during app startup
class MyApplication : Application() {
    override fun onCreate() {
        super.onCreate()

        CoroutineScope(Dispatchers.IO).launch {
            NetworkWarmup(okHttpClient).warmup(
                listOf("api.example.com", "cdn.example.com")
            )
        }
    }
}
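
For the DNS side, OkHttp's Dns interface lets you plug in a custom resolver. The sketch below pre-resolves known hosts into a simple cache β€” illustrative only, since it ignores DNS TTLs; for DNS-over-HTTPS, use the okhttp-dnsoverhttps artifact instead:

// Hypothetical caching resolver β€” pre-resolve hosts you know you'll contact
class CachingDns : Dns {
    private val cache = ConcurrentHashMap<String, List<InetAddress>>()

    override fun lookup(hostname: String): List<InetAddress> =
        cache.getOrPut(hostname) { Dns.SYSTEM.lookup(hostname) }

    // Call from a background thread during startup, before the first request
    fun preResolve(hosts: List<String>) {
        hosts.forEach { host ->
            runCatching { cache[host] = Dns.SYSTEM.lookup(host) }
        }
    }
}

val dnsAwareClient = OkHttpClient.Builder()
    .dns(CachingDns())
    .build()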

Key takeaway: Connection reuse eliminates TCP and TLS handshake overhead β€” configure OkHttp’s connection pool appropriately. Enable HTTP/2 for request multiplexing. Pre-warm connections to critical servers during app startup to eliminate latency on the first user-facing request.

Lesson 6.2: Response Caching Strategies

Caching is the most effective network optimization because the fastest network request is the one you never make. OkHttp supports HTTP caching natively β€” responses with appropriate Cache-Control headers are stored on disk and returned without making a network request on subsequent loads.

The caching strategy depends on the data freshness requirements. Static resources (images, fonts, configuration) can be cached aggressively with long max-age values. Dynamic content (user profiles, feeds) benefits from stale-while-revalidate β€” serve the cached version immediately while fetching a fresh copy in the background. Real-time data (chat messages, live prices) should bypass the cache entirely.

// OkHttp cache with custom caching interceptor
val cacheInterceptor = Interceptor { chain ->
    val request = chain.request()
    val response = chain.proceed(request)

    // Add caching headers if server doesn't provide them
    val cacheControl = CacheControl.Builder()
        .maxAge(5, TimeUnit.MINUTES)
        .maxStale(1, TimeUnit.HOURS)
        .build()

    response.newBuilder()
        .removeHeader("Pragma")
        .removeHeader("Cache-Control")
        .header("Cache-Control", cacheControl.toString())
        .build()
}

// Force cache for offline support
val offlineCacheInterceptor = Interceptor { chain ->
    var request = chain.request()

    if (!isNetworkAvailable()) { // app-provided connectivity check
        val cacheControl = CacheControl.Builder()
            .maxStale(7, TimeUnit.DAYS)
            .onlyIfCached()
            .build()

        request = request.newBuilder()
            .cacheControl(cacheControl)
            .build()
    }

    chain.proceed(request)
}

val client = OkHttpClient.Builder()
    .addInterceptor(offlineCacheInterceptor)   // App-level
    .addNetworkInterceptor(cacheInterceptor)    // Network-level
    .cache(Cache(File(cacheDir, "http"), 50L * 1024 * 1024))
    .build()

For application-level caching, implement a repository pattern with a memory cache (fast, volatile) backed by a database cache (persistent, slightly slower) and the network source (freshest, slowest). This three-layer strategy serves data in milliseconds from memory, falls back to local database when the app restarts, and only hits the network when data is stale or missing.

class UserRepository(
    private val api: UserApi,
    private val database: UserDao,
    private val memoryCache: LruCache<String, User> = LruCache(50)
) {
    fun getUser(userId: String): Flow<Resource<User>> = flow { // cold flow β€” builder needs no suspend
        // Layer 1: Memory cache β€” instant
        memoryCache.get(userId)?.let { cached ->
            emit(Resource.Success(cached))
        }

        // Layer 2: Database cache β€” fast, survives process death
        val dbUser = database.getUserById(userId)
        if (dbUser != null) {
            memoryCache.put(userId, dbUser)
            emit(Resource.Success(dbUser))
        }

        // Layer 3: Network β€” freshest data
        try {
            val networkUser = api.getUser(userId)
            database.insertUser(networkUser)
            memoryCache.put(userId, networkUser)
            emit(Resource.Success(networkUser))
        } catch (e: Exception) {
            if (dbUser == null) {
                emit(Resource.Error(e))
            }
            // If we already emitted cached data, the error is non-fatal
        }
    }
}

sealed class Resource<out T> {
    data class Success<T>(val data: T) : Resource<T>()
    data class Error(val exception: Throwable) : Resource<Nothing>()
    data object Loading : Resource<Nothing>()
}

Key takeaway: The fastest network request is one you don’t make. Implement three-layer caching: in-memory LruCache for instant access, database for persistence across sessions, and network for freshness. Use OkHttp’s built-in cache for HTTP responses and stale-while-revalidate for showing cached data while refreshing.

Lesson 6.3: Request Optimization and Batching

Every network request has fixed overhead: DNS resolution, connection establishment, request/response headers, and serialization. For apps that make many small requests, this overhead dominates total network time. Request batching combines multiple logical requests into a single network call, amortizing the fixed overhead across all batched operations.

GraphQL is a natural fit for request batching β€” a single GraphQL query can fetch data that would require multiple REST endpoints. Even with REST APIs, you can batch requests by designing aggregate endpoints that return multiple resources in a single response, or by using HTTP/2 multiplexing to send multiple requests over a single connection without waiting for individual responses.

// Request batching with a custom batcher
class RequestBatcher<K, V>(
    private val scope: CoroutineScope,
    private val maxBatchSize: Int = 25,
    private val maxDelayMs: Long = 50,
    private val executeBatch: suspend (List<K>) -> Map<K, V>
) {
    private val pendingRequests = Channel<BatchEntry<K, V>>(Channel.UNLIMITED)

    data class BatchEntry<K, V>(
        val key: K,
        val deferred: CompletableDeferred<V>
    )

    init {
        scope.launch {
            val batch = mutableListOf<BatchEntry<K, V>>()

            while (true) {
                // Wait for first request
                val first = pendingRequests.receive()
                batch.add(first)

                // Collect more for up to maxDelayMs
                val deadline = System.currentTimeMillis() + maxDelayMs
                while (batch.size < maxBatchSize) {
                    val remaining = deadline - System.currentTimeMillis()
                    if (remaining <= 0) break

                    val next = withTimeoutOrNull(remaining) {
                        pendingRequests.receive()
                    } ?: break
                    batch.add(next)
                }

                // Execute batch
                try {
                    val results = executeBatch(batch.map { it.key })
                    batch.forEach { entry ->
                        val result = results[entry.key]
                        if (result != null) {
                            entry.deferred.complete(result)
                        } else {
                            entry.deferred.completeExceptionally(
                                NoSuchElementException("Key not found: ${entry.key}")
                            )
                        }
                    }
                } catch (e: Exception) {
                    batch.forEach { it.deferred.completeExceptionally(e) }
                }

                batch.clear()
            }
        }
    }

    suspend fun load(key: K): V {
        val deferred = CompletableDeferred<V>()
        pendingRequests.send(BatchEntry(key, deferred))
        return deferred.await()
    }
}

// Usage β€” batch user profile fetches
val userBatcher = RequestBatcher<String, User>(
    scope = viewModelScope,
    maxBatchSize = 20,
    maxDelayMs = 50
) { userIds ->
    api.getUsersBatch(userIds) // Single API call for all IDs
        .associateBy { it.id }
}

// Individual callers don't know about batching
suspend fun getUserProfile(userId: String): User {
    return userBatcher.load(userId)
}

Pagination is another form of request optimization. Instead of loading an entire dataset (which may be thousands of items), load a small page and fetch more as the user scrolls. Android’s Paging 3 library handles this with built-in support for different data sources, error handling, and retry logic.
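
A minimal Paging 3 setup might look like this β€” ItemApi and its page-based endpoint are hypothetical stand-ins for your backend:

// PagingSource loads one page at a time; Pager exposes the stream of pages
class ItemPagingSource(private val api: ItemApi) : PagingSource<Int, Item>() {

    override suspend fun load(params: LoadParams<Int>): LoadResult<Int, Item> {
        return try {
            val page = params.key ?: 0
            val items = api.getItems(page = page, size = params.loadSize)
            LoadResult.Page(
                data = items,
                prevKey = if (page == 0) null else page - 1,
                nextKey = if (items.isEmpty()) null else page + 1
            )
        } catch (e: Exception) {
            LoadResult.Error(e)
        }
    }

    override fun getRefreshKey(state: PagingState<Int, Item>): Int? =
        state.anchorPosition?.let { anchor ->
            state.closestPageToPosition(anchor)?.prevKey?.plus(1)
        }
}

// In the ViewModel β€” cachedIn keeps loaded pages across configuration changes
val pagedItems: Flow<PagingData<Item>> =
    Pager(PagingConfig(pageSize = 20)) { ItemPagingSource(api) }
        .flow
        .cachedIn(viewModelScope)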

Prefetching anticipates what data the user will need next and loads it before they request it. If the user is scrolling through a list, prefetch the next page before they reach the bottom. If they’re on a product list, prefetch details for the first few visible products. The risk is wasting bandwidth on data the user never views β€” balance prefetching aggressiveness with data cost.

// Smart prefetching based on scroll behavior
class PrefetchingViewModel(
    private val repository: ContentRepository
) : ViewModel() {

    private val _items = MutableStateFlow<List<ContentItem>>(emptyList())
    val items: StateFlow<List<ContentItem>> = _items.asStateFlow()

    private var currentPage = 0
    private var isPrefetching = false
    private var hasMorePages = true

    fun onVisibleItemsChanged(firstVisible: Int, lastVisible: Int) {
        val totalItems = _items.value.size
        val prefetchThreshold = totalItems - 10

        if (lastVisible >= prefetchThreshold && !isPrefetching && hasMorePages) {
            prefetchNextPage()
        }
    }

    private fun prefetchNextPage() {
        isPrefetching = true
        viewModelScope.launch {
            try {
                val nextPage = repository.getPage(++currentPage)
                if (nextPage.isEmpty()) {
                    hasMorePages = false
                } else {
                    _items.value = _items.value + nextPage
                }
            } catch (e: Exception) {
                currentPage--
            } finally {
                isPrefetching = false
            }
        }
    }
}

Key takeaway: Batch multiple small requests into single API calls to amortize connection overhead. Implement pagination to avoid loading unnecessary data. Prefetch anticipated content before the user requests it β€” but balance aggressiveness with bandwidth cost.

Lesson 6.4: Data Serialization Performance

Serialization β€” converting between objects and network formats (JSON, Protobuf, etc.) β€” can be a significant performance cost, especially for large API responses. The choice of serialization library and format directly impacts parsing speed, memory usage, and APK size.

Kotlin Serialization (kotlinx.serialization) generates serialization code at compile time, avoiding the reflection overhead of Moshi and Gson. For performance-critical paths, it’s the fastest JSON parser in the Kotlin ecosystem. Moshi with code generation (not reflection) is a close second. Gson uses reflection for everything and is the slowest option β€” it’s also the oldest and most likely to be in legacy codebases.

// kotlinx.serialization β€” fastest for Kotlin
@Serializable
data class ApiResponse(
    @SerialName("user_id") val userId: String,
    @SerialName("display_name") val displayName: String,
    val email: String,
    @SerialName("created_at") val createdAt: Long,
    val settings: UserSettings
)

@Serializable
data class UserSettings(
    val notifications: Boolean = true,
    val theme: String = "system",
    val language: String = "en"
)

// Configure with lenient parsing for resilient API consumption
val json = Json {
    ignoreUnknownKeys = true      // Don't crash on new API fields
    isLenient = true               // Accept slightly malformed JSON
    coerceInputValues = true       // Use defaults for null non-nullable fields
    encodeDefaults = false         // Don't serialize default values (smaller payload)
}

// Retrofit integration
val retrofit = Retrofit.Builder()
    .baseUrl(BASE_URL)
    .client(okHttpClient)
    .addConverterFactory(json.asConverterFactory("application/json".toMediaType()))
    .build()

Protocol Buffers (Protobuf) are even faster than JSON for both serialization and deserialization, and produce significantly smaller payloads. A typical Protobuf message is 3-10x smaller than its JSON equivalent because it uses a binary format with field numbers instead of field names. The downside is reduced human readability β€” you can’t inspect Protobuf payloads with a text editor.

For large responses, consider streaming parsers. Instead of loading the entire response into memory and then parsing it, a streaming parser processes the JSON token by token, emitting objects as they’re parsed. This reduces peak memory usage from O(response_size) to O(single_object_size), which is critical for responses containing thousands of items.

// Streaming JSON parsing with kotlinx.serialization's experimental
// decodeToSequence β€” elements are decoded one at a time from the stream
@OptIn(ExperimentalSerializationApi::class)
suspend fun parseLargeResponse(inputStream: InputStream): List<Item> {
    return withContext(Dispatchers.IO) {
        json.decodeToSequence<Item>(inputStream, DecodeSequenceMode.ARRAY_WRAPPED)
            .toList() // or process each element here to keep peak memory flat
    }
}

// Efficient response handling β€” don't read entire body into string
interface ApiService {
    @GET("feed")
    @Streaming // Prevents loading entire response into memory
    suspend fun getFeed(): ResponseBody
}

Key takeaway: Use kotlinx.serialization with code generation for the fastest JSON parsing in Kotlin. Consider Protocol Buffers for high-throughput APIs β€” they’re 3-10x smaller and faster than JSON. Stream large responses instead of loading them entirely into memory.

Lesson 6.5: Image Loading Optimization

Images typically account for 60-80% of an app’s network traffic. Optimizing image loading isn’t just about display performance β€” it’s about bandwidth, battery life, and user data costs. The right approach combines request-time optimization (loading the right size), format optimization (WebP, AVIF), and caching (avoid re-downloading).

Request the correct image size from the server. Loading a 2000x2000 pixel image to display in a 100x100dp thumbnail wastes 99% of the bandwidth. If your image server supports dynamic sizing (most CDNs do), pass the target dimensions as URL parameters. Coil and Glide both support this through custom ModelLoader implementations.

// Coil image loading with size optimization
@Composable
fun ProductImage(
    imageUrl: String,
    modifier: Modifier = Modifier
) {
    AsyncImage(
        model = ImageRequest.Builder(LocalContext.current)
            .data(imageUrl)
            .crossfade(300)
            // Coil infers the target size from the composable's measured
            // bounds β€” don't force Size.ORIGINAL for thumbnails
            .transformations(
                CircleCropTransformation()
            )
            .memoryCacheKey(imageUrl)
            .diskCacheKey(imageUrl)
            // Placeholder while loading
            .placeholder(R.drawable.placeholder_product)
            .error(R.drawable.error_product)
            // Reduce memory for non-transparent images
            .allowRgb565(true)
            .build(),
        contentDescription = "Product image",
        modifier = modifier,
        contentScale = ContentScale.Crop
    )
}

// Custom CDN URL builder that appends size parameters
class CdnImageUrlBuilder(private val baseUrl: String) {

    fun buildUrl(
        imageId: String,
        widthPx: Int,
        heightPx: Int,
        format: ImageFormat = ImageFormat.WEBP,
        quality: Int = 80
    ): String {
        return buildString {
            append(baseUrl)
            append("/images/")
            append(imageId)
            append("?w=$widthPx")
            append("&h=$heightPx")
            append("&fmt=${format.extension}")
            append("&q=$quality")
        }
    }

    enum class ImageFormat(val extension: String) {
        WEBP("webp"),
        AVIF("avif"),
        JPEG("jpg"),
        PNG("png")
    }
}

WebP images are 25-35% smaller than JPEG at equivalent quality. AVIF is even smaller β€” 50% reduction compared to JPEG β€” but encoding is slower and device support is more limited (API 31+). Use WebP as your default format and AVIF for newer devices when your CDN supports it. Always serve images in the most efficient format the client supports.

Preloading images that will be needed soon (scrolling ahead in a feed, next page of content) eliminates the loading delay when the user reaches them. But be judicious β€” preloading too aggressively wastes bandwidth on images the user may never see. A good heuristic is preloading the next screen’s worth of images.

// Preload images for the next page
class ImagePreloader(
    private val context: Context,
    private val imageLoader: ImageLoader
) {
    fun preloadImages(urls: List<String>, sizePx: Int = 200) {
        urls.forEach { url ->
            val request = ImageRequest.Builder(context)
                .data(url)
                .size(sizePx)
                .memoryCachePolicy(CachePolicy.ENABLED)
                .diskCachePolicy(CachePolicy.ENABLED)
                // Warms memory/disk caches so on-screen loads resolve instantly
                .build()

            imageLoader.enqueue(request)
        }
    }
}

Key takeaway: Request images at the exact size you need β€” never load full-resolution images for thumbnails. Use WebP for 25-35% bandwidth savings over JPEG. Preload upcoming images at low priority. Let Coil or Glide handle memory management, caching, and format selection.

Quiz: Network Optimization

What is the main benefit of HTTP/2 multiplexing?

  • ❌ Faster DNS resolution
  • ❌ Better compression
  • βœ… Multiple requests share a single TCP connection simultaneously
  • ❌ Reduced server load

HTTP/2 multiplexing allows multiple request/response streams to share a single TCP connection. This eliminates the need for multiple connections and avoids head-of-line blocking that HTTP/1.1 suffers from with pipelining.

What is the fastest JSON serialization approach in Kotlin?

  • ❌ Gson with reflection
  • ❌ Moshi with reflection
  • βœ… kotlinx.serialization with compile-time code generation
  • ❌ Manual string parsing

kotlinx.serialization generates serialization code at compile time, avoiding all reflection overhead. This makes it faster than Moshi (which also supports code generation but with slightly more overhead) and significantly faster than Gson (which uses reflection exclusively).

What is the ideal caching strategy for a social media feed?

  • ❌ No caching β€” always fetch fresh
  • ❌ Cache for 24 hours β€” never refetch
  • βœ… Show cached data immediately, refresh from network in background
  • ❌ Cache only images, never cache text

A feed should show cached data instantly for responsiveness, then update from the network in the background. This gives users immediate content to interact with while ensuring they eventually see the latest posts.

How much smaller are WebP images compared to JPEG?

  • ❌ 5-10% smaller
  • βœ… 25-35% smaller
  • ❌ 50-60% smaller
  • ❌ Same size, just different format

WebP typically achieves 25-35% smaller file sizes compared to JPEG at equivalent visual quality. This translates directly to bandwidth savings, faster loading, and reduced data costs for users.

Coding Challenge: Offline-First Repository

Build a repository that works offline-first: serves cached data immediately, syncs with the server when online, and handles conflict resolution.

// Challenge: Build an OfflineFirstRepository that:
// 1. Returns cached data immediately from Room database
// 2. Syncs with server when network is available
// 3. Queues writes when offline and syncs when back online
// 4. Handles sync conflicts with last-write-wins strategy
// 5. Reports sync status (synced, pending, conflict)

Solution

class OfflineFirstRepository<T : Syncable<T>>(
    private val localSource: LocalDataSource<T>,
    private val remoteSource: RemoteDataSource<T>,
    private val connectivityMonitor: ConnectivityMonitor,
    private val scope: CoroutineScope
) {
    private val pendingQueue = Channel<SyncOperation<T>>(Channel.UNLIMITED)

    sealed class SyncStatus {
        data object Synced : SyncStatus()
        data object Pending : SyncStatus()
        data class Conflict(val message: String) : SyncStatus()
        data class Error(val error: Throwable) : SyncStatus()
    }

    data class SyncOperation<T>(
        val type: OperationType,
        val item: T,
        val timestamp: Long = System.currentTimeMillis()
    )

    enum class OperationType { CREATE, UPDATE, DELETE }

    init {
        scope.launch { processPendingQueue() }
        scope.launch { observeConnectivity() }
    }

    fun observe(id: String): Flow<Pair<T?, SyncStatus>> {
        return localSource.observe(id).map { local ->
            val status = when {
                local == null -> SyncStatus.Pending
                local.lastSyncedAt >= local.lastModifiedAt -> SyncStatus.Synced
                else -> SyncStatus.Pending
            }
            local to status
        }
    }

    suspend fun get(id: String): T? {
        // Always return local first
        val local = localSource.get(id)

        // Try to refresh from remote
        if (connectivityMonitor.isOnline()) {
            try {
                val remote = remoteSource.get(id)
                if (remote != null) {
                    val resolved = resolveConflict(local, remote)
                    localSource.save(resolved)
                    return resolved
                }
            } catch (e: Exception) {
                // Network failed β€” return local data
            }
        }

        return local
    }

    suspend fun save(item: T) {
        val updated = item.withModifiedAt(System.currentTimeMillis())
        localSource.save(updated)

        if (connectivityMonitor.isOnline()) {
            try {
                val synced = remoteSource.save(updated)
                localSource.save(synced.withSyncedAt(System.currentTimeMillis()))
            } catch (e: Exception) {
                pendingQueue.send(SyncOperation(OperationType.UPDATE, updated))
            }
        } else {
            pendingQueue.send(SyncOperation(OperationType.UPDATE, updated))
        }
    }

    private fun resolveConflict(local: T?, remote: T): T {
        if (local == null) return remote
        // Last-write-wins strategy
        return if (local.lastModifiedAt > remote.lastModifiedAt) local else remote
    }

    private suspend fun processPendingQueue() {
        for (operation in pendingQueue) {
            while (!connectivityMonitor.isOnline()) {
                delay(5000) // Wait for connectivity
            }

            try {
                when (operation.type) {
                    OperationType.CREATE, OperationType.UPDATE -> {
                        val synced = remoteSource.save(operation.item)
                        localSource.save(
                            synced.withSyncedAt(System.currentTimeMillis())
                        )
                    }
                    OperationType.DELETE -> {
                        remoteSource.delete(operation.item.id)
                        localSource.delete(operation.item.id)
                    }
                }
            } catch (e: Exception) {
                delay(10000) // Retry after delay
                pendingQueue.send(operation) // Re-queue
            }
        }
    }

    private suspend fun observeConnectivity() {
        connectivityMonitor.observeConnectivity().collect { isOnline ->
            if (isOnline) {
                syncAll()
            }
        }
    }

    private suspend fun syncAll() {
        val unsyncedItems = localSource.getUnsynced()
        unsyncedItems.forEach { item ->
            pendingQueue.send(SyncOperation(OperationType.UPDATE, item))
        }
    }
}

interface Syncable {
    val id: String
    val lastModifiedAt: Long
    val lastSyncedAt: Long
    fun withModifiedAt(timestamp: Long): Syncable
    fun withSyncedAt(timestamp: Long): Syncable
}

Module 7: Threading and Coroutines Performance

Threading is where performance meets correctness. The wrong threading model causes either jank (too much work on the main thread) or crashes (accessing UI from a background thread). Kotlin coroutines provide structured concurrency that makes correct threading easier, but they introduce their own performance characteristics that you need to understand. This module teaches you to use dispatchers, scopes, and concurrency patterns for optimal performance.

Lesson 7.1: Dispatcher Selection and Performance

Coroutine dispatchers determine which thread pool executes your code. Choosing the wrong dispatcher is a common performance mistake β€” CPU-bound work on Dispatchers.IO wastes threads, and I/O-bound work on Dispatchers.Default starves CPU-bound work. Understanding what each dispatcher is optimized for is essential.

Dispatchers.Main executes on the Android main thread. Use it for UI updates and short operations that must run on the main thread. Dispatchers.Main.immediate is an optimization β€” if you’re already on the main thread, it executes the coroutine immediately without redispatching, avoiding the overhead of posting to the main thread’s message queue.

Dispatchers.Default uses a shared thread pool sized to the number of CPU cores. It’s optimized for CPU-intensive work β€” sorting, parsing, computing. The limited thread count prevents CPU oversubscription, where more threads than cores compete for CPU time, wasting cycles on context switching.

Dispatchers.IO uses a much larger thread pool (default 64 threads) designed for blocking I/O operations β€” network calls, disk reads, database queries. These operations spend most of their time waiting, so having many threads is efficient. But running CPU-intensive work on Dispatchers.IO wastes threads β€” your CPU-bound computation occupies a thread while 63 other I/O threads sit idle.

class DataProcessor(
    private val api: ApiService,
    private val database: AppDatabase,
    private val parser: DataParser
) {
    // βœ… Correct dispatcher selection
    suspend fun processData(): Result<ProcessedData> {
        return withContext(Dispatchers.IO) {
            // I/O-bound: network call
            val rawData = api.fetchData()

            // Switch to Default for CPU-bound parsing
            val parsed = withContext(Dispatchers.Default) {
                parser.parse(rawData) // CPU-intensive
            }

            // Back to IO for database write
            database.insertParsed(parsed)

            Result.success(parsed)
        }
    }

    // βœ… Main.immediate for UI updates without redispatch
    suspend fun updateUiState(state: UiState) {
        withContext(Dispatchers.Main.immediate) {
            // If already on main thread, executes immediately
            // If on background thread, dispatches to main
            _uiState.value = state
        }
    }
}

For custom threading needs, create a confined dispatcher using newSingleThreadContext() for operations that must be serialized (single-threaded access to a non-thread-safe resource) or newFixedThreadPoolContext() for bounded parallelism. Both create dedicated threads that you must close() when finished, or the threads leak. But prefer the standard dispatchers unless you have a specific reason β€” they’re shared across your entire app, reducing total thread count and context-switching overhead.

// Custom dispatcher for single-threaded database access.
// Owns a dedicated thread β€” call close() when it's no longer needed.
private val databaseDispatcher = newSingleThreadContext("DatabaseThread")

suspend fun performDatabaseOperation() {
    withContext(databaseDispatcher) {
        // All database operations serialized on a single thread
        // No concurrent access β€” no need for synchronization
        database.beginTransaction()
        try {
            database.insertItems(items)
            database.updateTimestamp(System.currentTimeMillis())
            database.setTransactionSuccessful()
        } finally {
            database.endTransaction()
        }
    }
}

// limitedParallelism for controlling concurrency
val limitedIo = Dispatchers.IO.limitedParallelism(4)

suspend fun processFiles(files: List<File>) {
    coroutineScope {
        files.map { file ->
            async(limitedIo) {
                // At most 4 files processed concurrently
                processFile(file)
            }
        }.awaitAll()
    }
}

Key takeaway: Use Dispatchers.Default for CPU-bound work (sized to CPU cores), Dispatchers.IO for blocking I/O (sized to 64 threads), and Dispatchers.Main.immediate for UI updates (avoids unnecessary redispatch). Use limitedParallelism() to control concurrency within a dispatcher.

Lesson 7.2: Structured Concurrency and Cancellation

Structured concurrency ensures that coroutines have a defined lifecycle β€” they start, run, and finish within a scope. When the scope is cancelled (Activity destroyed, ViewModel cleared, Fragment detached), all coroutines in that scope are cancelled automatically. This prevents leaked coroutines that continue running after their result is no longer needed, wasting CPU, memory, and battery.

The performance impact of proper cancellation is significant. Without it, navigating away from a screen while a network request is in progress means the coroutine continues executing, parsing the response, and updating state that no one will ever see. Multiply this by rapid navigation (user quickly browsing through screens) and you have dozens of zombie coroutines consuming resources.

// βœ… Structured concurrency β€” cancelled automatically
class SearchViewModel : ViewModel() {

    private val _results = MutableStateFlow<List<SearchResult>>(emptyList())
    val results: StateFlow<List<SearchResult>> = _results.asStateFlow()

    private var searchJob: Job? = null

    fun search(query: String) {
        // Cancel previous search β€” only the latest matters
        searchJob?.cancel()

        searchJob = viewModelScope.launch {
            delay(300) // Debounce

            try {
                val results = withContext(Dispatchers.IO) {
                    searchRepository.search(query)
                }
                _results.value = results
            } catch (e: CancellationException) {
                // Expected when search is cancelled β€” don't log as error
                throw e
            } catch (e: Exception) {
                _results.value = emptyList()
            }
        }
    }

    // viewModelScope is cancelled in onCleared() automatically
}

// ❌ Unstructured concurrency β€” leaked coroutine
class LeakyViewModel : ViewModel() {
    fun fetchData() {
        // This coroutine is NOT cancelled when ViewModel is cleared
        GlobalScope.launch {
            val data = api.fetchData() // Continues after ViewModel death
            _state.value = data // Updates state no one observes
        }
    }
}

Making your suspend functions cooperative with cancellation is important for performance. A long-running CPU-bound computation doesn’t check for cancellation by default. If the coroutine is cancelled, the computation continues until it completes β€” wasting CPU time. Use ensureActive() or yield() at regular intervals in CPU-bound loops to check for cancellation.

// Cancellation-cooperative CPU-bound work
suspend fun processLargeDataset(
    items: List<RawItem>
): List<ProcessedItem> = withContext(Dispatchers.Default) {
    items.mapIndexed { index, item ->
        // Check for cancellation every 100 items
        if (index % 100 == 0) {
            ensureActive() // Throws CancellationException if cancelled
        }

        // CPU-intensive processing
        processItem(item)
    }
}

// yield() is similar but also gives other coroutines a chance to run
suspend fun compressFiles(files: List<File>) = withContext(Dispatchers.Default) {
    files.forEach { file ->
        yield() // Check cancellation AND let other coroutines run
        compressFile(file)
    }
}

SupervisorJob prevents one child’s failure from cancelling siblings. In a scope with a regular Job, if any child coroutine fails with an exception, all other children are cancelled. With SupervisorJob, each child is independent β€” one can fail while others continue. Use SupervisorJob when child coroutines are independent operations (loading different sections of a screen) and a regular Job when children are part of a single logical operation (all steps must succeed).
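
A minimal sketch of the difference, assuming hypothetical feedSection/notificationsSection loaders and a ScreenState holder:

// Hypothetical section loaders and ScreenState holder, for illustration
data class ScreenState(val feed: Feed?, val notifications: List<Notification>?)

suspend fun loadScreenSections(): ScreenState = supervisorScope {
    val feed = async { feedSection.load() }
    val notifications = async { notificationsSection.load() }

    // In a supervisorScope a failed child does not cancel its siblings;
    // await() still rethrows, so catch per section and degrade gracefully.
    // Inside coroutineScope, one failure would cancel both sections.
    // (A production version should rethrow CancellationException.)
    ScreenState(
        feed = runCatching { feed.await() }.getOrNull(),
        notifications = runCatching { notifications.await() }.getOrNull()
    )
}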

Key takeaway: Always use structured concurrency (viewModelScope, lifecycleScope) β€” never GlobalScope. Cancel previous work when starting new work (search debouncing). Make CPU-bound loops cancellation-cooperative with ensureActive(). Use SupervisorJob when child coroutine failures should be independent.

Lesson 7.3: Flow Performance Patterns

Kotlin Flow is a cold stream that executes lazily β€” the producer doesn’t run until someone collects the flow. This is fundamentally different from hot streams (StateFlow, SharedFlow) which emit regardless of collectors. Understanding when to use each type and how to optimize flow collection is critical for performance.

StateFlow vs SharedFlow vs cold Flow β€” use StateFlow for representing current state (always has a value, replays the latest value to new collectors). Use SharedFlow for events (no initial value, configurable replay). Use cold Flow for one-shot data loading or database observation where each collector should trigger its own execution.

// βœ… StateFlow for UI state β€” always has a value
class ProfileViewModel(
    private val userRepository: UserRepository
) : ViewModel() {

    private val _uiState = MutableStateFlow(ProfileUiState())
    val uiState: StateFlow<ProfileUiState> = _uiState.asStateFlow()

    init {
        viewModelScope.launch {
            userRepository.observeUser()
                .map { user -> ProfileUiState(user = user, isLoading = false) }
                .catch { e -> emit(ProfileUiState(error = e.message)) }
                .collect { state -> _uiState.value = state }
        }
    }
}

// βœ… SharedFlow for one-time events β€” no replay by default
class NavigationViewModel : ViewModel() {

    // Note: with replay = 0, events emitted while no collector is active are dropped
    private val _events = MutableSharedFlow<NavigationEvent>()
    val events: SharedFlow<NavigationEvent> = _events.asSharedFlow()

    fun navigateTo(destination: String) {
        viewModelScope.launch {
            _events.emit(NavigationEvent.Navigate(destination))
        }
    }
}
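
Cold Flow, for contrast: a minimal sketch assuming a hypothetical articleDao. Nothing executes until a collector arrives, and each collector triggers its own execution.

// βœ… Cold Flow β€” the block runs once per collector, only when collected
fun observeArticles(): Flow<List<Article>> = flow {
    emit(articleDao.loadAll()) // hypothetical one-shot DAO read
}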

collectAsStateWithLifecycle() is the correct way to collect flows in Compose. It’s lifecycle-aware β€” it stops collection when the composable is not visible (paused, stopped) and resumes when it’s visible again. This prevents unnecessary work when the app is in the background.

// βœ… Lifecycle-aware collection in Compose
@Composable
fun ProfileScreen(viewModel: ProfileViewModel = viewModel()) {
    val uiState by viewModel.uiState.collectAsStateWithLifecycle()

    when {
        uiState.isLoading -> LoadingIndicator()
        uiState.error != null -> ErrorMessage(uiState.error!!)
        uiState.user != null -> ProfileContent(uiState.user!!)
    }
}

// βœ… stateIn for converting cold Flow to StateFlow with sharing
class SettingsViewModel(
    settingsRepository: SettingsRepository
) : ViewModel() {

    val settings: StateFlow<Settings> = settingsRepository
        .observeSettings()
        .stateIn(
            scope = viewModelScope,
            started = SharingStarted.WhileSubscribed(5000),
            initialValue = Settings.DEFAULT
        )
    // WhileSubscribed(5000) β€” keeps the flow active for 5 seconds
    // after the last subscriber disconnects. Survives configuration
    // changes without restarting the flow.
}

Flow operators have performance implications. map, filter, and take are lightweight. flatMapLatest cancels the previous inner flow when a new value arrives β€” perfect for search-as-you-type. conflate drops intermediate values when the collector is slow β€” ideal for high-frequency updates where only the latest value matters (sensor data, animation values). buffer decouples the producer and collector, allowing the producer to emit ahead of consumption.

// Flow performance patterns
class SensorViewModel(
    private val sensorRepository: SensorRepository
) : ViewModel() {

    // conflate β€” drop intermediate values, only process latest
    val sensorData: StateFlow<SensorData> = sensorRepository
        .observeSensor()
        .conflate() // Skip values collector can't keep up with
        .map { raw -> processSensorData(raw) }
        .stateIn(viewModelScope, SharingStarted.WhileSubscribed(5000), SensorData.EMPTY)

    // flatMapLatest β€” cancel previous work on new input
    private val _searchQuery = MutableStateFlow("")

    val searchResults: StateFlow<List<Result>> = _searchQuery
        .debounce(300)
        .distinctUntilChanged()
        .flatMapLatest { query ->
            if (query.isBlank()) flowOf(emptyList())
            else searchRepository.search(query)
        }
        .stateIn(viewModelScope, SharingStarted.WhileSubscribed(5000), emptyList())
}
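
buffer isn't shown above, so here's a minimal sketch, assuming a hypothetical parseRecord(): the producer runs ahead of a busy collector instead of suspending between emissions.

// buffer: decouple producer from collector (up to 64 queued items here)
fun parsedRecords(lines: Flow<String>): Flow<Record> =
    lines
        .buffer(capacity = 64)
        .map { line -> parseRecord(line) } // hypothetical CPU-heavy parse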

Key takeaway: Use StateFlow for UI state, SharedFlow for events, and cold Flow for one-shot operations. Collect with collectAsStateWithLifecycle() in Compose. Use conflate for high-frequency data, flatMapLatest for search-like patterns, and WhileSubscribed(5000) for surviving configuration changes.

Lesson 7.4: Parallel Decomposition

When you have multiple independent operations, running them sequentially wastes time. Parallel decomposition splits work into concurrent tasks that run simultaneously, reducing total execution time from the sum of all tasks to the duration of the longest single task.

async/await is the primary tool for parallel decomposition. Launch multiple async blocks within a coroutineScope and awaitAll() to wait for all results. The coroutineScope ensures structured concurrency β€” if any async block fails, all siblings are cancelled.

// ❌ Sequential β€” total time = sum of all operations
suspend fun loadDashboard(): DashboardData {
    val user = userRepository.getUser()             // 200ms
    val notifications = notifRepository.getAll()    // 150ms
    val feed = feedRepository.getLatest()           // 300ms
    val recommendations = recRepository.get()       // 250ms

    // Total: 900ms
    return DashboardData(user, notifications, feed, recommendations)
}

// βœ… Parallel β€” total time = longest single operation
suspend fun loadDashboard(): DashboardData = coroutineScope {
    val user = async { userRepository.getUser() }
    val notifications = async { notifRepository.getAll() }
    val feed = async { feedRepository.getLatest() }
    val recommendations = async { recRepository.get() }

    // Total: 300ms (longest single operation)
    DashboardData(
        user = user.await(),
        notifications = notifications.await(),
        feed = feed.await(),
        recommendations = recommendations.await()
    )
}

// Parallel with partial results β€” show data as it arrives.
// flow {} can't emit from child coroutines (it violates the flow
// invariant), so use channelFlow and send() from each worker.
fun loadDashboardProgressive(): Flow<DashboardData> = channelFlow {
    // Atomic holder so concurrent workers don't race on the snapshot
    val state = MutableStateFlow(DashboardData.EMPTY)
    send(state.value)

    // Launch all requests; each sends a fresh snapshot when its part is ready
    launch {
        val user = userRepository.getUser()
        send(state.updateAndGet { it.copy(user = user) })
    }
    launch {
        val feed = feedRepository.getLatest()
        send(state.updateAndGet { it.copy(feed = feed) })
    }
    launch {
        val notifs = notifRepository.getAll()
        send(state.updateAndGet { it.copy(notifications = notifs) })
    }
}

For processing large collections in parallel, use chunked to divide the work and async to process chunks concurrently. Control parallelism with Semaphore to avoid overwhelming resources β€” launching 10,000 concurrent network requests is worse than sequential because it floods the connection pool.

// Bounded parallel processing with Semaphore
suspend fun processItems(
    items: List<Item>,
    maxConcurrency: Int = 8
): List<ProcessedItem> {
    val semaphore = Semaphore(maxConcurrency)

    return coroutineScope {
        items.map { item ->
            async(Dispatchers.Default) {
                semaphore.withPermit {
                    processItem(item)
                }
            }
        }.awaitAll()
    }
}

// Fan-out / fan-in pattern with channels
suspend fun processWithChannels(
    items: List<Item>,
    workerCount: Int = 4
): List<ProcessedItem> = coroutineScope {
    val inputChannel = Channel<Item>(Channel.UNLIMITED)
    val outputChannel = Channel<ProcessedItem>(Channel.UNLIMITED)

    // Fan-out: multiple workers consume from input channel
    repeat(workerCount) {
        launch(Dispatchers.Default) {
            for (item in inputChannel) {
                outputChannel.send(processItem(item))
            }
        }
    }

    // Send all items
    launch {
        items.forEach { inputChannel.send(it) }
        inputChannel.close()
    }

    // Fan-in: collect all results
    val results = mutableListOf<ProcessedItem>()
    repeat(items.size) {
        results.add(outputChannel.receive())
    }

    results
}

Key takeaway: Use async/await within coroutineScope for parallel independent operations β€” total time equals the longest task, not the sum. Limit parallelism with Semaphore to avoid resource exhaustion. Use channels for producer-consumer patterns with controlled fan-out.

Lesson 7.5: Main Thread Safety

The main thread (UI thread) must never be blocked. Any operation that takes more than a few milliseconds should run on a background dispatcher. But identifying what counts as β€œblocking” isn’t always obvious β€” disk I/O, SharedPreferences reads, database queries, and even some collection operations can block the main thread unexpectedly.

StrictMode is your development-time safety net for main thread violations. Enable it in debug builds to detect disk reads, disk writes, network operations, and slow calls on the main thread. When a violation occurs, StrictMode can log a warning, flash the screen red, or crash the app β€” making violations impossible to miss.

// Comprehensive StrictMode for development
class DebugApp : Application() {
    override fun onCreate() {
        super.onCreate()

        if (BuildConfig.DEBUG) {
            StrictMode.setThreadPolicy(
                StrictMode.ThreadPolicy.Builder()
                    .detectAll()
                    .penaltyLog()
                    .penaltyFlashScreen()
                    .build()
            )

            StrictMode.setVmPolicy(
                StrictMode.VmPolicy.Builder()
                    .detectAll()
                    .penaltyLog()
                    .build()
            )
        }
    }
}

// Common main thread violations and fixes

// ❌ SharedPreferences.edit().commit() blocks main thread
fun savePreference(key: String, value: String) {
    prefs.edit().putString(key, value).commit() // Blocks!
}

// βœ… Use apply() for async writes
fun savePreference(key: String, value: String) {
    prefs.edit().putString(key, value).apply() // Non-blocking
}

// ❌ Reading SharedPreferences on main thread can block
// (initial load reads from disk)
val username = prefs.getString("username", "") // May block!

// βœ… Read on background thread
suspend fun getUsername(): String = withContext(Dispatchers.IO) {
    prefs.getString("username", "") ?: ""
}

// βœ… Or use DataStore which is async by default
val usernameFlow: Flow<String> = dataStore.data
    .map { preferences -> preferences[USERNAME_KEY] ?: "" }

DataStore is the modern replacement for SharedPreferences. Unlike SharedPreferences, DataStore is built on coroutines and Flow, making all operations asynchronous by default. It’s impossible to accidentally block the main thread with DataStore because the API simply doesn’t offer synchronous methods. Use Preferences DataStore for simple key-value pairs and Proto DataStore for typed, structured data.

// DataStore β€” async by default, impossible to block main thread
val Context.dataStore by preferencesDataStore(name = "settings")

class SettingsRepository(
    private val dataStore: DataStore<Preferences>
) {
    private val THEME_KEY = stringPreferencesKey("theme")
    private val NOTIFICATIONS_KEY = booleanPreferencesKey("notifications")

    val theme: Flow<String> = dataStore.data
        .map { preferences -> preferences[THEME_KEY] ?: "system" }
        .distinctUntilChanged()

    suspend fun setTheme(theme: String) {
        dataStore.edit { preferences ->
            preferences[THEME_KEY] = theme
        }
    }

    suspend fun setNotifications(enabled: Boolean) {
        dataStore.edit { preferences ->
            preferences[NOTIFICATIONS_KEY] = enabled
        }
    }
}

Key takeaway: Never block the main thread β€” use StrictMode in debug builds to catch violations. Replace SharedPreferences.commit() with apply(), or migrate to DataStore which is async by default. Any operation taking more than 1-2ms should run on a background dispatcher.

Quiz: Threading and Coroutines

What is the thread pool size of Dispatchers.Default?

  • ❌ 64 threads
  • ❌ 1 thread
  • βœ… Equal to the number of CPU cores
  • ❌ Unlimited

Dispatchers.Default is sized to match the number of CPU cores to prevent oversubscription. Having more threads than cores for CPU-bound work wastes cycles on context switching without improving throughput.

What does Dispatchers.Main.immediate do differently?

  • ❌ It runs on a higher priority thread
  • βœ… If already on the main thread, it executes immediately without redispatching
  • ❌ It skips the Choreographer
  • ❌ It uses a separate main thread

Dispatchers.Main.immediate avoids the overhead of posting to the message queue when you’re already on the main thread. Regular Dispatchers.Main always posts to the queue, adding latency even when the execution could happen immediately.

When should you use SupervisorJob?

  • ❌ When all child coroutines must succeed together
  • ❌ When you need faster execution
  • βœ… When child coroutine failures should be independent of each other
  • ❌ When you need to run on the main thread

SupervisorJob prevents one child’s failure from cancelling its siblings. Use it when loading independent sections of a screen β€” if the notifications section fails, the feed section should still work.

What does conflate do in a Flow?

  • ❌ Combines multiple flows into one
  • ❌ Buffers all values
  • βœ… Drops intermediate values when the collector is slower than the emitter
  • ❌ Adds delay between emissions

conflate tells the flow that if the collector can’t keep up with the emitter, intermediate values should be dropped and only the latest value delivered. This is ideal for high-frequency sensor data or animation values where only the current value matters.

Coding Challenge: Rate-Limited Parallel Processor

Build a coroutine-based parallel processor with configurable concurrency limits, retry logic, and progress reporting.

// Challenge: Build a ParallelProcessor that:
// 1. Processes items with configurable max concurrency
// 2. Retries failed items with exponential backoff
// 3. Reports progress (completed, failed, remaining)
// 4. Supports cancellation of remaining work
// 5. Returns results preserving original order

Solution

class ParallelProcessor<T, R>(
    private val maxConcurrency: Int = 4,
    private val maxRetries: Int = 3,
    private val initialRetryDelayMs: Long = 1000,
    private val process: suspend (T) -> R
) {
    data class Progress(
        val total: Int,
        val completed: Int,
        val failed: Int,
        val remaining: Int
    ) {
        val percentage: Int get() = if (total > 0) completed * 100 / total else 0
    }

    sealed class ItemResult<out R> {
        data class Success<R>(val value: R) : ItemResult<R>()
        data class Failure(val error: Throwable, val attempts: Int) : ItemResult<Nothing>()
    }

    suspend fun processAll(
        items: List<T>,
        onProgress: (Progress) -> Unit = {}
    ): List<ItemResult<R>> = coroutineScope {
        val semaphore = Semaphore(maxConcurrency)
        val completed = AtomicInteger(0)
        val failed = AtomicInteger(0)
        val total = items.size

        fun reportProgress() {
            val c = completed.get()
            val f = failed.get()
            onProgress(Progress(total, c, f, total - c - f))
        }

        items.map { item ->
            async(Dispatchers.Default) {
                semaphore.withPermit {
                    var lastError: Throwable? = null

                    repeat(maxRetries) { attempt ->
                        try {
                            val result = process(item)
                            completed.incrementAndGet()
                            reportProgress()
                            return@async ItemResult.Success(result)
                        } catch (e: CancellationException) {
                            throw e
                        } catch (e: Throwable) {
                            lastError = e
                            if (attempt < maxRetries - 1) {
                                val delayMs = initialRetryDelayMs * (1L shl attempt)
                                delay(delayMs.coerceAtMost(30_000))
                            }
                        }
                    }

                    failed.incrementAndGet()
                    reportProgress()
                    ItemResult.Failure(lastError!!, maxRetries)
                }
            }
        }.awaitAll()
    }
}

// Usage
suspend fun uploadPhotos(photos: List<Photo>) {
    val processor = ParallelProcessor<Photo, UploadResult>(
        maxConcurrency = 3,
        maxRetries = 3,
        process = { photo -> photoApi.upload(photo) }
    )

    val results = processor.processAll(photos) { progress ->
        Log.d("Upload", "${progress.percentage}% complete " +
            "(${progress.completed}/${progress.total})")
    }

    val successes = results.filterIsInstance<ParallelProcessor.ItemResult.Success<UploadResult>>()
    val failures = results.filterIsInstance<ParallelProcessor.ItemResult.Failure>()

    Log.d("Upload", "Done: ${successes.size} succeeded, ${failures.size} failed")
}

Module 8: APK Size and R8 Optimization

APK size directly impacts user acquisition, update frequency, and storage pressure. Google’s data shows that every 6MB increase reduces install conversion by approximately 1%. In emerging markets where storage and bandwidth are constrained, the impact is even larger. This module covers every technique for shrinking your APK β€” from R8 code optimization to resource shrinking, dynamic delivery, and Android App Bundles.

Lesson 8.1: Understanding APK Composition

Before optimizing APK size, you need to know where the bytes are. The APK Analyzer in Android Studio (Build β†’ Analyze APK) breaks down your APK into its constituent parts: DEX files (compiled code), resources (layouts, drawables, strings), native libraries (.so files), and assets. Typically, native libraries and images are the largest contributors, followed by DEX code.

The APK Analyzer shows raw size (bytes in the APK, compressed) and download size (estimated bytes transferred during install). These differ because the Play Store applies additional compression. Focus on download size for user-facing impact, but raw size for storage impact on the device.

// Analyzing APK size with command-line tools
// apkanalyzer -h  # Android SDK tool
// apkanalyzer apk file-size app-release.apk
// apkanalyzer apk download-size app-release.apk
// apkanalyzer dex packages app-release.apk --defined-only

// Use build config to track APK size in CI
// build.gradle.kts
android {
    applicationVariants.all {
        outputs.all {
            val output = this as com.android.build.gradle.internal.api.BaseVariantOutputImpl
            // Custom APK name with version for tracking
            output.outputFileName = "app-${versionName}-${buildType.name}.apk"
        }
    }
}

// CI script to fail build if APK exceeds size budget
// #!/bin/bash
// MAX_SIZE_MB=20
// # stat -f%z is macOS syntax; use stat -c%s on Linux
// APK_SIZE=$(stat -f%z app/build/outputs/apk/release/app-release.apk)
// APK_SIZE_MB=$((APK_SIZE / 1024 / 1024))
// if [ "$APK_SIZE_MB" -gt "$MAX_SIZE_MB" ]; then
//     echo "APK size ${APK_SIZE_MB}MB exceeds budget ${MAX_SIZE_MB}MB"
//     exit 1
// fi

DEX files contain your compiled Kotlin/Java code and all library code. A single DEX file is limited to 65,536 method references, and modern apps easily exceed that, requiring multiple DEX files (multidex). Each method reference, field reference, and string constant contributes to DEX size. Unused library code is the largest source of DEX bloat β€” you might import a library for one function but include all 10,000 of its methods in your APK. If you still support devices below API 21, multidex needs explicit configuration, as sketched below.
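
A sketch of that configuration, only needed when minSdk is below 21 (ART on API 21+ loads multiple DEX files natively):

// build.gradle.kts β€” explicit multidex for minSdk < 21
android {
    defaultConfig {
        multiDexEnabled = true
    }
}

dependencies {
    implementation("androidx.multidex:multidex:2.0.1")
}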

Resources are the second major contributor. High-resolution images in drawable-xxxhdpi, unused layouts from library dependencies, multiple language translations you don’t need β€” all contribute to APK size. Android Studio’s Lint inspection β€œUnused resources” identifies resources that aren’t referenced from code, but it can’t detect resources accessed dynamically through getIdentifier().

Key takeaway: Use APK Analyzer to understand where bytes are spent. Native libraries and images typically dominate APK size. Set an APK size budget and enforce it in CI. Track size trends across releases to catch regressions early.

Lesson 8.2: R8 Code Shrinking and Optimization

R8 is Android’s code shrinker, optimizer, and obfuscator. It’s the successor to ProGuard and is integrated into the Android Gradle Plugin. When enabled, R8 performs three critical optimizations: shrinking (removing unused code), optimization (simplifying and inlining code), and obfuscation (renaming classes and methods to shorter names). Together, these can reduce DEX size by 30-60%.

Shrinking is the most impactful optimization. R8 performs whole-program analysis starting from your app’s entry points (Activities, Services, BroadcastReceivers declared in the manifest) and traces all reachable code. Anything not reachable is removed. This eliminates unused library code, unused methods, and even unused fields from classes you do use.

// build.gradle.kts β€” enable R8
android {
    buildTypes {
        release {
            isMinifyEnabled = true      // Enable R8 shrinking
            isShrinkResources = true    // Enable resource shrinking
            proguardFiles(
                getDefaultProguardFile("proguard-android-optimize.txt"),
                "proguard-rules.pro"
            )
        }
    }
}

// proguard-rules.pro β€” essential keep rules

# Keep entry points
-keep class * extends android.app.Activity
-keep class * extends android.app.Service
-keep class * extends android.content.BroadcastReceiver
-keep class * extends android.content.ContentProvider

# Keep Kotlin serialization
-keepattributes *Annotation*, InnerClasses
-dontnote kotlinx.serialization.AnnotationsKt

-keepclassmembers @kotlinx.serialization.Serializable class ** {
    *** Companion;
    *** INSTANCE;
    kotlinx.serialization.KSerializer serializer(...);
}

# Keep Retrofit interfaces
-keep,allowobfuscation interface * {
    @retrofit2.http.* <methods>;
}

# Keep data classes used with Moshi/Gson reflection
# (not needed with kotlinx.serialization or Moshi code-gen)
-keep class com.example.app.data.model.** { *; }

# Keep Room entities
-keep @androidx.room.Entity class * { *; }

R8’s optimization pass goes beyond just removing code. It inlines small methods (eliminating call overhead), merges classes that are always used together, removes unnecessary null checks, simplifies control flow, and devirtualizes method calls when it can prove the concrete type. The proguard-android-optimize.txt default file enables these optimizations.

Obfuscation renames classes, methods, and fields to single-letter names (a, b, c), reducing the size of string references in the DEX file. A class named com.example.app.feature.user.UserProfileRepository becomes a.a.a.a.a.a β€” much fewer bytes. R8 generates a mapping file (mapping.txt) that maps obfuscated names back to original names for stack trace deobfuscation.
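
To turn an obfuscated production stack trace back into readable names, feed the mapping file to the retrace tool from the Android command-line tools (paths illustrative):

// Deobfuscate a release stack trace with R8's retrace
// retrace mapping.txt obfuscated_stacktrace.txt
// Upload mapping.txt to the Play Console and your crash reporter
// so production crashes are deobfuscated automatically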

// R8 optimization examples β€” what R8 does to your code

// Before R8:
class UserRepository(private val api: UserApi) {
    private val unused = "This string is never read"

    fun getUser(id: String): User {
        val validator = InputValidator()
        validator.validate(id)
        return api.fetchUser(id)
    }

    private fun neverCalledMethod() {
        // This entire method is removed
        Log.d("TAG", "Dead code")
    }
}

class InputValidator {
    fun validate(input: String): Boolean = input.isNotBlank()
}

// After R8 (conceptual):
class a(private val b: b) {
    fun a(c: String): User {
        // validate() call removed entirely β€” its Boolean result was
        // ignored and isNotBlank() has no side effects
        return b.a(c)
    }
    // neverCalledMethod removed
    // unused field removed
    // InputValidator class removed β€” no reachable references remain
}

Key takeaway: Enable R8 with isMinifyEnabled = true for release builds β€” it removes unused code, optimizes bytecode, and obfuscates names, reducing DEX size by 30-60%. Write precise keep rules to prevent R8 from removing code accessed through reflection. Always test release builds thoroughly.

Lesson 8.3: Resource Optimization

Resources often account for 40-60% of APK size, with images being the dominant contributor. Optimizing resources involves using efficient formats, removing unused resources, and providing only the density-specific resources your users need.

Vector drawables replace multiple density-specific PNG/WebP files with a single scalable XML file. A vector drawable is typically 1-5KB regardless of display density, while providing pixel-perfect rendering at any size. Compare this to providing PNG files for mdpi, hdpi, xhdpi, xxhdpi, and xxxhdpi β€” that’s 5 files totaling 50-200KB for a single icon. For icons and simple illustrations, vector drawables reduce size by 90%+.

// Vector drawable β€” single file, all densities
// res/drawable/ic_search.xml β€” typically 1-3KB
// <vector xmlns:android="http://schemas.android.com/apk/res/android"
//     android:width="24dp"
//     android:height="24dp"
//     android:viewportWidth="24"
//     android:viewportHeight="24">
//     <path
//         android:fillColor="#FF000000"
//         android:pathData="M15.5,14h-0.79l-0.28,-0.27..."/>
// </vector>

// WebP conversion β€” 25-35% smaller than PNG with same quality
// Use Android Studio: right-click drawable β†’ Convert to WebP

// Resource shrinking removes unused resources
// build.gradle.kts
android {
    buildTypes {
        release {
            isShrinkResources = true // Requires isMinifyEnabled = true
        }
    }
}

// Keep specific resources that are loaded dynamically
// res/raw/keep.xml
// <?xml version="1.0" encoding="utf-8"?>
// <resources xmlns:tools="http://schemas.android.com/tools"
//     tools:keep="@drawable/dynamic_*,@raw/config"
//     tools:discard="@drawable/unused_*" />

Language filtering removes translations you don’t support. If your app only supports English and Spanish, there’s no reason to include the 70+ language translations that your libraries (AndroidX, Material Components) bundle. The resConfigs setting in your build file filters out unwanted localizations.

Density-specific resource filtering works similarly. If you only target phones (not tablets or watches), you probably don’t need ldpi or tvdpi resources. Filtering these out saves space.

// build.gradle.kts β€” filter resources
android {
    defaultConfig {
        // Only include these language resources
        resourceConfigurations += listOf("en", "es")

        // If you want to limit density resources
        // resourceConfigurations += listOf("xxhdpi", "xxxhdpi")
    }
}

// Optimize PNG files with aapt2 crunch (enabled by default)
// Or use pngquant for lossy compression (60-80% smaller):
// pngquant --quality=65-80 --output optimized.png input.png

// Remove unused resources with Lint
// Run: Analyze β†’ Run Inspection by Name β†’ "Unused resources"
// Or from command line:
// ./gradlew lintRelease

Key takeaway: Replace raster images with vector drawables for icons (90%+ size reduction). Convert PNGs to WebP (25-35% savings). Enable resource shrinking with isShrinkResources = true. Filter languages and densities with resConfigs to remove unused library translations.

Lesson 8.4: Android App Bundle and Dynamic Delivery

Android App Bundle (AAB) is Google’s publishing format that replaces the monolithic APK with a modular format. Instead of shipping a single APK with resources for every screen density, CPU architecture, and language, the Play Store generates optimized APKs for each device configuration. This typically reduces download size by 15-30% compared to a universal APK.

The size savings come from three configuration splits: density (only xxhdpi resources for xxhdpi devices), ABI (only arm64-v8a libraries for ARM64 devices), and language (only English resources for English-speaking users). The user downloads only the splits relevant to their device, dramatically reducing download and installation size.

// build.gradle.kts β€” configure App Bundle splits
android {
    bundle {
        language {
            enableSplit = true // Each language is a separate split
        }
        density {
            enableSplit = true // Each density is a separate split
        }
        abi {
            enableSplit = true // Each CPU architecture is a separate split
        }
    }
}

// Dynamic Feature Modules β€” install features on demand
// settings.gradle.kts
// include(":app", ":feature_camera", ":feature_ar")

// feature_camera/build.gradle.kts
// plugins {
//     id("com.android.dynamic-feature")
// }
// android {
//     // ...
// }
// dependencies {
//     implementation(project(":app"))
// }

// Install dynamic feature on demand
class CameraFeatureManager(private val context: Context) {
    private val splitInstallManager = SplitInstallManagerFactory.create(context)

    fun installCameraFeature(
        onSuccess: () -> Unit,
        onFailure: (Exception) -> Unit,
        onProgress: (Int) -> Unit
    ) {
        val request = SplitInstallRequest.newBuilder()
            .addModule("feature_camera")
            .build()

        splitInstallManager.startInstall(request)
            .addOnSuccessListener { sessionId ->
                // Monitor installation progress. Unregister this listener
                // when the host is destroyed to avoid leaking it.
                splitInstallManager.registerListener { state ->
                    if (state.sessionId() != sessionId) return@registerListener
                    when (state.status()) {
                        SplitInstallSessionStatus.INSTALLED -> {
                            onSuccess()
                        }
                        SplitInstallSessionStatus.DOWNLOADING -> {
                            // Guard against division by zero before bytes are known
                            val total = state.totalBytesToDownload()
                            if (total > 0) {
                                onProgress((state.bytesDownloaded() * 100 / total).toInt())
                            }
                        }
                        SplitInstallSessionStatus.FAILED -> {
                            onFailure(Exception("Install failed: ${state.errorCode()}"))
                        }
                        else -> {}
                    }
                }
            }
            .addOnFailureListener { exception ->
                onFailure(exception)
            }
    }

    fun isCameraFeatureInstalled(): Boolean {
        return splitInstallManager.installedModules.contains("feature_camera")
    }
}

Dynamic Feature Modules take this further by allowing features to be downloaded on demand after initial installation. Instead of shipping your entire app in the initial download, ship only the core experience. When the user wants to access the camera feature, AR feature, or any optional capability, download that module on demand. This can reduce initial download size by 50% or more for feature-rich apps.

The tradeoff is complexity. Dynamic features require careful module boundary design: the base module cannot reference feature code directly, so cross-module access goes through reflection or dependency inversion, and using a freshly installed module without restarting the process requires SplitCompat. Test the install flow thoroughly, including error cases (network failure, insufficient storage, cancelled installation).
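
A minimal sketch of the SplitCompat hook, assuming a shared base activity (names illustrative):

// Enable SplitCompat so this activity can access code and resources
// from feature modules installed after the app process started
abstract class BaseActivity : AppCompatActivity() {
    override fun attachBaseContext(newBase: Context) {
        super.attachBaseContext(newBase)
        SplitCompat.installActivity(this)
    }
}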

Key takeaway: Publish as Android App Bundle for automatic 15-30% size reduction through configuration splits. Use Dynamic Feature Modules for optional features to reduce initial download size. Test the dynamic delivery flow thoroughly, including offline and error scenarios.

Lesson 8.5: Native Library Optimization

Native libraries (.so files) are often the largest single contributor to APK size. A single native library compiled for all four Android ABIs (armeabi-v7a, arm64-v8a, x86, x86_64) includes four copies of the same code. Since arm64-v8a covers the vast majority of modern devices and x86/x86_64 are primarily for emulators, you can often eliminate two or three ABIs.

When publishing as an App Bundle, ABI splitting handles this automatically β€” each device gets only its required ABI. But if you’re distributing APKs directly (enterprise distribution, sideloading), you need to configure ABI filters manually to avoid shipping all architectures.

// build.gradle.kts β€” ABI filtering for APK builds
android {
    defaultConfig {
        ndk {
            // Only include these architectures
            abiFilters += listOf("arm64-v8a", "armeabi-v7a")
            // x86 and x86_64 are emulator-only β€” exclude from release
        }
    }

    // Or use splits for multiple APKs
    splits {
        abi {
            isEnable = true
            reset()
            include("arm64-v8a", "armeabi-v7a")
            isUniversalApk = false // Don't generate universal APK
        }
    }
}

// Strip debug symbols from native libraries
// build.gradle.kts
android {
    packaging {
        jniLibs {
            useLegacyPackaging = false // Compress .so files in APK
        }
    }
}

// For NDK builds β€” strip symbols at build time
// Android.mk or CMakeLists.txt:
// set(CMAKE_C_FLAGS_RELEASE "${CMAKE_C_FLAGS_RELEASE} -s")
// set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -s")

Debug symbols in native libraries add significant size β€” sometimes doubling the .so file size. Strip debug symbols for release builds. If you need symbolicated crash reports (and you do), upload the debug symbols separately to your crash reporting service (Firebase Crashlytics, Sentry) rather than shipping them in the APK.
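
The Android Gradle Plugin can bundle native symbols as a separate artifact for Play Console upload instead of shipping them in the APK; a sketch (AGP 4.1+):

// build.gradle.kts β€” emit a native debug symbols artifact for
// Play Console upload; the shipped .so files stay stripped
android {
    buildTypes {
        release {
            ndk {
                debugSymbolLevel = "SYMBOL_TABLE" // or "FULL" for line numbers
            }
        }
    }
}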

Evaluate whether you need each native library. Some libraries include native code for performance (image processing, crypto) that might have pure-Kotlin alternatives. If the native library is 5MB but the Kotlin alternative is 200KB and performance is acceptable, the trade-off is clear. Profile both to make an informed decision.

Key takeaway: Native libraries are often the largest APK component. Filter ABIs to include only arm64-v8a and armeabi-v7a for release builds. Strip debug symbols and upload them separately for crash reporting. Use App Bundle to automatically deliver only the needed ABI per device.

Quiz: APK Size Optimization

What is the typical APK size reduction from enabling R8?

  • ❌ 5-10%
  • βœ… 30-60% reduction in DEX size
  • ❌ 80-90%
  • ❌ R8 doesn’t affect size, only performance

R8 removes unused code (shrinking), simplifies remaining code (optimization), and renames identifiers to shorter names (obfuscation). Combined, these typically reduce DEX file size by 30-60%, with the exact savings depending on how many unused library dependencies your app includes.

What is the size benefit of Android App Bundle over APK?

  • ❌ 5% smaller
  • βœ… 15-30% smaller download through configuration splits
  • ❌ 50-70% smaller
  • ❌ No size benefit β€” only distribution benefit

App Bundle tells the Play Store to generate device-specific APKs with only the relevant density, ABI, and language resources. This eliminates resources the device doesn’t need, typically saving 15-30% on download size.

Why should you use vector drawables instead of PNGs for icons?

  • ❌ Vectors render faster on all devices
  • βœ… A single vector file replaces 5+ density-specific PNGs, saving 90%+ size
  • ❌ PNGs aren’t supported on modern Android
  • ❌ Vectors support animation natively

A vector drawable is a single 1-5KB XML file that renders perfectly at any density. Without vectors, you need separate PNG files for mdpi, hdpi, xhdpi, xxhdpi, and xxxhdpi β€” easily 50-200KB total for one icon. This adds up quickly across hundreds of icons.

What does resConfigs do in the build configuration?

  • ❌ Configures resolution settings
  • ❌ Sets the default screen density
  • βœ… Filters resource configurations to include only specified languages/densities
  • ❌ Compresses resources

resConfigs tells the build system to include only the specified resource configurations. Setting resConfigs("en", "es") removes all other language translations from AndroidX and other libraries, which can save several megabytes.

Coding Challenge: APK Size Analyzer

Build a Gradle task that analyzes APK composition, compares against a size budget, and reports the breakdown by component type.

// Challenge: Create an ApkSizeReport that:
// 1. Parses APK contents by file type (dex, native, resources, assets)
// 2. Identifies the top 10 largest files
// 3. Compares total size against a configurable budget
// 4. Compares against a previous build for regression detection
// 5. Outputs a human-readable report

Solution

import java.io.File
import java.util.zip.ZipFile

class ApkSizeAnalyzer(
    private val budgetBytes: Long = 20L * 1024 * 1024
) {
    data class FileEntry(
        val path: String,
        val compressedSize: Long,
        val uncompressedSize: Long
    ) {
        val category: String get() = when {
            path.endsWith(".dex") -> "DEX (code)"
            path.endsWith(".so") -> "Native libraries"
            path.startsWith("res/") -> "Resources"
            path.startsWith("assets/") -> "Assets"
            path == "resources.arsc" -> "Resource table"
            path.startsWith("META-INF/") -> "Signing"
            else -> "Other"
        }

        val compressionRatio: Double get() =
            if (uncompressedSize > 0) compressedSize.toDouble() / uncompressedSize else 1.0
    }

    data class SizeReport(
        val totalCompressed: Long,
        val totalUncompressed: Long,
        val byCategory: Map<String, Long>,
        val topFiles: List<FileEntry>,
        val budgetBytes: Long,
        val isOverBudget: Boolean,
        val overBudgetBy: Long
    )

    fun analyze(apkFile: File): SizeReport {
        val entries = mutableListOf<FileEntry>()

        ZipFile(apkFile).use { zip ->
            zip.entries().asSequence().forEach { entry ->
                entries.add(FileEntry(
                    path = entry.name,
                    compressedSize = entry.compressedSize,
                    uncompressedSize = entry.size
                ))
            }
        }

        val totalCompressed = apkFile.length()
        val totalUncompressed = entries.sumOf { it.uncompressedSize }

        val byCategory = entries
            .groupBy { it.category }
            .mapValues { (_, files) -> files.sumOf { it.compressedSize } }
            .toSortedMap()

        val topFiles = entries
            .sortedByDescending { it.compressedSize }
            .take(10)

        return SizeReport(
            totalCompressed = totalCompressed,
            totalUncompressed = totalUncompressed,
            byCategory = byCategory,
            topFiles = topFiles,
            budgetBytes = budgetBytes,
            isOverBudget = totalCompressed > budgetBytes,
            overBudgetBy = (totalCompressed - budgetBytes).coerceAtLeast(0)
        )
    }

    fun formatReport(report: SizeReport, previousReport: SizeReport? = null): String =
        buildString {
            appendLine("═══════════════════════════════════════")
            appendLine("           APK SIZE REPORT")
            appendLine("═══════════════════════════════════════")
            appendLine()

            val totalMB = report.totalCompressed / (1024.0 * 1024)
            val budgetMB = report.budgetBytes / (1024.0 * 1024)
            appendLine("Total: ${"%.2f".format(totalMB)}MB / ${"%.2f".format(budgetMB)}MB budget")

            if (report.isOverBudget) {
                val overMB = report.overBudgetBy / (1024.0 * 1024)
                appendLine("⚠️  OVER BUDGET by ${"%.2f".format(overMB)}MB")
            } else {
                appendLine("βœ… Within budget")
            }

            if (previousReport != null) {
                val delta = report.totalCompressed - previousReport.totalCompressed
                val deltaKB = delta / 1024.0
                val symbol = if (delta > 0) "πŸ“ˆ +" else "πŸ“‰ "
                appendLine("$symbol${"%.1f".format(deltaKB)}KB vs previous build")
            }

            appendLine()
            appendLine("By Category:")
            report.byCategory.forEach { (category, size) ->
                val sizeMB = size / (1024.0 * 1024)
                val percent = size * 100 / report.totalCompressed
                appendLine("  $category: ${"%.2f".format(sizeMB)}MB ($percent%)")
            }

            appendLine()
            appendLine("Top 10 Largest Files:")
            report.topFiles.forEachIndexed { index, file ->
                val sizeKB = file.compressedSize / 1024.0
                appendLine("  ${index + 1}. ${file.path}: ${"%.1f".format(sizeKB)}KB")
            }
        }
}

Module 9: Battery and Energy Efficiency

Battery life is the resource users care about most but developers think about least. A battery-draining app gets uninstalled faster than a slow app. Android’s battery management has become increasingly aggressive β€” Doze mode, App Standby Buckets, and background execution limits all restrict what your app can do in the background. Working with these systems instead of against them is essential for both user experience and app survival on the Play Store.

Lesson 9.1: Understanding Battery Drain

Battery drain comes from four primary sources: CPU usage (processing data, running computations), network activity (radio is one of the most power-hungry components), GPS and sensors (location tracking, accelerometer), and screen (wake locks that keep the display on). The Energy Profiler in Android Studio shows the relative energy impact of each component over time.

The cellular radio is particularly expensive because of its state machine. The radio has three states: idle (lowest power), low power (monitoring), and high power (actively transmitting). Transitioning from idle to high power takes 2-3 seconds and the radio stays in high power for 10-15 seconds after the last data transfer. This means a single small request keeps the radio in high-power mode for 15+ seconds, consuming significant battery even though the actual data transfer took milliseconds.

// Monitor battery drain sources
class BatteryMonitor(private val context: Context) {

    fun getBatteryStatus(): BatteryStatus {
        val batteryManager = context.getSystemService(
            Context.BATTERY_SERVICE
        ) as BatteryManager

        return BatteryStatus(
            level = batteryManager.getIntProperty(
                BatteryManager.BATTERY_PROPERTY_CAPACITY
            ),
            isCharging = batteryManager.isCharging,
            // BATTERY_PROPERTY_CURRENT_NOW is instantaneous current in
            // microamperes (negative while discharging), not temperature;
            // temperature comes from the ACTION_BATTERY_CHANGED broadcast
            currentMicroAmps = batteryManager.getIntProperty(
                BatteryManager.BATTERY_PROPERTY_CURRENT_NOW
            )
        )
    }

    data class BatteryStatus(
        val level: Int,
        val isCharging: Boolean,
        val currentMicroAmps: Int
    )
}

// Batch network requests to minimize radio wake-ups
class NetworkBatcher(
    private val scope: CoroutineScope,
    private val api: ApiService
) {
    private val pendingRequests = mutableListOf<PendingRequest>()
    private var batchJob: Job? = null

    fun enqueue(request: PendingRequest) {
        // Add the request and start the batch window under one lock,
        // so two callers can't both observe batchJob == null
        synchronized(pendingRequests) {
            pendingRequests.add(request)

            if (batchJob == null) {
                batchJob = scope.launch {
                    delay(30_000) // Wait to batch more requests

                    val batch = synchronized(pendingRequests) {
                        val copy = pendingRequests.toList()
                        pendingRequests.clear()
                        batchJob = null
                        copy
                    }

                    // Single network call for all batched requests
                    api.sendBatch(batch)
                }
            }
        }
    }
}

The Energy Profiler shows energy consumption broken down by component. Spikes in the network trace correlate with radio wake-ups. Sustained CPU activity shows computation that could be deferred. Wake lock acquisitions keep the CPU (or screen) active when the system wants to sleep. Use the Energy Profiler to identify which component is consuming the most energy and optimize that first.

Background execution limits on Android 8+ restrict what apps can do when not visible. You can’t start background services freely, location updates are throttled, and broadcast receivers for implicit broadcasts are restricted. These limits exist because battery-draining apps overwhelmingly do their damage in the background. Embrace these constraints by using WorkManager for deferrable work, foreground services for user-visible operations, and exact alarms only when timing precision is critical.
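
When timing precision genuinely matters, a hedged sketch of an exact alarm (on Android 12+ this requires the SCHEDULE_EXACT_ALARM permission, which the user can revoke):

// Exact alarms fire even in Doze; reserve them for user-facing deadlines
fun scheduleReminder(
    context: Context,
    triggerAtMillis: Long,
    operation: PendingIntent
) {
    val alarmManager = context.getSystemService(Context.ALARM_SERVICE) as AlarmManager
    // canScheduleExactAlarms() exists only on API 31+; short-circuit below it
    if (Build.VERSION.SDK_INT < 31 || alarmManager.canScheduleExactAlarms()) {
        alarmManager.setExactAndAllowWhileIdle(
            AlarmManager.RTC_WAKEUP, triggerAtMillis, operation
        )
    }
}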

Key takeaway: The cellular radio consumes significant battery even for small requests due to its state machine. Batch network requests to minimize radio wake-ups. Use the Energy Profiler to identify the dominant battery drain source. Work within background execution limits β€” they exist to protect users.

Lesson 9.2: WorkManager and Efficient Scheduling

WorkManager is the recommended API for deferrable, guaranteed background work on Android. It handles API-level differences, respects battery optimization modes (Doze, App Standby), and guarantees execution even if the app is killed or the device restarts. Use it for sync operations, log uploads, periodic data refresh, and any task that doesn’t need to complete immediately.

WorkManager supports constraints that delay work until conditions are favorable for battery life. Wait until the device is charging and on Wi-Fi before uploading large files. Wait until the device is idle before running maintenance tasks. These constraints dramatically reduce battery impact by aligning work with times when energy is abundant (charging) and cheap (Wi-Fi vs cellular).

// WorkManager with battery-friendly constraints
class SyncWorker(
    context: Context,
    params: WorkerParameters
) : CoroutineWorker(context, params) {

    override suspend fun doWork(): Result {
        return try {
            val syncResult = syncRepository.performFullSync()
            if (syncResult.isSuccess) Result.success()
            else Result.retry()
        } catch (e: Exception) {
            if (runAttemptCount < 3) Result.retry()
            else Result.failure()
        }
    }

    companion object {
        fun enqueuePeriodicSync(context: Context) {
            val constraints = Constraints.Builder()
                .setRequiredNetworkType(NetworkType.UNMETERED) // Wi-Fi only
                .setRequiresBatteryNotLow(true)  // Don't drain low battery
                .setRequiresCharging(false)       // Ok even when not charging
                .setRequiresStorageNotLow(true)   // Need space for data
                .build()

            val syncRequest = PeriodicWorkRequestBuilder<SyncWorker>(
                repeatInterval = 1,
                repeatIntervalTimeUnit = TimeUnit.HOURS,
                flexTimeInterval = 15,
                flexTimeIntervalUnit = TimeUnit.MINUTES
            )
                .setConstraints(constraints)
                .setBackoffCriteria(
                    BackoffPolicy.EXPONENTIAL,
                    10, TimeUnit.MINUTES
                )
                .addTag("periodic_sync")
                .build()

            WorkManager.getInstance(context).enqueueUniquePeriodicWork(
                "periodic_sync",
                ExistingPeriodicWorkPolicy.KEEP,
                syncRequest
            )
        }

        fun enqueueOneTimeUpload(
            context: Context,
            filePath: String
        ) {
            val constraints = Constraints.Builder()
                .setRequiredNetworkType(NetworkType.CONNECTED)
                .setRequiresBatteryNotLow(true)
                .build()

            val uploadData = workDataOf("file_path" to filePath)

            val uploadRequest = OneTimeWorkRequestBuilder<UploadWorker>()
                .setConstraints(constraints)
                .setInputData(uploadData)
                .setExpedited(OutOfQuotaPolicy.RUN_AS_NON_EXPEDITED_WORK_REQUEST)
                .build()

            WorkManager.getInstance(context).enqueue(uploadRequest)
        }
    }
}

Chaining work requests allows you to express complex workflows where tasks depend on each other. Download, then process, then upload β€” each step runs only when the previous one succeeds. WorkManager handles retry logic, constraint satisfaction, and lifecycle management for the entire chain.

// Work chain β€” download β†’ process β†’ upload
fun scheduleFullPipeline(context: Context) {
    val workManager = WorkManager.getInstance(context)

    val downloadWork = OneTimeWorkRequestBuilder<DownloadWorker>()
        .setConstraints(
            Constraints.Builder()
                .setRequiredNetworkType(NetworkType.CONNECTED)
                .build()
        )
        .build()

    val processWork = OneTimeWorkRequestBuilder<ProcessWorker>()
        .build() // No constraints β€” runs immediately after download

    val uploadWork = OneTimeWorkRequestBuilder<UploadWorker>()
        .setConstraints(
            Constraints.Builder()
                .setRequiredNetworkType(NetworkType.UNMETERED)
                .setRequiresCharging(true) // Wait for charger
                .build()
        )
        .build()

    workManager.beginWith(downloadWork)
        .then(processWork)
        .then(uploadWork)
        .enqueue()
}

Key takeaway: Use WorkManager for all deferrable background work. Set constraints to align work with battery-friendly conditions (charging, Wi-Fi, device idle). Use exponential backoff for retries. Chain dependent work requests for multi-step pipelines.

Lesson 9.3: Wake Locks and Foreground Services

Wake locks keep the CPU (or screen) active when the system wants to sleep. They’re essential for operations that must complete without interruption β€” audio playback, active navigation, file downloads. But wake locks that aren’t released properly are the number one cause of battery drain in production apps. A leaked wake lock keeps the CPU running 24/7, draining the battery in hours instead of days.

The rule is simple: acquire wake locks for the shortest possible duration and always release them. Wrap the work in a try/finally block to guarantee release; WakeLock isn’t Closeable, so Kotlin’s use doesn’t apply here. Never hold a wake lock longer than necessary. The system’s battery stats (Settings β†’ Battery β†’ App usage) show which apps hold wake locks and for how long β€” users will identify and uninstall wake lock abusers.

// βœ… Safe wake lock usage with automatic release
class CriticalOperation(private val context: Context) {

    fun performCriticalWork() {
        val powerManager = context.getSystemService(
            Context.POWER_SERVICE
        ) as PowerManager

        val wakeLock = powerManager.newWakeLock(
            PowerManager.PARTIAL_WAKE_LOCK,
            "MyApp::CriticalOperation"
        ).apply {
            // ALWAYS set a timeout β€” prevents leaked wake locks
            acquire(10 * 60 * 1000L) // 10 minute maximum
        }

        try {
            // Do critical work
            performWork()
        } finally {
            if (wakeLock.isHeld) {
                wakeLock.release()
            }
        }
    }
}

// Foreground service β€” the modern way to do long-running work
class DownloadService : Service() {

    // Service-scoped coroutines, cancelled in onDestroy to avoid leaking work
    private val serviceScope = CoroutineScope(Dispatchers.IO + SupervisorJob())

    override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
        val notification = createNotification("Downloading...")

        startForeground(NOTIFICATION_ID, notification)

        serviceScope.launch {
            try {
                performDownload()
            } finally {
                stopForeground(STOP_FOREGROUND_REMOVE)
                stopSelf()
            }
        }

        return START_NOT_STICKY
    }

    override fun onDestroy() {
        serviceScope.cancel()
        super.onDestroy()
    }

    private fun createNotification(message: String): Notification {
        return NotificationCompat.Builder(this, CHANNEL_ID)
            .setContentTitle("Download")
            .setContentText(message)
            .setSmallIcon(R.drawable.ic_download)
            .setOngoing(true)
            .build()
    }

    override fun onBind(intent: Intent?): IBinder? = null

    companion object {
        private const val NOTIFICATION_ID = 1
        private const val CHANNEL_ID = "downloads"
    }
}

Android 12+ requires foreground service type declarations in the manifest. Each foreground service must declare its type (location, camera, microphone, dataSync, etc.), and the system enforces that the service only accesses the permissions it declares. This transparency helps users understand why a service is running and what resources it’s using.
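A minimal sketch of the declarations for a download-style service, assuming the dataSync type fits the work; note that Android 14 additionally requires the matching permission:

// In AndroidManifest.xml:
// <uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
// <uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC" />
// <service
//     android:name=".DownloadService"
//     android:foregroundServiceType="dataSync" />

// When promoting to foreground on API 29+, pass the type explicitly:
// startForeground(
//     NOTIFICATION_ID, notification,
//     ServiceInfo.FOREGROUND_SERVICE_TYPE_DATA_SYNC
// )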

Use foreground services sparingly. They require a visible notification, consume battery, and are visible to users. For most background tasks, WorkManager is more appropriate because it works within the system’s power management policies. Reserve foreground services for user-initiated operations that require continuous execution β€” music playback, navigation, active file transfer.

Key takeaway: Always set timeouts on wake locks and release them in finally blocks. Use foreground services for user-visible ongoing operations. Use WorkManager for deferrable background work. Declare foreground service types on Android 12+.

Lesson 9.4: Doze Mode and App Standby

Doze mode (introduced in Android 6) dramatically restricts background activity when the device is stationary, unplugged, and screen-off for an extended period. During Doze, the system defers alarms, network access, jobs, and syncs until periodic maintenance windows. Understanding Doze is essential because your app’s background behavior changes fundamentally when Doze is active.

Doze has two phases. Light Doze activates soon after the screen turns off β€” it restricts network access and defers jobs but still runs them periodically. Deep Doze activates after extended inactivity β€” it batches all deferred work into infrequent maintenance windows that become progressively longer (from 1 hour to several hours apart).

// Check Doze status
fun isDozeModeActive(context: Context): Boolean {
    val powerManager = context.getSystemService(
        Context.POWER_SERVICE
    ) as PowerManager

    return if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
        powerManager.isDeviceIdleMode
    } else {
        false
    }
}

// Listen for Doze mode changes
class DozeReceiver : BroadcastReceiver() {
    override fun onReceive(context: Context, intent: Intent) {
        val powerManager = context.getSystemService(
            Context.POWER_SERVICE
        ) as PowerManager

        if (powerManager.isDeviceIdleMode) {
            // Device entered Doze β€” stop non-essential background work
            Log.d("Doze", "Doze mode active β€” pausing background work")
        } else {
            // Doze exited β€” can resume background work
            Log.d("Doze", "Doze mode ended β€” resuming background work")
        }
    }
}

// Note: this broadcast is delivered only to receivers registered at
// runtime, not to manifest-declared receivers:
// context.registerReceiver(
//     DozeReceiver(),
//     IntentFilter(PowerManager.ACTION_DEVICE_IDLE_MODE_CHANGED)
// )

App Standby Buckets (Android 9+) categorize apps by usage frequency and restrict background activity accordingly. Active apps (currently in use) have no restrictions. Working Set apps (used recently) have mild restrictions. Frequent apps have moderate restrictions. Rare apps (haven’t been used in weeks) have severe restrictions β€” jobs and alarms are heavily deferred. Restricted apps (Android 12+) can barely do anything in the background.

Your app’s bucket is determined by how frequently and recently the user interacts with it. You can’t control your bucket directly, but you can design your app to work within bucket constraints. Use WorkManager, which respects bucket restrictions automatically. Avoid relying on exact timing for background work β€” the system will defer it based on your bucket.

// Check your app's standby bucket
fun getStandbyBucket(context: Context): String {
    if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.P) {
        val usageStatsManager = context.getSystemService(
            Context.USAGE_STATS_SERVICE
        ) as UsageStatsManager

        return when (usageStatsManager.appStandbyBucket) {
            UsageStatsManager.STANDBY_BUCKET_ACTIVE -> "Active"
            UsageStatsManager.STANDBY_BUCKET_WORKING_SET -> "Working Set"
            UsageStatsManager.STANDBY_BUCKET_FREQUENT -> "Frequent"
            UsageStatsManager.STANDBY_BUCKET_RARE -> "Rare"
            UsageStatsManager.STANDBY_BUCKET_RESTRICTED -> "Restricted"
            else -> "Unknown"
        }
    }
    return "Pre-Pie device"
}

// Design for all buckets β€” degrade gracefully
class SyncScheduler(private val context: Context) {

    fun scheduleSyncAppropriately() {
        val constraints = Constraints.Builder()
            .setRequiredNetworkType(NetworkType.CONNECTED)
            .build()

        // Use periodic work β€” WorkManager respects bucket restrictions
        val syncRequest = PeriodicWorkRequestBuilder<SyncWorker>(
            repeatInterval = 1, repeatIntervalTimeUnit = TimeUnit.HOURS,
            flexTimeInterval = 30, flexTimeIntervalUnit = TimeUnit.MINUTES
        )
            .setConstraints(constraints)
            .build()

        WorkManager.getInstance(context).enqueueUniquePeriodicWork(
            "sync",
            ExistingPeriodicWorkPolicy.KEEP,
            syncRequest
        )
        // In Active bucket: runs every hour
        // In Rare bucket: system may defer to every 24 hours
        // WorkManager handles this transparently
    }
}

Key takeaway: Doze mode and App Standby Buckets defer background work based on device state and app usage frequency. Design background work to be deferrable using WorkManager. Don’t fight the system β€” work within power management constraints. Your app’s standby bucket determines how aggressively background work is restricted.

Lesson 9.5: Location and Sensor Optimization

Location tracking is one of the most battery-intensive operations on Android. GPS is the most accurate but also the most power-hungry β€” it can drain a full battery in 4-6 hours if used continuously at high accuracy. Network-based location (cell towers, Wi-Fi) is much less accurate but dramatically less expensive. The Fused Location Provider balances accuracy and battery automatically.

The key optimization is requesting only the accuracy and frequency you need. A weather app needs city-level accuracy (coarse location, updated once per hour). A navigation app needs street-level accuracy (fine location, updated every second). Most apps fall somewhere in between β€” requesting GPS-level accuracy for a weather app wastes battery without benefit.

// Battery-efficient location requests
class LocationTracker(private val context: Context) {

    private val fusedLocationClient = LocationServices
        .getFusedLocationProviderClient(context)

    // Low-power location β€” for features like weather, nearby stores
    fun startPassiveTracking(callback: (Location) -> Unit) {
        val request = LocationRequest.Builder(
            Priority.PRIORITY_BALANCED_POWER_ACCURACY,
            30 * 60 * 1000L // Every 30 minutes
        )
            .setMinUpdateDistanceMeters(500f) // Only if moved 500m
            .setMaxUpdateDelayMillis(60 * 60 * 1000L) // Batch for 1 hour
            .setWaitForAccurateLocation(false)
            .build()

        startLocationUpdates(request, callback)
    }

    // High-accuracy location β€” for navigation, fitness tracking
    fun startActiveTracking(callback: (Location) -> Unit) {
        val request = LocationRequest.Builder(
            Priority.PRIORITY_HIGH_ACCURACY,
            5000L // Every 5 seconds
        )
            .setMinUpdateDistanceMeters(10f) // Only if moved 10m
            .setMaxUpdateDelayMillis(10_000L) // Max 10s batch delay
            .build()

        startLocationUpdates(request, callback)
    }

    private var locationCallback: LocationCallback? = null

    private fun startLocationUpdates(
        request: LocationRequest,
        callback: (Location) -> Unit
    ) {
        // Keep a reference so updates can be removed in stopTracking()
        val newCallback = object : LocationCallback() {
            override fun onLocationResult(result: LocationResult) {
                result.lastLocation?.let(callback)
            }
        }
        locationCallback = newCallback

        try {
            fusedLocationClient.requestLocationUpdates(
                request, newCallback, Looper.getMainLooper()
            )
        } catch (e: SecurityException) {
            Log.e("Location", "Permission denied", e)
        }
    }

    fun stopTracking() {
        locationCallback?.let { fusedLocationClient.removeLocationUpdates(it) }
        locationCallback = null
    }
}

For sensor data (accelerometer, gyroscope, proximity), the same principle applies β€” request only the sampling rate you need. SensorManager.SENSOR_DELAY_NORMAL (200ms) is sufficient for most UI interactions. SENSOR_DELAY_GAME (20ms) is for games and fitness tracking. SENSOR_DELAY_FASTEST (0ms) should almost never be used β€” it reads as fast as the hardware allows, burning through battery for data that’s usually unnecessary.

Always unregister sensor listeners when they’re not needed. A registered sensor listener consumes battery even when the app is in the background. Use lifecycle-aware components to automatically register in onResume() and unregister in onPause().
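A minimal sketch of that lifecycle-aware pattern; the observer class and callback shape are illustrative assumptions:

// Registers on resume, unregisters on pause, so nothing drains in the background
class AccelerometerObserver(
    private val sensorManager: SensorManager,
    private val onReading: (x: Float, y: Float, z: Float) -> Unit
) : DefaultLifecycleObserver, SensorEventListener {

    private val accelerometer: Sensor? =
        sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER)

    override fun onResume(owner: LifecycleOwner) {
        val sensor = accelerometer ?: return
        // SENSOR_DELAY_NORMAL (200ms) is enough for UI features
        sensorManager.registerListener(
            this, sensor, SensorManager.SENSOR_DELAY_NORMAL
        )
    }

    override fun onPause(owner: LifecycleOwner) {
        sensorManager.unregisterListener(this)
    }

    override fun onSensorChanged(event: SensorEvent) {
        onReading(event.values[0], event.values[1], event.values[2])
    }

    override fun onAccuracyChanged(sensor: Sensor?, accuracy: Int) = Unit
}

// Usage: lifecycle.addObserver(AccelerometerObserver(sensorManager) { x, y, z -> ... })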

Key takeaway: Request only the location accuracy and frequency your feature actually needs. Use PRIORITY_BALANCED_POWER_ACCURACY unless you specifically need GPS precision. Unregister sensor listeners in onPause(). Batch location updates to reduce wake-ups.

Quiz: Battery Efficiency

What is the most battery-expensive network behavior?

  • ❌ Downloading large files
  • ❌ Using HTTPS instead of HTTP
  • βœ… Frequent small requests that keep the radio in high-power mode
  • ❌ Using WebSocket connections

The cellular radio stays in high-power mode for 10-15 seconds after each data transfer. Frequent small requests prevent the radio from returning to idle mode, consuming far more battery than a single large transfer of the same total data.

What happens during Deep Doze mode?

  • ❌ The device turns off completely
  • ❌ Apps continue running normally
  • βœ… Network access, jobs, and syncs are deferred to infrequent maintenance windows
  • ❌ Only foreground apps are affected

Deep Doze batches all background work into maintenance windows that become progressively less frequent β€” from every hour to every several hours. Between windows, network access is blocked, alarms are deferred, and jobs don’t run.

What should you always do with wake locks?

  • ❌ Hold them indefinitely for reliability
  • ❌ Acquire them in Application.onCreate()
  • βœ… Set a timeout and release them in a finally block
  • ❌ Use them instead of foreground services

Wake locks that aren’t released drain battery until the device is rebooted. Always set a maximum timeout with acquire(timeoutMs) and release in a finally block. Leaked wake locks are the single most common cause of excessive battery drain.

Coding Challenge: Battery-Aware Task Scheduler

Build a task scheduler that adapts its behavior based on battery level, charging state, and network type.

// Challenge: Create a BatteryAwareScheduler that:
// 1. Monitors battery level and charging state
// 2. Categorizes tasks by priority (critical, normal, deferrable)
// 3. Runs critical tasks immediately regardless of battery
// 4. Defers normal tasks when battery < 20%
// 5. Runs deferrable tasks only when charging

Solution

class BatteryAwareScheduler(
    private val context: Context,
    private val scope: CoroutineScope
) {
    enum class Priority { CRITICAL, NORMAL, DEFERRABLE }

    data class ScheduledTask(
        val id: String,
        val priority: Priority,
        val task: suspend () -> Unit
    )

    // Accessed from the receiver (main thread) and from schedule() callers;
    // production code would guard this list with a lock or a single dispatcher
    private val deferredTasks = mutableListOf<ScheduledTask>()
    private val batteryState = MutableStateFlow(getBatteryState())

    data class BatteryState(
        val level: Int,
        val isCharging: Boolean,
        val isLow: Boolean
    )

    init {
        // Monitor battery changes
        val filter = IntentFilter().apply {
            addAction(Intent.ACTION_BATTERY_CHANGED)
            addAction(Intent.ACTION_POWER_CONNECTED)
            addAction(Intent.ACTION_POWER_DISCONNECTED)
        }

        context.registerReceiver(object : BroadcastReceiver() {
            override fun onReceive(ctx: Context, intent: Intent) {
                batteryState.value = getBatteryState()
                processDeferredTasks()
            }
        }, filter)
    }

    suspend fun schedule(task: ScheduledTask) {
        when (task.priority) {
            Priority.CRITICAL -> {
                // Always execute immediately
                executeTask(task)
            }
            Priority.NORMAL -> {
                if (batteryState.value.isLow && !batteryState.value.isCharging) {
                    deferredTasks.add(task)
                } else {
                    executeTask(task)
                }
            }
            Priority.DEFERRABLE -> {
                if (batteryState.value.isCharging) {
                    executeTask(task)
                } else {
                    deferredTasks.add(task)
                }
            }
        }
    }

    private fun processDeferredTasks() {
        val state = batteryState.value
        val iterator = deferredTasks.iterator()

        while (iterator.hasNext()) {
            val task = iterator.next()
            val shouldRun = when (task.priority) {
                Priority.CRITICAL -> true
                Priority.NORMAL -> !state.isLow || state.isCharging
                Priority.DEFERRABLE -> state.isCharging
            }

            if (shouldRun) {
                iterator.remove()
                scope.launch { executeTask(task) }
            }
        }
    }

    private suspend fun executeTask(task: ScheduledTask) {
        try {
            task.task()
        } catch (e: Exception) {
            Log.e("Scheduler", "Task ${task.id} failed", e)
        }
    }

    private fun getBatteryState(): BatteryState {
        val batteryManager = context.getSystemService(
            Context.BATTERY_SERVICE
        ) as BatteryManager

        val level = batteryManager.getIntProperty(
            BatteryManager.BATTERY_PROPERTY_CAPACITY
        )

        return BatteryState(
            level = level,
            isCharging = batteryManager.isCharging,
            isLow = level <= 20
        )
    }

    fun getPendingTaskCount(): Int = deferredTasks.size
}

Module 10: Micro-Optimizations and Database Performance

This final module covers the fine-grained optimizations that squeeze the last drops of performance from your code. Micro-optimizations are controversial β€” premature optimization is the root of all evil, as Knuth said. But there’s a difference between premature optimization and informed optimization. When you’ve profiled your app, identified the hot path, and need that last 20% improvement, these techniques make the difference. This module also covers database performance with Room, which is where many apps spend the majority of their I/O time.

Lesson 10.1: Kotlin Compiler Optimizations

The Kotlin compiler generates JVM bytecode, and understanding what it generates helps you write code that compiles to efficient bytecode. Several Kotlin features have hidden costs that the compiler can’t always optimize away β€” knowing these lets you avoid them in performance-critical code.

inline functions are one of Kotlin’s most powerful performance features. When a function is marked inline, the compiler copies the function body to every call site, eliminating the function call overhead and β€” critically β€” the lambda allocation for function parameters. Without inline, every lambda creates an anonymous class instance. With inline, the lambda body is inlined directly into the caller.

// Without inline β€” lambda allocates an anonymous class every call
fun <T> measureTime(block: () -> T): T {
    val start = System.nanoTime()
    val result = block() // Allocates Function0 anonymous class
    val elapsed = System.nanoTime() - start
    Log.d("Perf", "Took ${elapsed}ns")
    return result
}

// With inline β€” no allocation, lambda body copied to call site
inline fun <T> measureTime(block: () -> T): T {
    val start = System.nanoTime()
    val result = block() // Body inlined directly β€” no allocation
    val elapsed = System.nanoTime() - start
    Log.d("Perf", "Took ${elapsed}ns")
    return result
}

// @JvmInline value classes β€” zero-overhead wrappers
@JvmInline
value class UserId(val value: String)

@JvmInline
value class Milliseconds(val value: Long) {
    fun toSeconds(): Double = value / 1000.0
}

// At runtime, UserId("abc") is just the String "abc"
// No wrapper object allocated β€” type safety at zero runtime cost
fun getUser(id: UserId): User {
    // id is just a String at runtime
    return database.getUser(id.value)
}

Reified type parameters solve a limitation of JVM generics. Due to type erasure, generic types are erased at runtime β€” List<String> and List<Int> are the same class at runtime. The reified keyword (only available in inline functions) preserves the type information, allowing you to use is checks and ::class references on generic types without reflection. This is both more convenient and more performant than passing Class<T> parameters.

// Without reified β€” requires Class parameter (verbose, reflection-based)
fun <T> parseJson(json: String, clazz: Class<T>): T {
    return moshi.adapter(clazz).fromJson(json)!!
}
// Usage: parseJson(json, User::class.java)

// With reified β€” type information available at runtime
inline fun <reified T> parseJson(json: String): T {
    return moshi.adapter(T::class.java).fromJson(json)!!
}
// Usage: parseJson<User>(json) β€” cleaner, same performance

// Practical use β€” type-safe navigation
inline fun <reified T : Activity> Context.startActivity(
    configureIntent: Intent.() -> Unit = {}
) {
    val intent = Intent(this, T::class.java).apply(configureIntent)
    startActivity(intent)
}
// Usage: startActivity<ProfileActivity> { putExtra("userId", id) }

Sealed classes and when expressions compile to efficient bytecode. For a when over enum or integer constants, the compiler emits a tableswitch jump table β€” O(1) dispatch instead of the O(n) chain of checks an if-else ladder produces. A when over a sealed hierarchy compiles to a short sequence of instanceof checks, but the sealed hierarchy lets the compiler prove exhaustiveness at compile time, eliminating the need for a defensive else branch.
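A small sketch of an exhaustive when over a sealed hierarchy (the LoadState type is an illustrative assumption):

sealed class LoadState {
    object Loading : LoadState()
    data class Success(val data: List<String>) : LoadState()
    data class Error(val cause: Throwable) : LoadState()
}

fun describe(state: LoadState): String = when (state) {
    is LoadState.Loading -> "Loading"
    is LoadState.Success -> "Loaded ${state.data.size} items"
    is LoadState.Error -> "Failed: ${state.cause.message}"
    // No else branch needed: the compiler proves exhaustiveness
}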

Key takeaway: Use inline to eliminate lambda allocation overhead in higher-order functions. Use @JvmInline value class for type-safe wrappers with zero runtime cost. Use reified types to avoid passing Class<T> parameters. These compiler features provide type safety and abstraction at zero or near-zero performance cost.

Lesson 10.2: Data Structure Selection

Choosing the right data structure can make a 10x difference in performance. The standard library provides general-purpose collections, but for specific access patterns, specialized data structures are dramatically faster.

HashMap vs ArrayMap vs SparseArray β€” HashMap is the general-purpose choice with O(1) lookup, but it has high memory overhead per entry (each entry creates a Node object with hash, key, value, and next pointer). ArrayMap uses two arrays (hashes and key-value pairs) and binary search for lookups, making it slower for large maps (O(log n)) but more memory-efficient. SparseArray variants are the most efficient for integer keys β€” they avoid boxing integers and use sorted arrays.

// Memory comparison for 100 entries:
// HashMap<Int, String>: ~3600 bytes (Entry objects, boxing)
// ArrayMap<Int, String>: ~1200 bytes (two arrays)
// SparseArray<String>: ~800 bytes (int array, object array)

// Use SparseArray when keys are integers
val viewCache = SparseArray<View>()
viewCache.put(R.id.title, titleView)
viewCache.put(R.id.subtitle, subtitleView)
val view = viewCache.get(R.id.title)

// Use SparseBooleanArray instead of HashMap<Int, Boolean>
val checkedItems = SparseBooleanArray()
checkedItems.put(0, true)
checkedItems.put(5, true)
val isChecked = checkedItems.get(0) // No boxing!

// Use SparseIntArray instead of HashMap<Int, Int>
val scores = SparseIntArray()
scores.put(playerId, score) // No boxing of either key or value

// For String keys with small maps (<1000 entries), ArrayMap
// is more memory-efficient than HashMap
val config = ArrayMap<String, Any>().apply {
    put("theme", "dark")
    put("fontSize", 14)
    put("language", "en")
}

Array-based collections are faster for sequential access and more cache-friendly than linked structures. ArrayList stores elements in a contiguous array β€” iterating through it is fast because elements are adjacent in memory (cache-friendly). LinkedList stores each element in a separate node with pointers β€” iterating requires following pointers that may point anywhere in memory (cache-unfriendly). On modern CPUs where cache misses are 100x slower than cache hits, this difference is significant.

// Primitive arrays avoid boxing overhead
// IntArray vs Array<Int>
val primitiveArray = IntArray(1000) { it }   // No boxing β€” each int is 4 bytes
val boxedArray = Array(1000) { it }           // Boxing β€” each Int is ~16 bytes

// Iteration performance
fun sumPrimitive(array: IntArray): Long {
    var sum = 0L
    for (i in array.indices) {
        sum += array[i] // Direct memory access
    }
    return sum
}

fun sumBoxed(array: Array<Int>): Long {
    var sum = 0L
    for (element in array) {
        sum += element // Unboxing on every access
    }
    return sum
}
// Primitive version is 3-5x faster due to no unboxing and better cache behavior

// Efficient string building
val result = buildString(estimatedSize) {
    // Pre-sized StringBuilder avoids resizing
    for (item in items) {
        append(item.name)
        append(": ")
        append(item.value)
        appendLine()
    }
}

Key takeaway: Use SparseArray variants for integer keys to avoid boxing. Use ArrayMap instead of HashMap for small maps (<1000 entries). Prefer primitive arrays (IntArray, FloatArray) over boxed arrays. Choose ArrayList over LinkedList for cache-friendly sequential access.

Lesson 10.3: Room Database Performance

Room is the standard database abstraction for Android, wrapping SQLite with compile-time verification and Kotlin coroutines support. But Room’s convenience can mask performance problems β€” inefficient queries, missing indices, and improper transaction usage can make database operations the dominant bottleneck in your app.

Indices are the single most impactful database optimization. Without an index, SQLite performs a full table scan for every query β€” reading every row to find matches. With an index, SQLite uses a B-tree lookup that’s O(log n) instead of O(n). For a table with 100,000 rows, an indexed lookup is roughly 1,000x faster than a table scan.

// Room entity with proper indexing
@Entity(
    tableName = "messages",
    indices = [
        Index(value = ["sender_id"]),
        Index(value = ["timestamp"]),
        // The composite index also covers queries filtering on
        // conversation_id alone (leftmost-prefix rule)
        Index(value = ["conversation_id", "timestamp"])
    ],
    foreignKeys = [
        ForeignKey(
            entity = Conversation::class,
            parentColumns = ["id"],
            childColumns = ["conversation_id"],
            onDelete = ForeignKey.CASCADE
        )
    ]
)
data class Message(
    @PrimaryKey val id: String,
    @ColumnInfo(name = "conversation_id") val conversationId: String,
    @ColumnInfo(name = "sender_id") val senderId: String,
    val content: String,
    val timestamp: Long,
    @ColumnInfo(name = "is_read") val isRead: Boolean = false
)

// DAO with efficient queries
@Dao
interface MessageDao {
    // βœ… Uses composite index (conversation_id, timestamp)
    @Query("""
        SELECT * FROM messages
        WHERE conversation_id = :conversationId
        ORDER BY timestamp DESC
        LIMIT :limit
    """)
    suspend fun getRecentMessages(
        conversationId: String,
        limit: Int = 50
    ): List<Message>

    // βœ… Count with index β€” no full table scan
    @Query("""
        SELECT COUNT(*) FROM messages
        WHERE conversation_id = :conversationId
        AND is_read = 0
    """)
    suspend fun getUnreadCount(conversationId: String): Int

    // βœ… Paging source for efficient large list display
    @Query("""
        SELECT * FROM messages
        WHERE conversation_id = :conversationId
        ORDER BY timestamp DESC
    """)
    fun getMessagesPaging(conversationId: String): PagingSource<Int, Message>

    // βœ… Return only needed columns β€” avoid SELECT *
    @Query("""
        SELECT id, content, timestamp FROM messages
        WHERE conversation_id = :conversationId
        ORDER BY timestamp DESC
        LIMIT :limit
    """)
    suspend fun getMessagePreviews(
        conversationId: String,
        limit: Int = 20
    ): List<MessagePreview>
}

data class MessagePreview(
    val id: String,
    val content: String,
    val timestamp: Long
)

Transactions are critical for batch operations. Without explicit transactions, each insert/update/delete operation runs in its own implicit transaction β€” committing to disk after every operation. For inserting 1000 rows, that’s 1000 disk writes. Wrapping them in a single transaction reduces it to one disk write, making batch operations 100-1000x faster.

// ❌ No explicit transaction β€” each insert commits separately
@Dao
interface SlowDao {
    @Insert
    suspend fun insert(message: Message) // 1 transaction per call

    // Inserting 1000 messages = 1000 disk commits
}

// βœ… Batch operations in a transaction
@Dao
interface FastDao {
    @Insert
    suspend fun insertAll(messages: List<Message>) // Room wraps in transaction

    // For complex operations, use @Transaction
    @Transaction
    suspend fun replaceConversation(
        conversationId: String,
        newMessages: List<Message>
    ) {
        deleteByConversation(conversationId)
        insertAll(newMessages)
        // Both operations in a single transaction
    }

    @Query("DELETE FROM messages WHERE conversation_id = :conversationId")
    suspend fun deleteByConversation(conversationId: String)
}

// Manual transaction for complex logic
class MessageRepository(private val database: AppDatabase) {

    suspend fun syncMessages(
        conversationId: String,
        serverMessages: List<Message>
    ) {
        database.withTransaction {
            val localMessages = database.messageDao()
                .getMessagesByConversation(conversationId)

            val localIds = localMessages.map { it.id }.toSet()
            val serverIds = serverMessages.map { it.id }.toSet()

            // Delete messages removed on server
            val toDelete = localMessages.filter { it.id !in serverIds }
            database.messageDao().deleteMessages(toDelete)

            // Insert new messages
            val toInsert = serverMessages.filter { it.id !in localIds }
            database.messageDao().insertAll(toInsert)

            // Update existing messages
            val toUpdate = serverMessages.filter { it.id in localIds }
            database.messageDao().updateAll(toUpdate)
        }
    }
}

Key takeaway: Add indices to every column used in WHERE clauses, ORDER BY clauses, and JOIN conditions. Wrap batch operations in transactions for 100-1000x speedup. Select only needed columns instead of SELECT *. Use Room’s PagingSource for large datasets.

Lesson 10.4: Query Optimization and EXPLAIN

Writing efficient SQL queries requires understanding how SQLite executes them. The EXPLAIN QUERY PLAN statement shows you how SQLite will execute a query β€” whether it uses an index, performs a table scan, or uses a temporary B-tree for sorting. This information tells you exactly which queries need optimization.

The most common query performance problems are missing indices (full table scans), implicit type conversions (index can’t be used), and N+1 queries (loading a list then loading related data one-by-one in a loop). Room’s compile-time query verification catches syntax errors but not performance problems β€” you need EXPLAIN QUERY PLAN for that.

// Debug query performance with EXPLAIN QUERY PLAN
@Dao
interface DebugDao {
    // Run this in Android Studio's Database Inspector:
    // EXPLAIN QUERY PLAN
    // SELECT * FROM messages WHERE conversation_id = 'abc' ORDER BY timestamp DESC

    // Good output (using index):
    // SEARCH TABLE messages USING INDEX index_messages_conversation_id_timestamp
    //   (conversation_id=?)

    // Bad output (table scan):
    // SCAN TABLE messages
    // USE TEMP B-TREE FOR ORDER BY
}

// N+1 query problem
// ❌ N+1 queries β€” loads conversations, then messages for EACH conversation
suspend fun loadConversationsWithPreviews(): List<ConversationWithPreview> {
    val conversations = conversationDao.getAll()  // 1 query
    return conversations.map { conversation ->
        val lastMessage = messageDao.getLastMessage(conversation.id)  // N queries
        ConversationWithPreview(conversation, lastMessage)
    }
    // Total: 1 + N queries for N conversations
}

// βœ… Constant query count β€” Room @Relation batches the child query
data class ConversationWithMessages(
    @Embedded val conversation: Conversation,
    @Relation(
        parentColumn = "id",
        entityColumn = "conversation_id"
    )
    val messages: List<Message>
)

@Dao
interface ConversationDao {
    @Transaction
    @Query("SELECT * FROM conversations ORDER BY last_activity DESC")
    suspend fun getConversationsWithMessages(): List<ConversationWithMessages>
    // Room runs one extra query (WHERE conversation_id IN (...)): 2 queries total, not N+1
}

// βœ… Custom JOIN for specific needs
@Dao
interface OptimizedDao {
    @Query("""
        SELECT c.*, m.content AS last_message_content, m.timestamp AS last_message_time
        FROM conversations c
        LEFT JOIN messages m ON m.id = (
            SELECT id FROM messages
            WHERE conversation_id = c.id
            ORDER BY timestamp DESC
            LIMIT 1
        )
        ORDER BY COALESCE(m.timestamp, c.created_at) DESC
    """)
    suspend fun getConversationsWithLastMessage(): List<ConversationPreview>
}

Query optimization guidelines: avoid LIKE '%text%' (can’t use index β€” use Full-Text Search instead), prefer IN clauses over multiple OR conditions (optimizer handles IN better), avoid functions on indexed columns in WHERE clauses (WHERE LOWER(name) = 'john' can’t use an index on name), and use LIMIT to avoid loading more data than needed.
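One common fix for the function-on-column pitfall is storing a pre-normalized copy of the value; the users table, name_lower column, and User entity below are illustrative assumptions:

// ❌ WHERE LOWER(name) = :name forces a scan, even with an index on name
// βœ… Store a normalized, indexed column and query it directly
@Entity(
    tableName = "users",
    indices = [Index(value = ["name_lower"])]
)
data class User(
    @PrimaryKey val id: String,
    val name: String,
    @ColumnInfo(name = "name_lower") val nameLower: String // name.lowercase()
)

@Dao
interface UserDao {
    @Query("SELECT * FROM users WHERE name_lower = :nameLower")
    suspend fun findByNameNormalized(nameLower: String): List<User>
}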

// Full-Text Search for text queries (instead of LIKE '%query%')
@Entity(tableName = "messages_fts")
@Fts4(contentEntity = Message::class)
data class MessageFts(
    val content: String
)

@Dao
interface SearchDao {
    // ❌ LIKE β€” can't use index, scans entire table
    @Query("SELECT * FROM messages WHERE content LIKE '%' || :query || '%'")
    suspend fun searchSlow(query: String): List<Message>

    // βœ… FTS β€” uses inverted index, dramatically faster
    @Query("""
        SELECT messages.* FROM messages
        JOIN messages_fts ON messages.rowid = messages_fts.rowid
        WHERE messages_fts MATCH :query
    """)
    suspend fun searchFast(query: String): List<Message>
}

Key takeaway: Use EXPLAIN QUERY PLAN to verify your queries use indices. Eliminate N+1 queries with JOINs or Room’s @Relation. Use Full-Text Search instead of LIKE '%query%' for text search. Avoid applying functions to indexed columns in WHERE clauses.

Lesson 10.5: Avoiding Autoboxing and Hidden Costs

Autoboxing β€” the automatic conversion between primitive types (int, long, boolean) and their wrapper objects (Integer, Long, Boolean) β€” is one of the most insidious performance costs in Kotlin on the JVM. Each boxing operation creates an object on the heap. In hot paths, this generates thousands of unnecessary objects per second, increasing GC pressure.

Kotlin hides boxing well. When you write val x: Int = 42, Kotlin uses the primitive int on the JVM β€” no boxing. But when you write val x: Int? = 42, Kotlin must use Integer because JVM primitives can’t be null. Generic types always use boxed types because of JVM type erasure. List<Int> is actually List<Integer> on the JVM β€” every integer in the list is boxed.

// Autoboxing traps in Kotlin

// βœ… No boxing β€” uses JVM primitive int
val count: Int = 42

// ❌ Boxing β€” nullable requires wrapper object
val maybeCount: Int? = 42 // Becomes Integer on JVM

// ❌ Boxing β€” generic type parameter requires wrapper
val numbers: List<Int> = listOf(1, 2, 3) // List<Integer> on JVM

// βœ… No boxing β€” primitive array
val array: IntArray = intArrayOf(1, 2, 3) // int[] on JVM

// ❌ Boxing β€” typed array
val boxedArray: Array<Int> = arrayOf(1, 2, 3) // Integer[] on JVM

// Performance comparison β€” summing 1 million numbers
fun sumWithBoxing(): Long {
    val numbers: List<Int> = (1..1_000_000).toList() // 1M Integer objects
    return numbers.fold(0L) { acc, n -> acc + n }    // Unboxing on every access
}

fun sumWithoutBoxing(): Long {
    val numbers: IntArray = IntArray(1_000_000) { it + 1 } // No boxing
    var sum = 0L
    for (n in numbers) sum += n  // Direct primitive access
    return sum
}
// sumWithoutBoxing is ~5x faster and uses ~16MB less memory

// Use mutableIntStateOf instead of mutableStateOf<Int> in Compose
// ❌ Boxing β€” mutableStateOf stores Any internally
val count = mutableStateOf(0) // boxes Int to Integer

// βœ… No boxing β€” specialized for Int
val count = mutableIntStateOf(0) // stores int primitive
// Also: mutableLongStateOf, mutableFloatStateOf, mutableDoubleStateOf

Lambda captures have hidden costs. When a lambda captures a primitive variable, the variable is boxed into a Ref object so it can be shared between the lambda and its enclosing scope. This is invisible in the source code but generates allocations. Inline functions avoid this because the lambda body is inlined into the caller β€” there’s no separate scope to share the variable with.

// ⚠️ Looks like a capture trap, but stdlib forEach is inline: the lambda
// body is inlined and 'count' stays a plain int local
fun countOccurrences(text: String, char: Char): Int {
    var count = 0
    text.forEach { c ->
        if (c == char) count++
    }
    return count
}
// With a NON-inline higher-order function, the same capture would box
// 'count' into an IntRef object: invisible in source, real at runtime

// ❌ Non-inline function β€” lambda allocates, primitives box
fun processItems(items: List<Item>, action: (Item) -> Unit) {
    // 'action' is a Function1 object allocated at the call site
    items.forEach(action)
}

// βœ… Inline function β€” no allocation
inline fun processItems(items: List<Item>, action: (Item) -> Unit) {
    items.forEach(action)
}

String operations in loops are another common hidden cost. String concatenation with + or string templates creates a new String object every time. In a loop processing thousands of items, use StringBuilder (or Kotlin’s buildString) to avoid O(nΒ²) string allocation.
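A quick sketch of the difference:

// ❌ Each '+=' allocates a new String and re-copies every previous character: O(nΒ²)
fun joinSlow(items: List<String>): String {
    var result = ""
    for (item in items) result += item + "\n"
    return result
}

// βœ… One backing StringBuilder, amortized O(n)
fun joinFast(items: List<String>): String = buildString {
    for (item in items) appendLine(item)
}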

Key takeaway: Avoid nullable primitives and generic collections for primitives in hot paths β€” use IntArray, SparseArray, and specialized state holders (mutableIntStateOf). Understand that inline eliminates both lambda allocation and primitive boxing for captured variables. Use buildString for string construction in loops.

Lesson 10.6: Performance Testing Strategy

Performance optimization without measurement is guesswork. This lesson ties together all the profiling and measurement techniques from the course into a coherent testing strategy that catches regressions, validates optimizations, and maintains performance over time.

Your performance testing strategy should have three layers. Local development testing uses the Android Studio Profiler, Layout Inspector, and StrictMode to catch problems as you write code. CI/CD testing uses Macrobenchmark and Microbenchmark to catch regressions automatically. Production monitoring uses Firebase Performance, Android Vitals, and custom metrics to track real-world performance across your user base.

// Performance test suite structure
// test/benchmark/
//   StartupBenchmark.kt      β€” cold start, warm start timing
//   ScrollBenchmark.kt       β€” frame timing during scroll
//   NavigationBenchmark.kt   β€” transition timing between screens
//   DatabaseBenchmark.kt     β€” query performance
//   SerializationBenchmark.kt β€” parsing performance

// CI integration β€” compare against baseline
class PerformanceGate {
    data class Threshold(
        val metric: String,
        val maxValue: Double,
        val unit: String
    )

    val thresholds = listOf(
        Threshold("cold_start_p50", 500.0, "ms"),
        Threshold("cold_start_p95", 800.0, "ms"),
        Threshold("scroll_jank_percent", 5.0, "%"),
        Threshold("frame_p95", 16.0, "ms"),
        Threshold("apk_size", 20.0, "MB")
    )

    fun evaluate(results: Map<String, Double>): GateResult {
        val violations = thresholds.mapNotNull { threshold ->
            val actual = results[threshold.metric]
            if (actual != null && actual > threshold.maxValue) {
                "${threshold.metric}: ${actual}${threshold.unit} " +
                    "(max: ${threshold.maxValue}${threshold.unit})"
            } else null
        }

        return GateResult(
            passed = violations.isEmpty(),
            violations = violations
        )
    }

    data class GateResult(
        val passed: Boolean,
        val violations: List<String>
    )
}

// Production monitoring with custom metrics
class PerformanceMonitor(
    private val context: Context,
    private val analytics: Analytics
) {

    fun trackScreenLoad(
        screenName: String,
        loadTimeMs: Long,
        source: String
    ) {
        analytics.track("screen_load", mapOf(
            "screen" to screenName,
            "load_time_ms" to loadTimeMs,
            "source" to source,
            "device_model" to Build.MODEL,
            "api_level" to Build.VERSION.SDK_INT,
            "memory_class" to getMemoryClass()
        ))
    }

    fun trackFrameMetrics(
        screenName: String,
        totalFrames: Int,
        jankyFrames: Int,
        p95FrameTimeMs: Double
    ) {
        analytics.track("frame_metrics", mapOf(
            "screen" to screenName,
            "total_frames" to totalFrames,
            "janky_frames" to jankyFrames,
            "jank_percent" to (jankyFrames * 100.0 / totalFrames),
            "p95_frame_ms" to p95FrameTimeMs
        ))
    }

    private fun getMemoryClass(): Int {
        val activityManager = context.getSystemService(
            Context.ACTIVITY_SERVICE
        ) as ActivityManager
        return activityManager.memoryClass
    }
}

The optimization cycle follows a consistent pattern: measure (profile the current state), identify (find the bottleneck using data), hypothesize (propose a specific optimization), implement (make the smallest possible change), verify (measure again to confirm improvement), and monitor (track in production for regressions). Never skip the measure step β€” intuition about performance is wrong more often than it’s right.

Set performance budgets at the beginning of a project and enforce them throughout development. It’s 10x harder to fix a performance regression that accumulated over months than to prevent it in the first place. Automated benchmarks in CI are your insurance policy β€” they catch regressions the day they’re introduced, when the change is fresh in the author’s mind and easy to fix.

// Complete performance audit checklist
object PerformanceAudit {
    val checklist = listOf(
        "Cold start time < 500ms on mid-range device",
        "P95 frame time < 16ms during scrolling",
        "Memory usage stays flat during sustained use (no leaks)",
        "APK size within budget",
        "No StrictMode violations in release build",
        "Baseline Profiles generated and shipped",
        "R8 enabled with resource shrinking",
        "Database queries use indices (verified with EXPLAIN)",
        "Network responses cached appropriately",
        "Images loaded at display size, not full resolution",
        "No allocations in onDraw() or hot paths",
        "WorkManager used for deferrable background work",
        "Location requests use appropriate priority",
        "Compose recomposition counts reasonable",
        "No unstable parameters in hot-path composables"
    )
}

Key takeaway: Performance optimization follows a cycle: measure, identify, hypothesize, implement, verify, monitor. Automate benchmarks in CI to catch regressions immediately. Monitor production metrics across your real user base β€” lab testing on flagship devices doesn’t represent real-world performance.

Quiz: Micro-Optimizations and Database

What is the performance benefit of inline functions?

  • ❌ They run on a faster thread
  • ❌ They use less memory for the function itself
  • βœ… They eliminate lambda allocation and function call overhead at the call site
  • ❌ They compile to native code

inline copies the function body and lambda body to the call site, eliminating the Function object allocation for lambda parameters and the virtual method call overhead. This is significant in hot paths where the function is called thousands of times per second.

Why are SparseArray variants more efficient than HashMap for integer keys?

  • ❌ They use hash tables internally
  • βœ… They avoid boxing integers and use sorted arrays with binary search
  • ❌ They limit the number of entries
  • ❌ They use native code

SparseArray stores keys as a primitive int[] array (no boxing) and values as an Object[] array, using binary search for lookups. HashMap<Integer, V> boxes every integer key into an Integer object and creates a Node wrapper for each entry, using significantly more memory.

What makes database transactions critical for batch operations?

  • ❌ They provide read consistency
  • ❌ They enable parallel queries
  • βœ… They reduce disk commits from N operations to 1, making batches 100-1000x faster
  • ❌ They increase the database size limit

Without an explicit transaction, each insert/update/delete commits to disk independently. Wrapping 1000 operations in a single transaction batches all changes into one disk write at the end, avoiding 999 unnecessary disk syncs.

What does EXPLAIN QUERY PLAN reveal?

  • ❌ The SQL syntax tree
  • ❌ The estimated execution time
  • βœ… Whether the query uses indices or performs table scans
  • ❌ The amount of memory the query uses

EXPLAIN QUERY PLAN shows SQLite’s execution strategy β€” whether it uses an index (SEARCH USING INDEX), performs a full table scan (SCAN TABLE), or needs temporary storage for sorting (USE TEMP B-TREE). This tells you exactly where to add indices for optimization.

When should you use mutableIntStateOf instead of mutableStateOf in Compose?

  • ❌ When you need thread safety
  • ❌ When the value is nullable
  • βœ… Always for primitive Int state β€” it avoids boxing overhead
  • ❌ When the value changes frequently

mutableStateOf<Int> stores the value as Any internally, requiring boxing every integer into an Integer object. mutableIntStateOf uses a primitive int field, avoiding boxing entirely. The same optimization exists for Long, Float, and Double.

Coding Challenge: Performance-Optimized Data Pipeline

Build a data processing pipeline that applies all the micro-optimizations from this module: object pooling, primitive arrays, efficient data structures, and batch database operations.

// Challenge: Build a SensorDataPipeline that:
// 1. Reads sensor data from a high-frequency source (100 Hz)
// 2. Buffers readings using pre-allocated arrays (no boxing)
// 3. Processes batches using object pooling
// 4. Stores results in Room with batch transactions
// 5. Reports processing statistics without allocations in the hot path

Solution

class SensorDataPipeline(
    private val database: AppDatabase,
    private val scope: CoroutineScope,
    private val bufferSize: Int = 100,
    private val flushIntervalMs: Long = 1000
) {
    // Pre-allocated buffers β€” no boxing
    private var timestampBuffer = LongArray(bufferSize)
    private var xBuffer = FloatArray(bufferSize)
    private var yBuffer = FloatArray(bufferSize)
    private var zBuffer = FloatArray(bufferSize)
    private var bufferIndex = 0

    // Object pool for database entities
    private val entityPool = ArrayDeque<SensorReading>(bufferSize)

    // Statistics β€” primitives only, no allocations
    private var totalReadings = 0L
    private var totalBatches = 0L
    private var totalProcessingNs = 0L
    private var maxBatchProcessingNs = 0L

    init {
        // Pre-populate entity pool
        repeat(bufferSize) {
            entityPool.addLast(SensorReading("", 0L, 0f, 0f, 0f, 0f))
        }

        // Start periodic flush
        scope.launch {
            while (isActive) {
                delay(flushIntervalMs)
                flushBuffer()
            }
        }
    }

    // Hot path β€” called 100 times per second. Zero allocations on the
    // common path; production code would also synchronize buffer access
    // between the sensor thread and the flush coroutine
    fun onSensorReading(timestamp: Long, x: Float, y: Float, z: Float) {
        if (bufferIndex >= bufferSize) {
            // Buffer full β€” trigger an async flush and drop this reading
            scope.launch { flushBuffer() }
            return
        }

        timestampBuffer[bufferIndex] = timestamp
        xBuffer[bufferIndex] = x
        yBuffer[bufferIndex] = y
        zBuffer[bufferIndex] = z
        bufferIndex++
        totalReadings++
    }

    private suspend fun flushBuffer() {
        if (bufferIndex == 0) return

        val count = bufferIndex
        val startNs = System.nanoTime()

        // Copy buffer contents and reset (allows new readings during processing)
        val timestamps = timestampBuffer.copyOf(count)
        val xs = xBuffer.copyOf(count)
        val ys = yBuffer.copyOf(count)
        val zs = zBuffer.copyOf(count)
        bufferIndex = 0

        withContext(Dispatchers.Default) {
            // Process batch β€” compute magnitude, detect anomalies
            val entities = mutableListOf<SensorReading>()

            for (i in 0 until count) {
                val magnitude = kotlin.math.sqrt(
                    xs[i] * xs[i] + ys[i] * ys[i] + zs[i] * zs[i]
                )

                // Reuse from pool: mutate in place, because data-class copy()
                // would allocate a fresh object and defeat the pool
                val entity = if (entityPool.isNotEmpty()) {
                    entityPool.removeFirst().apply {
                        id = "${timestamps[i]}_$i"
                        timestamp = timestamps[i]
                        x = xs[i]; y = ys[i]; z = zs[i]
                        this.magnitude = magnitude
                    }
                } else {
                    SensorReading(
                        id = "${timestamps[i]}_$i",
                        timestamp = timestamps[i],
                        x = xs[i], y = ys[i], z = zs[i],
                        magnitude = magnitude
                    )
                }
                entities.add(entity)
            }

            // Batch insert in single transaction
            withContext(Dispatchers.IO) {
                database.withTransaction {
                    database.sensorDao().insertAll(entities)
                }
            }

            // Return entities to pool
            entities.forEach { entity ->
                if (entityPool.size < bufferSize) {
                    entityPool.addLast(entity)
                }
            }
        }

        val durationNs = System.nanoTime() - startNs
        totalBatches++
        totalProcessingNs += durationNs
        if (durationNs > maxBatchProcessingNs) {
            maxBatchProcessingNs = durationNs
        }
    }

    // Statistics β€” no allocations for primitives
    fun getStats(): PipelineStats = PipelineStats(
        totalReadings = totalReadings,
        totalBatches = totalBatches,
        avgBatchProcessingMs = if (totalBatches > 0)
            (totalProcessingNs / totalBatches) / 1_000_000.0 else 0.0,
        maxBatchProcessingMs = maxBatchProcessingNs / 1_000_000.0,
        currentBufferUsage = bufferIndex,
        poolSize = entityPool.size
    )

    data class PipelineStats(
        val totalReadings: Long,
        val totalBatches: Long,
        val avgBatchProcessingMs: Double,
        val maxBatchProcessingMs: Double,
        val currentBufferUsage: Int,
        val poolSize: Int
    )
}

@Entity(tableName = "sensor_readings")
data class SensorReading(
    @PrimaryKey val id: String,
    val timestamp: Long,
    val x: Float,
    val y: Float,
    val z: Float,
    val magnitude: Float
)

@Dao
interface SensorDao {
    @Insert(onConflict = OnConflictStrategy.REPLACE)
    suspend fun insertAll(readings: List<SensorReading>)
}