13 February 2026
This covers how memory works on Android, why frames drop, and how to find and fix performance problems.
A memory leak happens when an object is no longer needed but something still holds a reference to it, so the garbage collector can’t reclaim it. Activities are the most dangerous objects to leak because each one holds references to its entire view hierarchy, bitmaps, and other resources.
Common causes:
- A Handler callback (or Runnable) holding an implicit reference to the outer Activity
- A ViewModel holding a reference to a View or Activity

// Classic leak: anonymous Runnable holds reference to Activity
class LeakyActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        val handler = Handler(Looper.getMainLooper())
        // This Runnable is an anonymous inner class that holds
        // an implicit reference to LeakyActivity
        handler.postDelayed({
            updateUI() // 'this' reference to Activity lives for 30 seconds
        }, 30_000)
    }

    private fun updateUI() { /* refresh views */ }
}
// Fix: use a WeakReference or cancel the callback in onDestroy
class FixedActivity : AppCompatActivity() {
    private val handler = Handler(Looper.getMainLooper())
    private val updateRunnable = Runnable { updateUI() }

    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        handler.postDelayed(updateRunnable, 30_000)
    }

    override fun onDestroy() {
        super.onDestroy()
        handler.removeCallbacks(updateRunnable) // Activity can now be collected
    }

    private fun updateUI() { /* refresh views */ }
}
Android uses ART’s garbage collector which is generational, concurrent, and moving. Objects are categorized by age — young objects sit in a nursery, and ones that survive multiple GC cycles get promoted to the old generation. Most objects are short-lived, so collecting the young generation frequently is efficient. The concurrent part means GC runs alongside app threads with pause times typically under 1ms. Moving means the GC can relocate objects to reduce fragmentation.
GC roots include local variables on the call stack, static fields, active threads, and JNI references. Any object not reachable from a GC root gets collected. This is why a leaked Activity is a problem — if anything still reachable from a root holds a reference to it, the GC can’t touch it.
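To make the reachability rule concrete, here is a minimal sketch — the ScreenTracker singleton is hypothetical. A Kotlin object lives in a static field, which is a GC root, so anything it references can never be collected:

import android.app.Activity

// Hypothetical singleton. A Kotlin object is held by a static field — a GC root.
object ScreenTracker {
    // Assigning an Activity here makes it reachable from a GC root,
    // so the GC cannot reclaim it until the field is cleared.
    var lastScreen: Activity? = null
}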
LeakCanary is Square’s memory leak detection library. When an Activity or Fragment is destroyed, LeakCanary creates a WeakReference to it and checks if that reference gets enqueued after a GC cycle. If it’s not enqueued, the object wasn’t collected — it’s leaked. LeakCanary then triggers a heap dump and analyzes the reference chain from the leaked object back to the GC root, showing exactly which reference is keeping it alive.
// build.gradle.kts — that's literally all you need
dependencies {
    debugImplementation("com.squareup.leakcanary:leakcanary-android:2.14")
}
It automatically watches Activities, Fragments, Fragment Views, ViewModels, and Services after they’re destroyed. You can also watch custom objects by calling AppWatcher.objectWatcher.expectWeaklyReachable().
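For example, here is a sketch of watching a custom object — SessionManager is hypothetical, while expectWeaklyReachable() is the real LeakCanary API mentioned above:

import leakcanary.AppWatcher

// Hypothetical app class; LeakCanary reports a leak if this instance
// is still strongly reachable a few seconds after destroy() runs.
class SessionManager {
    fun destroy() {
        // ... release listeners, close resources ...
        AppWatcher.objectWatcher.expectWeaklyReachable(
            this, "SessionManager should be collected after destroy()"
        )
    }
}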
Android renders a new frame every 16ms to hit 60 FPS. When producing a frame takes longer than that, the frame is delayed or dropped and the UI stutters. On 90Hz or 120Hz displays, the budget is even tighter — 11ms and 8.3ms respectively.
Each frame goes through three phases: measure/layout (compute sizes and positions), draw (generate display list commands), and RenderThread compositing on the GPU. If main thread work pushes past the frame deadline, the frame misses VSYNC and gets displayed late. Tools like Android Studio’s CPU Profiler and Perfetto help visualize where each frame’s time is spent.
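You can also observe frame timing directly from code. Here is a minimal sketch using Choreographer — the 17ms threshold and log tag are illustrative choices, not fixed values:

import android.util.Log
import android.view.Choreographer

// Logs whenever the gap between consecutive frames blows the ~16.7ms budget.
class FrameWatcher : Choreographer.FrameCallback {
    private var lastFrameNanos = 0L

    override fun doFrame(frameTimeNanos: Long) {
        if (lastFrameNanos != 0L) {
            val elapsedMs = (frameTimeNanos - lastFrameNanos) / 1_000_000
            if (elapsedMs > 17) {
                Log.w("FrameWatcher", "Janky frame: ${elapsedMs}ms since last VSYNC")
            }
        }
        lastFrameNanos = frameTimeNanos
        Choreographer.getInstance().postFrameCallback(this) // keep observing
    }
}

// Start it once, e.g. in onResume():
// Choreographer.getInstance().postFrameCallback(FrameWatcher())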
Overdraw happens when the same pixel is drawn multiple times in a single frame. For example, if you have a background on your Activity, a FrameLayout, and a CardView, that pixel area is drawn three times even though only the top layer is visible. Minor overdraw (2x) is normal, but 4x+ on large areas hurts performance.
Enable “Debug GPU Overdraw” in Developer Options to see it. It color-codes the screen: true color means no overdraw, blue is 1x, green is 2x, pink is 3x, and red is 4x or more.
The fix is usually removing unnecessary backgrounds. Remove android:background from your Activity theme’s window and only set backgrounds where needed. The <merge> tag also helps by eliminating wrapper layout layers.
Large bitmaps are one of the most common causes of OutOfMemoryError. A 12-megapixel photo (4000x3000) at ARGB_8888 takes 48MB of memory. Loading 3-4 of those can crash the app. The solution is subsampling — load the bitmap at a reduced resolution using BitmapFactory.Options.inSampleSize.
fun decodeSampledBitmap(
    resources: Resources,
    resId: Int,
    targetWidth: Int,
    targetHeight: Int
): Bitmap {
    // First, decode bounds only (no memory allocation)
    val options = BitmapFactory.Options().apply {
        inJustDecodeBounds = true
    }
    BitmapFactory.decodeResource(resources, resId, options)

    // Calculate the sample size
    options.inSampleSize = calculateInSampleSize(
        options.outWidth, options.outHeight,
        targetWidth, targetHeight
    )

    // Decode with the sample size
    options.inJustDecodeBounds = false
    return BitmapFactory.decodeResource(resources, resId, options)
}
fun calculateInSampleSize(
    rawWidth: Int, rawHeight: Int,
    targetWidth: Int, targetHeight: Int
): Int {
    var sampleSize = 1
    if (rawHeight > targetHeight || rawWidth > targetWidth) {
        val halfHeight = rawHeight / 2
        val halfWidth = rawWidth / 2
        while (halfHeight / sampleSize >= targetHeight &&
            halfWidth / sampleSize >= targetWidth) {
            sampleSize *= 2
        }
    }
    return sampleSize
}
In practice, image loading libraries like Coil and Glide handle subsampling, caching (memory + disk), lifecycle awareness, and request cancellation automatically.
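A minimal sketch with Coil (io.coil-kt:coil) — the URL and placeholder drawable are illustrative, and imageView is assumed to be in scope:

import coil.load

// Coil decodes at the ImageView's size (subsampling), checks its memory and
// disk caches first, and cancels the request if the view's lifecycle ends.
imageView.load("https://example.com/photo.jpg") {
    crossfade(true)
    placeholder(R.drawable.image_placeholder)
}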
LruCache is a fixed-size cache that evicts the least recently accessed entry when it’s full. You define a max size, and when a new entry would exceed the limit, the cache automatically evicts the oldest entries. It’s a simple and efficient building block for in-memory caching.
val maxMemory = (Runtime.getRuntime().maxMemory() / 1024).toInt()
val cacheSize = maxMemory / 8 // Use 1/8th of available memory

val bitmapCache = object : LruCache<String, Bitmap>(cacheSize) {
    override fun sizeOf(key: String, bitmap: Bitmap): Int {
        // Size in kilobytes
        return bitmap.byteCount / 1024
    }
}

// Usage
bitmapCache.put("profile_photo", bitmap)
val cached: Bitmap? = bitmapCache.get("profile_photo")
A common two-tier strategy uses LruCache for fast in-memory access plus DiskLruCache for persistent disk cache. Image loading libraries do exactly this — they check memory cache first (instant), then disk cache (fast), then network (slow).
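Here is a sketch of that lookup order, reusing bitmapCache from above — diskCache and fetchFromNetwork are hypothetical helpers, not a real API:

import android.graphics.Bitmap

// Memory first, then disk (promoting hits back to memory), then network.
suspend fun loadBitmap(key: String, url: String): Bitmap {
    bitmapCache.get(key)?.let { return it }          // 1. memory: instant

    diskCache.getBitmap(key)?.let { bitmap ->        // 2. disk: fast
        bitmapCache.put(key, bitmap)                 //    promote to memory
        return bitmap
    }

    val bitmap = fetchFromNetwork(url)               // 3. network: slow
    diskCache.putBitmap(key, bitmap)                 //    fill both tiers
    bitmapCache.put(key, bitmap)
    return bitmap
}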
Dalvik used JIT (Just-In-Time) compilation — bytecode was interpreted at runtime, and hot methods were compiled to native code on the fly. This meant faster installs but slower app startup. ART, introduced in Android 5.0, switched to AOT (Ahead-Of-Time) compilation — apps were fully compiled to native code during installation. Apps launched faster, but install times and storage usage increased.
From Android 7.0, ART uses a hybrid approach called profile-guided compilation. The app initially runs with an interpreter and JIT compiler. ART profiles which methods are “hot” and during idle charging, a background daemon AOT-compiles those methods. Over time the app gets faster as more critical paths are compiled.
Baseline profiles solve the first-run problem. Even with profile-guided compilation, the first several launches are slower because no profile exists yet. Baseline profiles let you ship a pre-built profile with your APK or AAB that tells ART which methods to AOT-compile during installation. This way the first launch is as fast as the hundredth.
They improve code execution speed by about 30% from first launch. You generate them using the Macrobenchmark library by writing tests that exercise critical user journeys.
// Baseline profile generator using Macrobenchmark
@RunWith(AndroidJUnit4::class)
class BaselineProfileGenerator {
    @get:Rule
    val rule = BaselineProfileRule()

    @Test
    fun generateBaselineProfile() {
        rule.collect(packageName = "com.example.app") {
            // Start the app
            pressHome()
            startActivityAndWait()

            // Navigate through critical user journeys
            device.findObject(By.text("Feed")).click()
            device.waitForIdle()
            device.findObject(By.text("Profile")).click()
            device.waitForIdle()
        }
    }
}
The generated profile is included in your AAB, and Google Play distributes it with cloud profiles to devices. On devices without Play Store, the profile is bundled in the APK itself.
Android has three startup types: cold (the process doesn’t exist and must be created from scratch — the slowest path), warm (the process is alive but the Activity has to be recreated), and hot (the existing Activity is simply brought back to the foreground — the fastest).
To measure cold start, look at the Displayed line Logcat prints automatically for time-to-initial-display, and call reportFullyDrawn() on your Activity when the first meaningful content is visible — that logs a separate Fully drawn timing.
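A sketch of where that call goes — reportFullyDrawn() is the real Activity API, while loadFeed and renderFeed are hypothetical stand-ins for your data loading and rendering:

import android.os.Bundle
import androidx.appcompat.app.AppCompatActivity

class FeedActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_feed)

        loadFeed { items ->
            renderFeed(items)
            // First meaningful content is on screen — logs the "Fully drawn" line
            reportFullyDrawn()
        }
    }

    // Hypothetical helpers standing in for real loading / rendering.
    private fun loadFeed(onLoaded: (List<String>) -> Unit) { /* ... */ }
    private fun renderFeed(items: List<String>) { /* ... */ }
}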
Common optimizations:
- Use the App Startup library with deferred initialization (see the sketch after this list)
- Reduce dex file size for faster class loading
- Move work out of Application.onCreate() and Activity.onCreate()
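A sketch of the App Startup piece — the Initializer interface is the real androidx.startup API; AnalyticsSdk is a hypothetical library being initialized:

import android.content.Context
import androidx.startup.Initializer

// Hypothetical SDK facade.
class AnalyticsSdk private constructor() {
    companion object {
        fun init(context: Context): AnalyticsSdk = AnalyticsSdk()
    }
}

// Runs via App Startup's single ContentProvider instead of shipping another
// ContentProvider (each one adds cost before Application.onCreate()).
class AnalyticsInitializer : Initializer<AnalyticsSdk> {
    override fun create(context: Context): AnalyticsSdk = AnalyticsSdk.init(context)

    // No upstream initializers required.
    override fun dependencies(): List<Class<out Initializer<*>>> = emptyList()
}

Removing the initializer’s manifest entry and calling AppInitializer.getInstance(context).initializeComponent(...) later turns this into true deferred initialization.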
R8 is Google’s replacement for ProGuard and the default since Android Gradle Plugin 3.4+. It does four things:

- Code shrinking — removes unused classes, methods, and fields
- Resource shrinking — removes unused resources (enabled with shrinkResources)
- Obfuscation — renames classes and members to short names
- Optimization — rewrites code, such as inlining and removing dead branches

// build.gradle.kts
android {
    buildTypes {
        release {
            isMinifyEnabled = true
            isShrinkResources = true
            proguardFiles(
                getDefaultProguardFile("proguard-android-optimize.txt"),
                "proguard-rules.pro"
            )
        }
    }
}
R8 can’t see into reflection, JSON serialization, or certain framework callbacks. You need to write keep rules for classes accessed reflectively. Getting this wrong causes runtime crashes in production that don’t appear in debug builds.
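The simplest keep rule is the @Keep annotation from androidx.annotation. A sketch with a hypothetical DTO that a JSON library populates via reflection:

import androidx.annotation.Keep

// @Keep stops R8 from stripping or renaming this class and its members,
// even though no code references them directly — reflection does.
@Keep
data class UserDto(
    val id: Long = 0,
    val name: String = ""
)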
Macrobenchmark measures real user-facing metrics — startup time, frame timing during scrolling, animation smoothness. It launches your app in a separate process and measures from the outside.
Microbenchmark measures execution time of individual code blocks — a function call, a serialization operation. It runs in-process with JIT warmup for stable measurements.
// Macrobenchmark: measuring startup time
@RunWith(AndroidJUnit4::class)
class StartupBenchmark {
    @get:Rule
    val benchmarkRule = MacrobenchmarkRule()

    @Test
    fun startupColdCompilation() {
        benchmarkRule.measureRepeated(
            packageName = "com.example.app",
            metrics = listOf(StartupTimingMetric()),
            compilationMode = CompilationMode.None(),
            iterations = 5,
            startupMode = StartupMode.COLD
        ) {
            pressHome()
            startActivityAndWait()
        }
    }
}
You need both. Macrobenchmark tells you “startup takes 800ms” but not why. Microbenchmark tells you “this JSON parsing takes 50ms” but not whether it matters in the real user journey. Use Macrobenchmark to find slow areas, then Microbenchmark to optimize the specific bottleneck.
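For comparison, a Microbenchmark sketch — parseFeed and sampleJson are hypothetical stand-ins for the code under test:

import androidx.benchmark.junit4.BenchmarkRule
import androidx.benchmark.junit4.measureRepeated
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

private val sampleJson = """{"items": []}"""
private fun parseFeed(json: String): List<String> = emptyList() // stand-in

@RunWith(AndroidJUnit4::class)
class JsonParsingBenchmark {
    @get:Rule
    val benchmarkRule = BenchmarkRule()

    @Test
    fun parseFeedJson() {
        // Runs the block repeatedly after JIT warmup and reports timing stats
        benchmarkRule.measureRepeated {
            parseFeed(sampleJson)
        }
    }
}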
How does onTrimMemory() work, and what should you do with each level?

The system calls onTrimMemory() on your Application, Activity, Service, and ContentProvider when it needs to reclaim memory. The level parameter tells you how critical the situation is:
- TRIM_MEMORY_RUNNING_LOW / TRIM_MEMORY_RUNNING_CRITICAL — app is in the foreground but the system is low on memory; release non-essential caches
- TRIM_MEMORY_UI_HIDDEN — the user navigated away; release UI-related resources
- TRIM_MEMORY_BACKGROUND / TRIM_MEMORY_MODERATE / TRIM_MEMORY_COMPLETE — app is in the background and increasingly likely to be killed; release as much as possible

The most actionable level is TRIM_MEMORY_UI_HIDDEN — clear your image memory cache, drop preloaded data, release large objects. Libraries like Coil and Glide handle this automatically for their caches. Custom caches should do the same, as in the sketch below.
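A sketch of acting on those levels — imageMemoryCache is a hypothetical cache like the LruCache built earlier:

import android.app.Application
import android.content.ComponentCallbacks2
import android.graphics.Bitmap
import android.util.LruCache

class MyApp : Application() {
    // Hypothetical cache; in a real app this might live in your image loader.
    val imageMemoryCache = LruCache<String, Bitmap>(32 * 1024) // KB budget

    override fun onTrimMemory(level: Int) {
        super.onTrimMemory(level)
        when {
            // UI hidden or backgrounded: clear the whole cache (it's rebuildable)
            level >= ComponentCallbacks2.TRIM_MEMORY_UI_HIDDEN ->
                imageMemoryCache.evictAll()
            // Foreground but memory-pressured: shed the least-recently-used half
            level >= ComponentCallbacks2.TRIM_MEMORY_RUNNING_LOW ->
                imageMemoryCache.trimToSize(imageMemoryCache.maxSize() / 2)
        }
    }
}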
Record a trace with Android Studio Profiler or Perfetto while scrolling. Each frame that takes more than 16ms shows up in red. Click a dropped frame to see what the main thread was doing.
Common causes and fixes:
- onBindViewHolder() doing too much work — use async image loading, avoid creating new objects in onBind
- Full-list notifyDataSetChanged() — use DiffUtil or AsyncListDiffer instead (see the sketch after this list)
- Nested lists recreating view holders — share a RecycledViewPool
- Deep or nested layouts — flatten with ConstraintLayout or migrate to Compose
- Heavy custom onDraw() methods — cache drawing computations, avoid allocations
- Call setHasFixedSize(true) when the RecyclerView size doesn’t change
- requestLayout() calls during scroll triggering unnecessary layout passes
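A DiffUtil sketch using ListAdapter (which wraps AsyncListDiffer) — FeedItem and the bare TextView row are illustrative:

import android.view.ViewGroup
import android.widget.TextView
import androidx.recyclerview.widget.DiffUtil
import androidx.recyclerview.widget.ListAdapter
import androidx.recyclerview.widget.RecyclerView

// Hypothetical item type, for illustration.
data class FeedItem(val id: Long, val title: String)

class FeedAdapter : ListAdapter<FeedItem, FeedAdapter.Holder>(DIFF) {

    class Holder(val textView: TextView) : RecyclerView.ViewHolder(textView)

    override fun onCreateViewHolder(parent: ViewGroup, viewType: Int) =
        Holder(TextView(parent.context))

    override fun onBindViewHolder(holder: Holder, position: Int) {
        // Keep onBind cheap: no allocations, no synchronous decoding or I/O
        holder.textView.text = getItem(position).title
    }

    companion object {
        // Diffing runs on a background thread; only changed rows rebind
        private val DIFF = object : DiffUtil.ItemCallback<FeedItem>() {
            override fun areItemsTheSame(a: FeedItem, b: FeedItem) = a.id == b.id
            override fun areContentsTheSame(a: FeedItem, b: FeedItem) = a == b
        }
    }
}

// Usage: adapter.submitList(newItems) — no notifyDataSetChanged() needed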
What are common sources of OutOfMemoryError, and how do you prevent them?

Common sources: loading large bitmaps at full resolution, leaked Activities piling up as the user navigates, and unbounded in-memory caches.

Prevention:
- Implement onTrimMemory() to release caches proactively
- Request largeHeap=true only as a last resort

You can request GC using System.gc() or Runtime.getRuntime().gc(), but it cannot be forced. The system treats it as a suggestion. ART’s GC decides when and how to collect based on memory pressure, allocation rates, and its own heuristics. Calling System.gc() in production is almost always wrong — it can trigger a full GC pause that hurts performance. The one legitimate use case is in test or benchmarking code where you want a clean memory state.
StrictMode detects things like disk reads or network calls on the main thread. It has two policies: ThreadPolicy flags what the current thread is doing wrong (disk and network access on the main thread), and VmPolicy flags process-wide problems (leaked Cursors, Closeables, and Activities).
// Enable in Application.onCreate() for debug builds only
if (BuildConfig.DEBUG) {
    StrictMode.setThreadPolicy(
        StrictMode.ThreadPolicy.Builder()
            .detectDiskReads()
            .detectDiskWrites()
            .detectNetwork()
            .penaltyLog() // Log to Logcat
            .penaltyFlashScreen() // Flash the screen red
            .build()
    )
    StrictMode.setVmPolicy(
        StrictMode.VmPolicy.Builder()
            .detectLeakedSqlLiteObjects()
            .detectLeakedClosableObjects()
            .detectActivityLeaks()
            .penaltyLog()
            .build()
    )
}
Never enable StrictMode in release builds. In development, it catches things like SharedPreferences.commit() on the main thread, checking File.exists() on the main thread, or forgetting to close a Cursor. Many production ANRs can be prevented by catching these issues early with StrictMode.
What’s the difference between Bitmap.Config.ARGB_8888 and RGB_565?

ARGB_8888 uses 4 bytes per pixel — full transparency support and 16.7 million colors. A 1000x1000 bitmap takes 4MB. RGB_565 uses 2 bytes per pixel — no alpha channel, only 65,536 colors, but exactly half the memory.
ARGB_8888 is the default and right for most images — photos, complex graphics, anything with transparency. RGB_565 is useful for images without transparency when memory is tight, like thumbnails in a long list. The visual difference is usually not noticeable for photos but can cause visible banding in gradients.
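A sketch of opting in when decoding — the drawable id is illustrative, and resources assumes Activity scope:

import android.graphics.Bitmap
import android.graphics.BitmapFactory

// Ask the decoder for 2 bytes/pixel; only safe for images without alpha.
val options = BitmapFactory.Options().apply {
    inPreferredConfig = Bitmap.Config.RGB_565
}
val thumbnail = BitmapFactory.decodeResource(resources, R.drawable.thumb, options)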