Offline-first means that working without a network is the app's default mode, not an edge case. This is one of the most practical system design topics because every mobile app deals with flaky connectivity.
There are levels to offline support. A basic cache means you show stale data when the network is gone — read-only, no writes. Offline-capable means you handle disconnection gracefully, maybe queue a few actions. Offline-first means the local database is the source of truth. Every read hits the local DB. Every write goes to the local DB first. The network is just a sync mechanism that runs in the background.
The architecture is fundamentally different. In an online-first app, the network call is the primary path and the cache is a fallback. In an offline-first app, the local DB is the primary path and the server is where data eventually converges.
The user should be able to perform full CRUD while offline — create, read, update, and delete data without any network connection. All changes are persisted locally and queued for sync. When the device reconnects, the app pushes pending local changes to the server and pulls remote changes. The user shouldn’t need to trigger sync manually.
The app also needs to handle conflicts — what happens when the same record was modified both locally and on the server while offline.
The local database (Room) is the single source of truth. The UI layer observes the local DB via Flow. A sync engine sits between the local DB and the remote API. It has two jobs — push local changes to the server, and pull remote changes into the local DB. A pending operations queue stores all local writes. A WorkManager job triggers sync whenever the network is available.
The data flow is: UI writes to local DB, local DB notifies UI via Flow, sync engine pushes to server in the background. For reads, it’s: UI reads from local DB, sync engine pulls remote changes into local DB, UI gets notified automatically.
Reads from Room typically complete within a few milliseconds. Network calls take tens to hundreds of milliseconds and can fail outright. With the local DB as source of truth, the UI is always responsive regardless of network state. The user can browse, create, and edit data in a subway with zero connectivity.
It also simplifies the data flow. The UI observes one source — Room via Flow. The repository handles syncing Room with the server separately. You don’t need to merge local and remote data at the UI layer.
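As a sketch, the read path is nothing more than a Room DAO exposing a Flow (NoteEntity is the entity defined just below; the query and names here are illustrative):
@Dao
interface NoteDao {
    // Room re-runs this query and re-emits whenever the notes table changes,
    // whether the write came from the user or from the sync engine.
    @Query("SELECT * FROM notes ORDER BY updatedAt DESC")
    fun observeNotes(): Flow<List<NoteEntity>>
}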
Every entity needs sync metadata alongside its business fields. Add a version or updatedAt timestamp for conflict detection, a syncStatus flag to track whether the record is synced, pending, or failed, and a lastSyncedVersion to know the state at the time of the last successful sync.
@Entity(tableName = "notes")
data class NoteEntity(
    @PrimaryKey val id: String,
    val title: String,
    val body: String,
    val updatedAt: Long,
    val version: Int,
    val syncStatus: String, // "SYNCED", "PENDING", "FAILED"
    val lastSyncedVersion: Int
)
The syncStatus flag lets the UI show sync state per item. The version and lastSyncedVersion together tell you if there are local changes, remote changes, or a conflict during sync.
Create a separate Room entity that records every local write as an operation. Each entry stores the operation type (create, update, delete), the entity type, the entity ID, the serialized payload, a timestamp, and a retry count. Process the queue in FIFO order to maintain causality — a create must sync before an update to the same entity.
@Entity(tableName = "pending_operations")
data class PendingOperation(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val operationType: String, // "CREATE", "UPDATE", "DELETE"
    val entityType: String,
    val entityId: String,
    val payload: String,
    val createdAt: Long = System.currentTimeMillis(),
    val retryCount: Int = 0
)
Every user action — even when online — goes through this queue. This keeps the data flow consistent regardless of network state.
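A sketch of a repository write, assuming NoteEntity is @Serializable and that db, the DAOs, and workManager are injected fields. The entity update and the queue insert share one Room transaction so a crash cannot persist one without the other (scheduleSyncWork is defined in the WorkManager section below):
suspend fun saveNote(note: NoteEntity) {
    db.withTransaction {
        // Mark the record pending and record the operation atomically.
        noteDao.upsert(note.copy(syncStatus = "PENDING", updatedAt = System.currentTimeMillis()))
        pendingOpDao.insert(
            PendingOperation(
                operationType = "UPDATE",
                entityType = "note",
                entityId = note.id,
                payload = Json.encodeToString(note)
            )
        )
    }
    scheduleSyncWork(workManager) // sync runs as soon as a network is available
}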
Full sync downloads everything on every cycle, which wastes bandwidth. Delta sync only transfers what changed. The server accepts a syncToken (or since timestamp) and returns only records created, updated, or deleted after that point, along with a new sync token.
The client stores the last sync token locally. On each sync cycle, it sends the token and receives only the delta. It applies the delta to Room — insert new records, update modified ones, soft-delete removed ones. If the token is too old and the server has purged its change log, fall back to a full sync. The server response should include a flag like fullSyncRequired: true for this case.
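A sketch of the pull side, with hypothetical SyncApi and TokenStore types and illustrative DAO methods. For simplicity the server response is assumed to deserialize straight into entities; pullRemoteChanges is the same method the SyncWorker calls later:
data class DeltaResponse(
    val changed: List<NoteEntity>, // simplified: reusing the entity as the DTO
    val deletedIds: List<String>,
    val nextSyncToken: String,
    val fullSyncRequired: Boolean = false
)

class SyncEngine(
    private val api: SyncApi,          // hypothetical remote API
    private val noteDao: NoteDao,
    private val tokenStore: TokenStore // persists the last sync token
) {
    suspend fun pullRemoteChanges() {
        val response = api.getChanges(since = tokenStore.lastToken())
        if (response.fullSyncRequired) {
            fullSync() // server purged its change log; start over
            return
        }
        // A real implementation runs conflict detection (next section)
        // before overwriting local records.
        response.changed.forEach {
            noteDao.upsert(it.copy(syncStatus = "SYNCED", lastSyncedVersion = it.version))
        }
        response.deletedIds.forEach { noteDao.softDelete(it) }
        tokenStore.save(response.nextSyncToken) // persist only after the delta is applied
    }

    private suspend fun fullSync() { /* download everything, reset the token */ }
}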
To detect a conflict, compare the local version and the server version against the last synced version. If only the local version is ahead of lastSyncedVersion, push local changes. If only the server version is ahead, accept the server data. If both are ahead, you have a conflict.
For timestamp-based detection, the logic is similar — compare localUpdatedAt and serverUpdatedAt against lastSyncedAt. The version-based approach is more reliable because timestamps depend on device clocks, which can drift. Server-assigned version numbers are monotonic and don’t have the clock skew problem.
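A minimal sketch of that decision logic, using the version fields on NoteEntity:
sealed interface SyncDecision {
    object PushLocal : SyncDecision
    object AcceptServer : SyncDecision
    object Conflict : SyncDecision
    object NoChange : SyncDecision
}

fun detectConflict(local: NoteEntity, serverVersion: Int): SyncDecision {
    val localAhead = local.version > local.lastSyncedVersion
    val serverAhead = serverVersion > local.lastSyncedVersion
    return when {
        localAhead && serverAhead -> SyncDecision.Conflict // both sides diverged
        localAhead -> SyncDecision.PushLocal               // only we changed it
        serverAhead -> SyncDecision.AcceptServer           // only the server changed it
        else -> SyncDecision.NoChange
    }
}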
Once a conflict is detected, you need a resolution strategy. Most apps use last-write-wins (LWW) for most entities and merge or manual resolution for a few critical ones.
Track which fields changed since the last sync using per-field dirty flags. During sync, if the client changed the title and the server changed the body, take the client’s title and the server’s body — no conflict. If both changed the same field, fall back to LWW or manual resolution for just that field.
data class NoteSyncState(
    val id: String,
    val title: String,
    val body: String,
    val titleDirty: Boolean = false,
    val bodyDirty: Boolean = false,
    val version: Int
)

fun mergeNote(local: NoteSyncState, server: NoteSyncState): NoteSyncState {
    return NoteSyncState(
        id = local.id,
        // Take the local value only when the local side alone changed it.
        // If both sides changed the same field, this falls back to the
        // server's value; swap in LWW or manual resolution here if needed.
        title = if (local.titleDirty && !server.titleDirty) local.title else server.title,
        body = if (local.bodyDirty && !server.bodyDirty) local.body else server.body,
        version = maxOf(local.version, server.version) + 1
    )
}
This approach reduces visible conflicts dramatically. In a note-taking app, one device changing the title and another changing the body is common — without field-level tracking, that’s a false conflict.
Before processing the queue, scan for multiple operations on the same entity and collapse them. CREATE followed by DELETE cancels out — remove both. CREATE followed by UPDATE becomes a single CREATE with the latest data. UPDATE followed by DELETE becomes just DELETE. Multiple UPDATEs collapse into one UPDATE with the final state.
This reduces network traffic and avoids unnecessary server-side churn. Without compaction, creating and deleting a note offline would result in a create request followed by a delete request — two wasted round trips.
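A sketch of compaction for a single entity's operations in FIFO order, assuming only the four sequences described above occur:
fun compact(ops: List<PendingOperation>): List<PendingOperation> {
    if (ops.size < 2) return ops
    val first = ops.first()
    val last = ops.last()
    return when {
        // CREATE ... DELETE: the server never needs to hear about this entity.
        first.operationType == "CREATE" && last.operationType == "DELETE" -> emptyList()
        // CREATE ... UPDATE*: a single CREATE carrying the final payload.
        first.operationType == "CREATE" -> listOf(first.copy(payload = last.payload))
        // UPDATE* ... DELETE: just the DELETE.
        last.operationType == "DELETE" -> listOf(last)
        // UPDATE ... UPDATE: one UPDATE with the final state.
        else -> listOf(last)
    }
}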
WorkManager is the right tool because it guarantees execution even if the app is killed. It respects Doze mode and battery optimization. Schedule a one-time sync worker with a network constraint — it only runs when connectivity is available.
// Constructor-injected dependencies require a custom WorkerFactory
// (or Hilt's @HiltWorker); WorkManager can't construct this class by itself.
class SyncWorker(
    context: Context,
    params: WorkerParameters,
    private val syncEngine: SyncEngine
) : CoroutineWorker(context, params) {

    override suspend fun doWork(): Result {
        return try {
            syncEngine.pushPendingOperations()
            syncEngine.pullRemoteChanges()
            Result.success()
        } catch (e: Exception) {
            if (runAttemptCount < 3) Result.retry() else Result.failure()
        }
    }
}

fun scheduleSyncWork(workManager: WorkManager) {
    val request = OneTimeWorkRequestBuilder<SyncWorker>()
        .setConstraints(
            Constraints.Builder()
                .setRequiredNetworkType(NetworkType.CONNECTED)
                .build()
        )
        .setBackoffCriteria(BackoffPolicy.EXPONENTIAL, 30, TimeUnit.SECONDS)
        .build()
    workManager.enqueueUniqueWork("sync", ExistingWorkPolicy.REPLACE, request)
}
Enqueue sync work whenever the user makes a local change. ExistingWorkPolicy.REPLACE ensures only one sync worker is queued at a time, but note it also cancels an in-flight sync; ExistingWorkPolicy.KEEP leaves a running worker alone, at the cost that a change made mid-sync waits for the next trigger. WorkManager handles retry with exponential backoff automatically when you return Result.retry().
Use ConnectivityManager with a NetworkCallback to observe connectivity changes reactively. When the network comes back, enqueue a sync WorkManager job.
class NetworkMonitor(context: Context) {
    private val cm = context.getSystemService(Context.CONNECTIVITY_SERVICE)
        as ConnectivityManager

    // In production, inject an application-scoped CoroutineScope instead of
    // creating one here, so the callback's lifetime is actually controlled.
    val isOnline: StateFlow<Boolean> = callbackFlow {
        val callback = object : ConnectivityManager.NetworkCallback() {
            override fun onAvailable(network: Network) { trySend(true) }
            override fun onLost(network: Network) { trySend(false) }
        }
        val request = NetworkRequest.Builder()
            .addCapability(NetworkCapabilities.NET_CAPABILITY_INTERNET)
            .build()
        cm.registerNetworkCallback(request, callback)
        awaitClose { cm.unregisterNetworkCallback(callback) }
    }.stateIn(CoroutineScope(Dispatchers.Default), SharingStarted.Eagerly, false)
}
One gotcha — NET_CAPABILITY_INTERNET doesn’t guarantee the server is reachable. You can be connected to Wi-Fi where the router has no upstream connection. Requesting NET_CAPABILITY_VALIDATED instead narrows the gap, since the system has actually verified connectivity, but a robust app should still handle failed network requests gracefully rather than trusting the connectivity flag alone.
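Wiring the monitor to the sync worker is then a few lines. A sketch, assuming an application-scoped CoroutineScope and the scheduleSyncWork function from above:
fun observeAndSync(monitor: NetworkMonitor, workManager: WorkManager, scope: CoroutineScope) {
    monitor.isOnline
        .filter { it }                            // fire only when connectivity returns
        .onEach { scheduleSyncWork(workManager) } // network constraint is already met
        .launchIn(scope)
}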
Schema migrations are harder in an offline-first app than regular migrations because the user might have pending operations in the old schema when the app updates. Room migrations handle the local DB schema change, but the API contract also changes. If the server adds a new required field, old clients syncing without it will fail.
Use API versioning. The client sends its API version with every sync request. The server accepts the old format and applies defaults for missing fields. For breaking changes, support both formats during a transition period. On the client side, the Room migration transforms existing data and pending operations to the new schema.
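A sketch of the client side, assuming a hypothetical version 2 that adds a color field with a default. The pending-operation payloads must be rewritten too; the json_set call shown here requires a SQLite build with the JSON1 extension:
val MIGRATION_1_2 = object : Migration(1, 2) {
    override fun migrate(db: SupportSQLiteDatabase) {
        db.execSQL("ALTER TABLE notes ADD COLUMN color TEXT NOT NULL DEFAULT 'none'")
        // Queued operations still carry v1 payloads; rewrite them so the
        // sync engine never pushes a payload the new API contract rejects.
        db.execSQL("UPDATE pending_operations SET payload = json_set(payload, '$.color', 'none')")
    }
}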
Optimistic updates fall out of this architecture naturally. Apply changes to the local DB immediately and let the UI update via Flow. The user sees the result instantly. In the background, the sync engine pushes the change to the server. If the server rejects it (validation error, conflict, permission denied), revert the local state and show an error.
The rollback needs to be clean. Before applying the optimistic change, snapshot the current state of the record. If the server rejects the change, restore the snapshot and notify the user. For lists, this means the item might briefly appear, then disappear — use animations to make this feel intentional rather than buggy. The vast majority of writes succeed, so the user gets a fast experience.
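A sketch of the rejection path in the sync engine, assuming the pre-change snapshot is stored alongside the pending operation, that ServerRejectedException marks permanent rejections, and that syncErrors is a shared flow the UI observes (all hypothetical names):
suspend fun pushOperation(op: PendingOperation, priorState: NoteEntity?) {
    try {
        api.apply(op) // hypothetical endpoint
        pendingOpDao.delete(op.id)
    } catch (e: ServerRejectedException) {
        // Permanent rejection (validation, conflict, permissions): roll back
        // to the snapshot instead of retrying, then surface the error.
        if (priorState != null) noteDao.upsert(priorState)
        pendingOpDao.delete(op.id)
        syncErrors.emit(SyncError(op.entityId, e.message))
    }
    // Transient failures (e.g. IOException) propagate so WorkManager retries.
}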
Test at three levels. Unit test the sync engine — mock the API and DAO, verify conflict resolution, queue compaction, and retry logic. Integration test the full sync flow — use an in-memory Room database and MockWebServer to simulate conflicts, network failures, and partial syncs.
For E2E, simulate offline by toggling airplane mode or using network conditioning tools. The critical scenarios to test: create items offline then sync, edit the same item on two devices, kill the app during sync and verify no data corruption, and the first sync after a long offline period with many pending operations. Flaky conditions (high latency, packet loss) matter as much as full offline.
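As a concrete example, the field-level merge from earlier is pure Kotlin, so it unit-tests without any Android machinery:
class MergeNoteTest {
    @Test
    fun `client title and server body merge without conflict`() {
        val local = NoteSyncState("1", "new title", "old body", titleDirty = true, version = 3)
        val server = NoteSyncState("1", "old title", "new body", bodyDirty = true, version = 4)

        val merged = mergeNote(local, server)

        assertEquals("new title", merged.title) // client's dirty field wins
        assertEquals("new body", merged.body)   // server's dirty field wins
        assertEquals(5, merged.version)         // max(3, 4) + 1
    }
}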
Large files need chunked uploads. Split the file into fixed-size chunks (e.g., 1MB), upload each chunk separately, and have the server reassemble them. If a chunk fails, retry only that chunk — not the entire file. Track upload progress per chunk in Room so the app can resume after a crash or network loss.
For downloads, use the same chunked approach with Range headers. Store the file locally with a reference in the Room entity. The file and its metadata should sync independently — metadata first, file on demand or when on Wi-Fi. This avoids burning mobile data on large files the user hasn’t opened yet.
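A sketch of the resumable-upload bookkeeping, with illustrative DAO and API names:
@Entity(tableName = "upload_chunks")
data class UploadChunk(
    @PrimaryKey val id: String, // "<fileId>:<chunkIndex>"
    val fileId: String,
    val chunkIndex: Int,
    val offset: Long,
    val size: Int,
    val uploaded: Boolean = false
)

suspend fun resumeUpload(fileId: String, file: File) {
    chunkDao.pendingChunks(fileId).forEach { chunk ->
        val bytes = ByteArray(chunk.size)
        RandomAccessFile(file, "r").use { raf ->
            raf.seek(chunk.offset)
            raf.readFully(bytes)
        }
        api.uploadChunk(fileId, chunk.chunkIndex, bytes) // retried individually on failure
        chunkDao.markUploaded(chunk.id)                  // progress survives a crash
    }
    api.completeUpload(fileId) // server reassembles the chunks in order
}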