Working with Proto DataStore

1. Introduction

What is DataStore?

DataStore is a new and improved data storage solution aimed at replacing SharedPreferences. Built on Kotlin coroutines and Flow, DataStore provides two different implementations: Proto DataStore, which lets you store typed objects (backed by protocol buffers) and Preferences DataStore, which stores key-value pairs. Data is stored asynchronously, consistently, and transactionally, overcoming some of the drawbacks of SharedPreferences.

What you'll learn

  • What DataStore is and why you should use it.
  • How to add DataStore to your project.
  • The differences between Preferences and Proto DataStore and the advantages of each.
  • How to use Proto DataStore.
  • How to migrate from SharedPreferences to Proto DataStore.

What you will build

In this codelab, you're going to start with a sample app that displays a list of tasks that can be filtered by their completed status and can be sorted by priority and deadline.

The boolean flag for the Show completed tasks filter is saved in memory. The sort order is persisted to disk using a SharedPreferences object.

DataStore has two different implementations: Preferences DataStore and Proto DataStore. In this codelab, you will learn how to use Proto DataStore by completing the following tasks:

  • Persist the completed status filter in DataStore.
  • Migrate the sort order from SharedPreferences to DataStore.

We recommend working through the Preferences DataStore codelab too, so you better understand the difference between the two.

What you'll need

For an introduction to Architecture Components, check out the Room with a View codelab. For an introduction to Flow, check out the Advanced Coroutines with Kotlin Flow and LiveData codelab.

2. Getting set up

In this step, you will download the code for the entire codelab and then run a simple example app.

To get you started as quickly as possible, we have prepared a starter project for you to build on.

If you have git installed, you can simply run the command below. To check whether git is installed, type git --version in the terminal or command line and verify that it executes correctly.

 git clone https://github.com/android/codelab-android-datastore

The initial state is in the master branch. The solution code is located in the proto_datastore branch.

If you do not have git, you can click the following button to download all of the code for this codelab:

Download source code

  1. Unzip the code, and then open the project in Android Studio Arctic Fox.
  2. Run the app run configuration on a device or emulator.

The app runs and displays the list of tasks.

3. Project overview

The app allows you to see a list of tasks. Each task has the following properties: name, completed status, priority, and deadline.

To simplify the code we need to work with, the app allows you to do only two actions:

  • Toggle Show completed tasks visibility - by default the tasks are hidden
  • Sort the tasks by priority, by deadline or by deadline and priority

The app follows the architecture recommended in the Guide to app architecture. Here's what you will find in each package:

data

  • The Task model class.
  • TasksRepository class - responsible for providing the tasks. For simplicity, it returns hardcoded data and exposes it via a Flow to represent a more realistic scenario.
  • UserPreferencesRepository class - holds the SortOrder, defined as an enum. The current sort order is saved in SharedPreferences as a String, based on the enum value name. It exposes synchronous methods to save and get the sort order.

ui

  • Classes related to displaying an Activity with a RecyclerView.
  • The TasksViewModel class is responsible for the UI logic.

TasksViewModel - holds all the elements necessary to build the data that needs to be displayed in the UI: the list of tasks, the showCompleted and sortOrder flags, wrapped in a TasksUiModel object. Every time one of these values changes, we have to reconstruct a new TasksUiModel. To do this, we combine 3 elements:

  • A Flow<List<Task>> is retrieved from the TasksRepository.
  • A MutableStateFlow<Boolean> holding the latest showCompleted flag which is only kept in memory.
  • A MutableStateFlow<SortOrder> holding the latest sortOrder value.

To ensure that we're updating the UI correctly, only when the Activity is started, we expose a LiveData<TasksUiModel>.

We have a couple of problems with our code:

  • We block the UI thread on disk IO when initializing UserPreferencesRepository.sortOrder. This can result in UI jank.
  • The showCompleted flag is only kept in memory, so this means it will be reset every time the user opens the app. Like the SortOrder, this should be persisted to survive closing the app.
  • We're currently using SharedPreferences to persist data but we keep a MutableStateFlow in memory, that we modify manually, to be able to be notified of changes. This breaks easily if the value is modified somewhere else in the application.
  • In UserPreferencesRepository we expose two methods for updating the sort order: enableSortByDeadline() and enableSortByPriority(). Both of these methods rely on the current sort order value, so if one is called before the other has finished, we end up with the wrong final value. Moreover, these methods can result in UI jank and Strict Mode violations because they're called on the UI thread.
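To make the last problem concrete, here's a minimal sketch in plain Kotlin of how two read-then-write calls can interleave. The SortOrder enum mirrors the one in the app, but withDeadline(), withPriority() and the local stored variable are hypothetical stand-ins for illustration, not code from the sample project.

```kotlin
enum class SortOrder { NONE, BY_DEADLINE, BY_PRIORITY, BY_DEADLINE_AND_PRIORITY }

// Hypothetical helpers: compute the new sort order from the value a caller has read.
fun withDeadline(current: SortOrder): SortOrder =
    if (current == SortOrder.BY_PRIORITY) SortOrder.BY_DEADLINE_AND_PRIORITY else SortOrder.BY_DEADLINE

fun withPriority(current: SortOrder): SortOrder =
    if (current == SortOrder.BY_DEADLINE) SortOrder.BY_DEADLINE_AND_PRIORITY else SortOrder.BY_PRIORITY

// Both callers read the stored value before either of them writes.
fun interleavedResult(): SortOrder {
    var stored = SortOrder.NONE
    val readByDeadlineCall = stored
    val readByPriorityCall = stored
    stored = withDeadline(readByDeadlineCall)  // writes BY_DEADLINE
    stored = withPriority(readByPriorityCall)  // overwrites it, based on a stale read
    return stored
}

// Each caller finishes its read and write before the next one starts.
fun sequentialResult(): SortOrder {
    var stored = SortOrder.NONE
    stored = withDeadline(stored)
    stored = withPriority(stored)
    return stored
}
```

In the interleaved case the deadline update is lost, while the sequential case keeps both. This is the kind of conflict a transactional update API is meant to rule out.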

Although both the showCompleted and sortOrder flags are user preferences, currently they're represented as two different objects. So one of our goals will be to unify these two flags under a UserPreferences class.

Let's find out how to use DataStore to help us with these issues.

4. DataStore - the basics

Often you might find yourself needing to store small or simple data sets. In the past, you might have used SharedPreferences for this, but that API has a series of drawbacks. The Jetpack DataStore library aims to address those issues by providing a simple, safer and asynchronous API for storing data. It provides 2 different implementations:

  • Preferences DataStore
  • Proto DataStore

| Feature | SharedPreferences | PreferencesDataStore | ProtoDataStore |
|---|---|---|---|
| Async API | ✅ (only for reading changed values, via listener) | ✅ (via Flow and RxJava 2 & 3 Flowable) | ✅ (via Flow and RxJava 2 & 3 Flowable) |
| Synchronous API | ✅ (but not safe to call on UI thread) | ❌ | ❌ |
| Safe to call on UI thread | ❌(1) | ✅ (work is moved to Dispatchers.IO under the hood) | ✅ (work is moved to Dispatchers.IO under the hood) |
| Can signal errors | ❌ | ✅ | ✅ |
| Safe from runtime exceptions | ❌(2) | ✅ | ✅ |
| Has a transactional API with strong consistency guarantees | ❌ | ✅ | ✅ |
| Handles data migration | ❌ | ✅ | ✅ |
| Type safety | ❌ | ❌ | ✅ with Protocol Buffers |

(1) SharedPreferences has a synchronous API that can appear safe to call on the UI thread, but which actually does disk I/O operations. Furthermore, apply() blocks the UI thread on fsync(). Pending fsync() calls are triggered every time any service starts or stops, and every time an activity starts or stops anywhere in your application. The UI thread is blocked on pending fsync() calls scheduled by apply(), often becoming a source of ANRs.

(2) SharedPreferences throws parsing errors as runtime exceptions.

Preferences vs Proto DataStore

While both Preferences and Proto DataStore allow saving data, they do this in different ways:

  • Preferences DataStore, like SharedPreferences, accesses data based on keys, without defining a schema upfront.
  • Proto DataStore defines the schema using Protocol buffers. Using Protobufs allows persisting strongly typed data. Protobufs are faster, smaller, simpler, and less ambiguous than XML and other similar data formats. While Proto DataStore requires you to learn a new serialization mechanism, we believe that the strong typing it brings is worth it.
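The difference can be sketched in plain Kotlin, without any DataStore APIs. In the example below, the Map stands in for key-based storage and the TypedPrefs data class stands in for a schema-backed object. Both are hypothetical models for illustration, not the real Preferences or Proto DataStore types.

```kotlin
// Key-based storage: nothing ties a key to a type, so a mistake only surfaces at runtime.
// Here the value was stored as a String by mistake.
val untypedPrefs: Map<String, Any> = mapOf("show_completed" to "true")

fun readShowCompletedUntyped(prefs: Map<String, Any>): Result<Boolean> =
    runCatching { prefs["show_completed"] as Boolean }

// Schema-based storage: the field type is fixed by the definition,
// so writing a String into showCompleted would not compile.
data class TypedPrefs(val showCompleted: Boolean = false)

fun readShowCompletedTyped(prefs: TypedPrefs): Boolean = prefs.showCompleted
```

Reading the mistyped key fails with a ClassCastException at runtime, whereas the typed version cannot hold the wrong type in the first place.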

Room vs DataStore

If you have a need for partial updates, referential integrity, or large/complex datasets, you should consider using Room instead of DataStore. DataStore is ideal for small or simple datasets and does not support partial updates or referential integrity.

5. Proto DataStore - overview

One of the downsides of SharedPreferences and Preferences DataStore is that there is no way to define a schema or to ensure that keys are accessed with the correct type. Proto DataStore addresses this problem by using Protocol buffers to define the schema. With protos, DataStore knows what types are stored and simply provides them, removing the need for keys.

Let's see how to add Proto DataStore and Protobufs to the project, what Protocol buffers are and how to use them with Proto DataStore and how to migrate SharedPreferences to DataStore.

Adding dependencies

To work with Proto DataStore and get Protobuf to generate code for our schema, we'll have to make several changes to your module's build.gradle file:

  • Add the Protobuf plugin
  • Add the Protobuf and Proto DataStore dependencies
  • Configure Protobuf

plugins {
    ...
    id "com.google.protobuf" version "0.8.17"
}

dependencies {
    implementation  "androidx.datastore:datastore:1.0.0"
    implementation  "com.google.protobuf:protobuf-javalite:3.18.0"
    ...
}

protobuf {
    protoc {
        artifact = "com.google.protobuf:protoc:21.7"
    }

    // Generates the java Protobuf-lite code for the Protobufs in this project. See
    // https://github.com/google/protobuf-gradle-plugin#customizing-protobuf-compilation
    // for more information.
    generateProtoTasks {
        all().each { task ->
            task.builtins {
                java {
                    option 'lite'
                }
            }
        }
    }
}

6. Defining and using protobuf objects

Protocol buffers are a mechanism for serializing structured data. You define how you want your data to be structured once and then the compiler generates source code to easily write and read the structured data.

Create the proto file

You define your schema in a proto file. In our codelab we have 2 user preferences: show_completed and sort_order. Currently they're represented as two different objects, so one of our goals is to unify these two flags under a UserPreferences class that gets stored in DataStore. Instead of defining this class in Kotlin, we will define it in a protobuf schema.

Check out the Proto language guide for in depth info on the syntax. In this codelab we're only going to focus on the types we need.

Create a new file called user_prefs.proto in the app/src/main/proto directory. If you don't see this folder structure, switch to Project view. In protobufs, each structure is defined using the message keyword. Each member of the structure is declared inside the message with a type, a name and a unique field number, starting from 1. Let's define a UserPreferences message that, for now, just has a boolean value called show_completed.

syntax = "proto3";

option java_package = "com.codelab.android.datastore";
option java_multiple_files = true;

message UserPreferences {
  // filter for showing / hiding completed tasks
  bool show_completed = 1;
}

Create the serializer

To tell DataStore how to read and write the data type we defined in the proto file, we need to implement a Serializer. The Serializer also defines the default value to be returned if there's no data on disk. Create a new file called UserPreferencesSerializer in the data package:

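import androidx.datastore.core.CorruptionException
import androidx.datastore.core.Serializer
import com.codelab.android.datastore.UserPreferences
import com.google.protobuf.InvalidProtocolBufferException
import java.io.InputStream
import java.io.OutputStream

object UserPreferencesSerializer : Serializer<UserPreferences> {
    // Returned when no data has been written to disk yet
    override val defaultValue: UserPreferences = UserPreferences.getDefaultInstance()

    override suspend fun readFrom(input: InputStream): UserPreferences {
        try {
            return UserPreferences.parseFrom(input)
        } catch (exception: InvalidProtocolBufferException) {
            // Report unreadable bytes as corrupted data
            throw CorruptionException("Cannot read proto.", exception)
        }
    }

    override suspend fun writeTo(t: UserPreferences, output: OutputStream) = t.writeTo(output)
}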
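If you'd like to check the serializer in isolation, a round trip through in-memory streams is one way to do it. The sketch below assumes it lives in the same package as UserPreferencesSerializer and that kotlinx.coroutines is on the classpath. The roundTripShowCompleted() function is a hypothetical helper and not part of the codelab's solution.

```kotlin
import com.codelab.android.datastore.UserPreferences
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import kotlinx.coroutines.runBlocking

// Hypothetical helper: writes a UserPreferences object with the serializer,
// reads it back, and returns the restored show_completed flag.
fun roundTripShowCompleted(value: Boolean): Boolean = runBlocking {
    val original = UserPreferences.newBuilder()
        .setShowCompleted(value)
        .build()

    // Serialize into an in-memory buffer instead of a file
    val buffer = ByteArrayOutputStream()
    UserPreferencesSerializer.writeTo(original, buffer)

    // Deserialize from the same bytes
    val restored = UserPreferencesSerializer.readFrom(ByteArrayInputStream(buffer.toByteArray()))
    restored.showCompleted
}
```

If the serializer is wired up correctly, the function should return the same value it was given.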

7. Persisting data in Proto DataStore

Creating the DataStore

The showCompleted flag is kept in memory, in TasksViewModel but it should be stored in UserPreferencesRepository, in a DataStore instance.

To create a DataStore instance we use the dataStore delegate, with the Context as receiver. The delegate has two mandatory parameters:

  • The name of the file that DataStore will act on.
  • The serializer for the type used with DataStore. In our case: UserPreferencesSerializer.

For simplicity, in this codelab, let's do this in TasksActivity.kt:

private const val USER_PREFERENCES_NAME = "user_preferences"
private const val DATA_STORE_FILE_NAME = "user_prefs.pb"
private const val SORT_ORDER_KEY = "sort_order"

private val Context.userPreferencesStore: DataStore<UserPreferences> by dataStore(
    fileName = DATA_STORE_FILE_NAME,
    serializer = UserPreferencesSerializer
)

class TasksActivity: AppCompatActivity() { ... }

The dataStore delegate ensures that we have a single instance of DataStore with that name in our application. Currently, UserPreferencesRepository is implemented as a singleton, because it holds the sortOrderFlow and avoids having it tied to the lifecycle of the TasksActivity. Because UserPreferencesRepository will just work with the data from DataStore and won't create or hold any new objects, we can already remove the singleton implementation:

  • Remove the companion object
  • Make the constructor public

The UserPreferencesRepository should get a DataStore instance as a constructor parameter. For now, we can leave the Context as a parameter as it's needed by SharedPreferences, but we'll remove it later on.

class UserPreferencesRepository(
    private val userPreferencesStore: DataStore<UserPreferences>,
    context: Context
) { ... }

Let's update the construction of UserPreferencesRepository in TasksActivity and pass in the dataStore:

viewModel = ViewModelProvider(
    this,
    TasksViewModelFactory(
        TasksRepository,
        UserPreferencesRepository(userPreferencesStore, this)
    )
).get(TasksViewModel::class.java)

Reading data from Proto DataStore

Proto DataStore exposes the stored data as a Flow<UserPreferences>. Let's create a public userPreferencesFlow: Flow<UserPreferences> value that gets assigned userPreferencesStore.data:

val userPreferencesFlow: Flow<UserPreferences> = userPreferencesStore.data

Handling exceptions while reading data

As DataStore reads data from a file, an IOException is thrown when an error occurs while reading data. We can handle this by using the catch Flow transformation to log the error and emit the default value instead:

private val TAG: String = "UserPreferencesRepo"

val userPreferencesFlow: Flow<UserPreferences> = userPreferencesStore.data
    .catch { exception ->
        // dataStore.data throws an IOException when an error is encountered when reading data
        if (exception is IOException) {
            Log.e(TAG, "Error reading sort order preferences.", exception)
            emit(UserPreferences.getDefaultInstance())
        } else {
            throw exception
        }
    }

Writing data to Proto DataStore

To write data, DataStore offers a suspending DataStore.updateData() function, which passes us the current state of UserPreferences as a parameter. To update it, we'll have to convert the preferences object to a builder, set the new value and then build the new preferences.

updateData() updates the data transactionally in an atomic read-modify-write operation. The coroutine completes once the data is persisted on disk.

Let's create a suspend function called updateShowCompleted() that allows us to update the showCompleted property of UserPreferences. It calls userPreferencesStore.updateData() and sets the new value:

suspend fun updateShowCompleted(completed: Boolean) {
    userPreferencesStore.updateData { preferences ->
        preferences.toBuilder().setShowCompleted(completed).build()
    }
}
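To build some intuition for what an atomic read-modify-write buys us, here's a simplified in-memory model in plain Kotlin. InMemoryStore and countWithConcurrentUpdates() are hypothetical names for this sketch. The model uses a plain lock and is not how DataStore is implemented, but it shows the same contract: the transform always runs against the latest value.

```kotlin
import kotlin.concurrent.thread

// Simplified stand-in for a store with a transactional update; NOT DataStore itself.
class InMemoryStore<T>(initial: T) {
    private val lock = Any()
    private var value: T = initial

    fun read(): T = synchronized(lock) { value }

    // Reads the latest value, applies the transform and writes the result
    // while holding the lock, so no other update can slip in between.
    fun update(transform: (T) -> T): T = synchronized(lock) {
        value = transform(value)
        value
    }
}

// Starts several threads that each increment the stored counter many times.
fun countWithConcurrentUpdates(threads: Int, updatesPerThread: Int): Int {
    val store = InMemoryStore(0)
    val workers = List(threads) {
        thread { repeat(updatesPerThread) { store.update { it + 1 } } }
    }
    workers.forEach { it.join() }
    return store.read()
}
```

Because every increment is applied to the latest value, no update is lost, regardless of how the threads are scheduled.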

At this point, the app should compile but the functionality we just created in UserPreferencesRepository is not used.

8. SharedPreferences to Proto DataStore

Defining the data to be saved in proto

The sort order is saved in SharedPreferences. Let's move it to DataStore. To do this, let's start by updating UserPreferences in the proto file to also store the sort order. As SortOrder is an enum, we will have to define it in our UserPreferences message. Enums are defined in protobufs similarly to Kotlin.

For enumerations, the default value is the first value listed in the enum's type definition. But, when migrating from SharedPreferences we need to know whether the value we got is the default value or the one previously set in SharedPreferences. To help with this, we define a new value to our SortOrder enum: UNSPECIFIED and list it first, so it can be the default value.
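As a rough illustration of why the extra value helps, the sketch below models the default-value rule in plain Kotlin. The readSortOrder() helper and its nullable storedNumber parameter are assumptions made for this example. With Proto DataStore, the generated protobuf class takes care of this for you.

```kotlin
enum class SortOrder { UNSPECIFIED, NONE, BY_DEADLINE, BY_PRIORITY, BY_DEADLINE_AND_PRIORITY }

// Models the rule described above: a field that was never written (null here)
// reads back as the first value listed in the enum.
fun readSortOrder(storedNumber: Int?): SortOrder = SortOrder.values()[storedNumber ?: 0]

// With UNSPECIFIED listed first, "never written" can be told apart from an explicit NONE.
fun wasNeverSet(order: SortOrder): Boolean = order == SortOrder.UNSPECIFIED
```

Had NONE been listed first, a user who explicitly chose no sorting would be indistinguishable from one whose preference was never stored.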

Our user_prefs.proto file should look like this:

syntax = "proto3";

option java_package = "com.codelab.android.datastore";
option java_multiple_files = true;

message UserPreferences {
  // filter for showing / hiding completed tasks
  bool show_completed = 1;

  // defines tasks sorting order: no order, by deadline, by priority, by deadline and priority
  enum SortOrder {
    UNSPECIFIED = 0;
    NONE = 1;
    BY_DEADLINE = 2;
    BY_PRIORITY = 3;
    BY_DEADLINE_AND_PRIORITY = 4;
  }

  // user selected tasks sorting order
  SortOrder sort_order = 2;
}

Clean and rebuild your project to ensure that a new UserPreferences object is generated, containing the new field.

Now that SortOrder is defined in the proto file, we can remove the declaration from UserPreferencesRepository. Delete:

enum class SortOrder {
    NONE,
    BY_DEADLINE,
    BY_PRIORITY,
    BY_DEADLINE_AND_PRIORITY
}

Make sure the right SortOrder import is used everywhere:

import com.codelab.android.datastore.UserPreferences.SortOrder

In TasksViewModel.filterSortTasks() we perform different actions based on the SortOrder value. Now that we've added the UNSPECIFIED option, we need to add another case to the when(sortOrder) statement. As we don't want to handle any options beyond the ones we have right now, we can just throw an UnsupportedOperationException in all other cases.

Our filterSortTasks() function looks like this now:

private fun filterSortTasks(
    tasks: List<Task>,
    showCompleted: Boolean,
    sortOrder: SortOrder
): List<Task> {
    // filter the tasks
    val filteredTasks = if (showCompleted) {
        tasks
    } else {
        tasks.filter { !it.completed }
    }
    // sort the tasks
    return when (sortOrder) {
        SortOrder.UNSPECIFIED -> filteredTasks
        SortOrder.NONE -> filteredTasks
        SortOrder.BY_DEADLINE -> filteredTasks.sortedByDescending { it.deadline }
        SortOrder.BY_PRIORITY -> filteredTasks.sortedBy { it.priority }
        SortOrder.BY_DEADLINE_AND_PRIORITY -> filteredTasks.sortedWith(
            compareByDescending<Task> { it.deadline }.thenBy { it.priority }
        )
        // We shouldn't get any other values
        else -> throw UnsupportedOperationException("$sortOrder not supported")
    }
}

Migrating from SharedPreferences

To help with migration, DataStore defines the SharedPreferencesMigration class. The dataStore delegate that creates the DataStore (used in TasksActivity) also exposes a produceMigrations parameter. In this block we create the list of DataMigrations that should be run for this DataStore instance. In our case, we have only one migration: the SharedPreferencesMigration.

When implementing a SharedPreferencesMigration, the migrate block gives us two parameters:

  • SharedPreferencesView that allows us to retrieve data from SharedPreferences
  • UserPreferences current data

We will have to return a UserPreferences object.

When implementing the migrate block, we'll have to do the following steps:

  1. Check the sortOrder value in UserPreferences.
  2. If this is SortOrder.UNSPECIFIED it means that we need to retrieve the value from SharedPreferences. If the SortOrder is missing then we can use SortOrder.NONE as default.
  3. Once we get the sort order, we'll have to convert the UserPreferences object to builder, set the sort order and then build the object again by calling build(). No other fields will be affected with this change.
  4. If the sortOrder value in UserPreferences is not SortOrder.UNSPECIFIED, we can just return the current data we got in migrate, since the migration must have already run successfully.

// The migration needs a Context, so we create it from the one that produceMigrations provides
private fun sharedPrefsMigration(context: Context) = SharedPreferencesMigration(
    context,
    USER_PREFERENCES_NAME
) { sharedPrefs: SharedPreferencesView, currentData: UserPreferences ->
    // Define the mapping from SharedPreferences to UserPreferences
    if (currentData.sortOrder == SortOrder.UNSPECIFIED) {
        currentData.toBuilder().setSortOrder(
            SortOrder.valueOf(
                sharedPrefs.getString(SORT_ORDER_KEY, SortOrder.NONE.name)!!
            )
        ).build()
    } else {
        currentData
    }
}

Now that we defined the migration logic, we need to tell DataStore that it should use it. For this, update the dataStore delegate in TasksActivity.kt and pass a lambda to the produceMigrations parameter that returns a list containing our SharedPreferencesMigration:

private val Context.userPreferencesStore: DataStore<UserPreferences> by dataStore(
    fileName = DATA_STORE_FILE_NAME,
    serializer = UserPreferencesSerializer,
    produceMigrations = { context -> listOf(sharedPrefsMigration(context)) }
)
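The decision made inside the migrate block can also be pulled out into a pure function, which makes it easy to reason about without a device. The sketch below is one possible shape, under some assumptions: migrateSortOrder() is a hypothetical helper, the local SortOrder enum stands in for the generated one, and the fallback to NONE for unknown names is an addition that the codelab's migration does not have (it would throw instead).

```kotlin
enum class SortOrder { UNSPECIFIED, NONE, BY_DEADLINE, BY_PRIORITY, BY_DEADLINE_AND_PRIORITY }

// current: the sort order already stored in DataStore
// legacyValue: the enum name read from SharedPreferences, or null if the key is missing
fun migrateSortOrder(current: SortOrder, legacyValue: String?): SortOrder {
    // Anything other than UNSPECIFIED means the value was already set, so keep it.
    if (current != SortOrder.UNSPECIFIED) return current
    // No legacy value stored: use NONE as the default.
    val name = legacyValue ?: return SortOrder.NONE
    // Assumption: fall back to NONE if the stored name doesn't match any enum value.
    return runCatching { SortOrder.valueOf(name) }.getOrDefault(SortOrder.NONE)
}
```

Whether to fall back silently or to fail loudly on an unexpected stored value is a design choice. Falling back keeps the app usable, at the cost of hiding bad data.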

Saving the sort order to DataStore

To update the sort order when enableSortByDeadline() and enableSortByPriority() are called, we have to do the following:

  • Move their respective logic into the lambda passed to userPreferencesStore.updateData().
  • As updateData() is a suspend function, enableSortByDeadline() and enableSortByPriority() should also be made suspend functions.
  • Use the current UserPreferences received from updateData() to construct the new sort order.
  • Update the UserPreferences by converting it to a builder, setting the new sort order and then building the preferences again.

Here's what the enableSortByDeadline() implementation looks like. We'll let you make the changes for enableSortByPriority() by yourself.

suspend fun enableSortByDeadline(enable: Boolean) {
    // updateData handles data transactionally, ensuring that if the sort is updated at the same
    // time from another thread, we won't have conflicts
    userPreferencesStore.updateData { preferences ->
        val currentOrder = preferences.sortOrder
        val newSortOrder =
            if (enable) {
                if (currentOrder == SortOrder.BY_PRIORITY) {
                    SortOrder.BY_DEADLINE_AND_PRIORITY
                } else {
                    SortOrder.BY_DEADLINE
                }
            } else {
                if (currentOrder == SortOrder.BY_DEADLINE_AND_PRIORITY) {
                    SortOrder.BY_PRIORITY
                } else {
                    SortOrder.NONE
                }
            }
        preferences.toBuilder().setSortOrder(newSortOrder).build()
    }
}
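If you'd like to check your work on enableSortByPriority(), the state transition mirrors the one above with the roles of deadline and priority swapped. Here it is sketched as a pure function in plain Kotlin. The sortOrderAfterPriorityToggle() name and the local SortOrder enum are assumptions for this example. In the repository, the same logic would sit inside the updateData() lambda.

```kotlin
enum class SortOrder { UNSPECIFIED, NONE, BY_DEADLINE, BY_PRIORITY, BY_DEADLINE_AND_PRIORITY }

// Computes the next sort order when sorting by priority is switched on or off.
fun sortOrderAfterPriorityToggle(current: SortOrder, enable: Boolean): SortOrder =
    if (enable) {
        // Keep an existing deadline sort when adding priority
        if (current == SortOrder.BY_DEADLINE) SortOrder.BY_DEADLINE_AND_PRIORITY
        else SortOrder.BY_PRIORITY
    } else {
        // Drop priority but keep the deadline sort if it was part of a combined order
        if (current == SortOrder.BY_DEADLINE_AND_PRIORITY) SortOrder.BY_DEADLINE
        else SortOrder.NONE
    }
```

Keeping the transition free of DataStore calls also makes it straightforward to cover with a plain unit test.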

Now you can remove the context constructor parameter and all the usages of SharedPreferences.

9. Update TasksViewModel to use UserPreferencesRepository

Now that UserPreferencesRepository stores both show_completed and sort_order flags in DataStore and exposes a Flow<UserPreferences>, let's update the TasksViewModel to use them.

Remove showCompletedFlow and sortOrderFlow and instead, create a value called userPreferencesFlow that gets initialized with userPreferencesRepository.userPreferencesFlow:

private val userPreferencesFlow = userPreferencesRepository.userPreferencesFlow

In the tasksUiModelFlow creation, replace showCompletedFlow and sortOrderFlow with userPreferencesFlow. Replace the parameters accordingly.

When calling filterSortTasks pass in the showCompleted and sortOrder of the userPreferences. Your code should look like this:

private val tasksUiModelFlow = combine(
        repository.tasks,
        userPreferencesFlow
    ) { tasks: List<Task>, userPreferences: UserPreferences ->
        return@combine TasksUiModel(
            tasks = filterSortTasks(
                tasks,
                userPreferences.showCompleted,
                userPreferences.sortOrder
            ),
            showCompleted = userPreferences.showCompleted,
            sortOrder = userPreferences.sortOrder
        )
    }

The showCompletedTasks() function should now be updated to call userPreferencesRepository.updateShowCompleted(). As this is a suspend function, create a new coroutine in the viewModelScope:

fun showCompletedTasks(show: Boolean) {
    viewModelScope.launch {
        userPreferencesRepository.updateShowCompleted(show)
    }
}

userPreferencesRepository functions enableSortByDeadline() and enableSortByPriority() are now suspend functions so they should also be called in a new coroutine, launched in the viewModelScope:

fun enableSortByDeadline(enable: Boolean) {
    viewModelScope.launch {
        userPreferencesRepository.enableSortByDeadline(enable)
    }
}

fun enableSortByPriority(enable: Boolean) {
    viewModelScope.launch {
        userPreferencesRepository.enableSortByPriority(enable)
    }
}

Clean up UserPreferencesRepository

Let's remove the fields and methods that are no longer needed. You should be able to delete the following:

  • _sortOrderFlow
  • sortOrderFlow
  • updateSortOrder()
  • private val sortOrder: SortOrder
  • private val sharedPreferences

Our app should now compile successfully. Let's run it to see if the show_completed and sort_order flags are saved correctly.

Check out the proto_datastore branch of the codelab repo to compare your changes.

10. Wrap up

Now that you've migrated to Proto DataStore, let's recap what we've learned:

  • SharedPreferences comes with a series of drawbacks: a synchronous API that can appear safe to call on the UI thread, no mechanism for signaling errors, no transactional API, and more.
  • DataStore is a replacement for SharedPreferences addressing most of the shortcomings of the API.
  • DataStore has a fully asynchronous API using Kotlin coroutines and Flow, handles data migration, guarantees data consistency and handles data corruption.