What you'll build

In this codelab, you start with a sample app that already displays a list of GitHub repositories. The app loads its data from a local database that is backed by network data. Whenever the user scrolls to the end of the displayed list, a new network request is triggered and its result is saved in the database.

You will add code through a series of steps, integrating the Paging library components as you progress. These components are described in Step 2.

What you'll need

In this step, you will download the code for the entire codelab and then run a simple example app.

Click the following button to download all the code for this codelab:

Download source code

  1. Unzip the code, and then open the project in Android Studio version 3.0 or newer.
  2. Run the SearchRepositoriesActivity run configuration on a device or emulator.

The app runs and displays a list of GitHub repositories similar to this one:

You can also check out the codelab on GitHub. The initial state is on the master branch, and the solution is on the solution branch.

The Paging Library makes it easier for you to load data gradually and gracefully within your app's UI.

The Guide to App Architecture proposes an architecture with the following main components:

The Paging library works with all of these components and coordinates the interactions between them, so that it can page content from a data source and display that content in the UI.

This codelab introduces you to the Paging library and its main components:

In this codelab, you implement examples of each of the components described above.

The app allows you to search GitHub for repositories whose name or description contains a specific word. The resulting list is displayed in descending order by number of stars, then by name. The database is the source of truth for the data displayed by the UI, and it's backed by network requests.
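To make the search and ordering rules concrete, here is a minimal in-memory sketch. The sample app retrieves its results through RepoDao rather than in memory, so treat this only as an illustration of the semantics. The simplified `Repo` class, the `searchRepos` function, and the case-insensitive matching are assumptions made for this example.

```kotlin
// Hypothetical, simplified model; the sample app's Repo entity may have other fields.
data class Repo(val name: String, val description: String?, val stars: Int)

// Keeps repositories whose name or description contains `word`
// (case-insensitive matching is an assumption), then orders them by
// stars descending and by name ascending for equal star counts.
fun searchRepos(repos: List<Repo>, word: String): List<Repo> =
    repos
        .filter { repo ->
            repo.name.contains(word, ignoreCase = true) ||
                (repo.description?.contains(word, ignoreCase = true) ?: false)
        }
        .sortedWith(compareByDescending<Repo> { it.stars }.thenBy { it.name })
```

Calling `searchRepos(allRepos, "android")` returns only the matching repositories, most-starred first.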

The list of repositories, by name, is retrieved via a LiveData object in RepoDao.reposByName. Whenever new data from the network is inserted into the database, the LiveData will emit again with the entire result of the query.

The current implementation has two memory/performance issues:

The app follows the architecture recommended in the "Guide to App Architecture", using Room as local data storage. Here's what you will find in each package:

In our current implementation, we use a LiveData<List<Repo>> to get the data from the database and pass it to the UI. Whenever the data from the local database is modified, the LiveData emits an updated list. The alternative to List<Repo> is a PagedList<Repo>. A PagedList is a version of a List that loads content in chunks. Similar to the List, the PagedList holds a snapshot of content, so updates occur when new instances of PagedList are delivered via LiveData.

When a PagedList is created, it immediately loads the first chunk of data and expands over time as content is loaded in future passes. The size of the PagedList reflects the items loaded so far (plus any null placeholders, if they are enabled), so it grows with each pass. The class supports both infinite lists and very large lists with a fixed number of elements.
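The chunked-loading idea can be sketched in plain Kotlin. `ChunkedList`, `loadChunk`, and `sampleSource` are hypothetical names used only for illustration; this is not the Paging library's PagedList, which also handles threading, placeholders, and prefetching.

```kotlin
// Hypothetical, simplified model of a list that loads content in chunks.
// It is NOT the Paging library's PagedList; it only illustrates the idea.
class ChunkedList<T>(
    private val pageSize: Int,
    // Returns up to `count` items starting at `offset`; an empty list means no more data.
    private val loadChunk: (offset: Int, count: Int) -> List<T>
) {
    private val loaded = mutableListOf<T>()

    init {
        // Load the first chunk as soon as the list is created.
        loadMore()
    }

    // Number of items loaded so far.
    val size: Int get() = loaded.size

    operator fun get(index: Int): T = loaded[index]

    // Loads the next chunk and returns how many items were added.
    fun loadMore(): Int {
        val chunk = loadChunk(loaded.size, pageSize)
        loaded.addAll(chunk)
        return chunk.size
    }
}

// Example backing source: 45 numbered repository names.
fun sampleSource(offset: Int, count: Int): List<String> =
    (offset until minOf(offset + count, 45)).map { "repo-$it" }
```

With `ChunkedList<String>(20, ::sampleSource)`, the list starts with 20 items and grows on each `loadMore()` call until the source returns an empty chunk.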

Replace occurrences of List<Repo> with PagedList<Repo>:

viewModel.repos.observe(this, Observer<PagedList<Repo>> {
    showEmptyList(it?.size == 0)
    adapter.submitList(it)
})

The PagedList loads content dynamically from a source. In our case, because the database is the main source of truth for the UI, it also represents the source for the PagedList. If your app gets data directly from the network and displays it without caching, then the class that makes network requests would be your data source.

A source is defined by a DataSource class. To page in data from a source that can change—such as a source that allows inserting, deleting or updating data—you will also need to implement a DataSource.Factory that knows how to create the DataSource. Whenever the data set is updated, the DataSource is invalidated and re-created automatically through the DataSource.Factory.
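The invalidate-and-re-create cycle can be modeled in a few lines of plain Kotlin. All names here (`SnapshotSource`, `TableBackedFactory`, `observeLatest`) are hypothetical stand-ins and not the library's classes; the sketch only assumes that a source is an immutable snapshot and that any write invalidates the current snapshot.

```kotlin
// Hypothetical stand-in for a data source: an immutable snapshot of the data.
class SnapshotSource<T>(private val snapshot: List<T>) {
    var isInvalid = false
        private set

    private val callbacks = mutableListOf<() -> Unit>()

    fun addInvalidatedCallback(callback: () -> Unit) {
        callbacks += callback
    }

    // Marks the snapshot as stale and notifies listeners once.
    fun invalidate() {
        if (isInvalid) return
        isInvalid = true
        callbacks.forEach { it() }
    }

    fun loadRange(start: Int, count: Int): List<T> = snapshot.drop(start).take(count)
}

// Hypothetical stand-in for a factory backed by a mutable table.
class TableBackedFactory<T> {
    private val rows = mutableListOf<T>()
    private var current: SnapshotSource<T>? = null

    // Creates a fresh source over the current contents of the table.
    fun create(): SnapshotSource<T> = SnapshotSource(rows.toList()).also { current = it }

    // Any write invalidates the source that was created last.
    fun insert(item: T) {
        rows += item
        current?.invalidate()
    }
}

// Delivers a new source to `onNewSource` every time the previous one is invalidated.
fun <T> observeLatest(factory: TableBackedFactory<T>, onNewSource: (SnapshotSource<T>) -> Unit) {
    val source = factory.create()
    source.addInvalidatedCallback { observeLatest(factory, onNewSource) }
    onNewSource(source)
}
```

In this model, each `insert` call makes the observer receive a fresh source that reflects the new row, while the previous source stays frozen and is marked invalid.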

The Room persistence library provides native support for data sources associated with the Paging library. For a given query, Room allows you to return a DataSource.Factory from the DAO and handles the implementation of the DataSource for you.

Update the code to get a DataSource.Factory from Room:

fun reposByName(queryString: String): DataSource.Factory<Int, Repo>

To build and configure a LiveData<PagedList>, use a LivePagedListBuilder. Besides the DataSource.Factory, you need to provide a PagedList configuration, which can include the following options:

Update GithubRepository to build and configure a paged list:

companion object {
    private const val NETWORK_PAGE_SIZE = 50
    private const val DATABASE_PAGE_SIZE = 20
}

In the GithubRepository.search() method, make the following changes:

fun search(query: String): RepoSearchResult {
    // Get data source factory from the local cache
    val dataSourceFactory = cache.reposByName(query)

    // Get the paged list
    val data = LivePagedListBuilder(dataSourceFactory, DATABASE_PAGE_SIZE).build()

    // Get the network errors exposed by the boundary callback
    return RepoSearchResult(data, networkErrors)
}

To bind a PagedList to a RecyclerView, use a PagedListAdapter. The PagedListAdapter gets notified whenever the PagedList content is loaded and then signals the RecyclerView to update.

Update the ReposAdapter to work with a PagedList:

class ReposAdapter : PagedListAdapter<Repo, RecyclerView.ViewHolder>(REPO_COMPARATOR)

Our app finally compiles! Run it, and check out how it works.

Currently, we use an OnScrollListener attached to the RecyclerView to know when to request more data. We can let the Paging library handle list scrolling for us, though.

Remove the custom scroll handling:

After removing the custom scroll handling, our app has the following behavior:

A problem appears when the data source doesn't have any more data to give us, either because zero items were returned from the initial loading of the data or because we've reached the end of the data from the DataSource. To resolve this issue, implement a BoundaryCallback. This class notifies us when either situation occurs, so we know when to request more data. Because our DataSource is a Room database, backed by network data, the callbacks let us know that we should request more data from the API.

Handle data loading with BoundaryCallback:

class RepoBoundaryCallback(
        private val query: String,
        private val service: GithubService,
        private val cache: GithubLocalCache
) : PagedList.BoundaryCallback<Repo>() {

    // Keep the last requested page.
    // When the request is successful, increment the page number.
    private var lastRequestedPage = 1

    private val _networkErrors = MutableLiveData<String>()
    // LiveData of network errors.
    val networkErrors: LiveData<String>
        get() = _networkErrors

    // Avoid triggering multiple requests at the same time.
    private var isRequestInProgress = false

    // Called when the initial load from the database returns zero items.
    override fun onZeroItemsLoaded() {
        requestAndSaveData(query)
    }

    // Called when the last item available in the database has been loaded.
    override fun onItemAtEndLoaded(itemAtEnd: Repo) {
        requestAndSaveData(query)
    }

    // requestAndSaveData(query) is not shown in this excerpt.
}
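The requestAndSaveData() helper is not shown above. A minimal sketch of the guard logic that the lastRequestedPage and isRequestInProgress fields suggest might look like the following. `PageRequester`, `fetchPage`, `save`, and `reportError` are hypothetical names, and the callback-style network call is an assumption, not the sample app's actual service API.

```kotlin
// Hypothetical sketch of a request guard; only lastRequestedPage and
// isRequestInProgress mirror names from the callback class above.
class PageRequester<T>(
    // Assumed asynchronous API: invokes exactly one of onSuccess / onError.
    private val fetchPage: (page: Int, onSuccess: (List<T>) -> Unit, onError: (String) -> Unit) -> Unit,
    // Assumed persistence step, for example inserting into the local cache.
    private val save: (List<T>) -> Unit,
    // Assumed error sink, for example posting to a LiveData of network errors.
    private val reportError: (String) -> Unit
) {
    // Keep the last requested page. It is incremented only after a success.
    var lastRequestedPage = 1
        private set

    // Avoid triggering multiple requests at the same time.
    var isRequestInProgress = false
        private set

    fun requestAndSaveData() {
        // Ignore the call if a request is already in flight.
        if (isRequestInProgress) return
        isRequestInProgress = true

        fetchPage(lastRequestedPage, { items ->
            save(items)
            lastRequestedPage++
            isRequestInProgress = false
        }, { error ->
            // On failure, keep the same page number so the next call retries it.
            reportError(error)
            isRequestInProgress = false
        })
    }
}
```

Keeping the page counter unchanged on failure means the next boundary event retries the same page instead of skipping it.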

Update GithubRepository to use the BoundaryCallback when creating the PagedList:

fun search(query: String): RepoSearchResult {
    Log.d("GithubRepository", "New query: $query")

    // Get data source factory from the local cache
    val dataSourceFactory = cache.reposByName(query)
        
    // Construct the boundary callback
    val boundaryCallback = RepoBoundaryCallback(query, service, cache)
    val networkErrors = boundaryCallback.networkErrors

    // Get the paged list
    val data = LivePagedListBuilder(dataSourceFactory, DATABASE_PAGE_SIZE)
            .setBoundaryCallback(boundaryCallback)
            .build()

    // Get the network errors exposed by the boundary callback
    return RepoSearchResult(data, networkErrors)
}

That's it! With the current setup, the Paging library components are the ones triggering the API requests at the right time, saving data in the database, and displaying the data. So, run the app and search for repositories.

Now that we've added all the components, let's take a step back and see how everything works together.

The DataSource.Factory (implemented by Room) creates the DataSource. Then, LivePagedListBuilder builds the LiveData<PagedList>, using the passed-in DataSource.Factory, BoundaryCallback, and PagedList configuration. This LivePagedListBuilder object is responsible for creating PagedList objects. When a PagedList is created, two things happen at the same time:

After the data is inserted in the DataSource, a new PagedList object is created (represented in the following animation by a filled-in square). This new data object is then passed to the ViewModel and UI using LiveData and displayed with the help of the PagedListAdapter.

When the user scrolls, the PagedList requests that the DataSource load more data, querying the database for the next chunk of data. When the PagedList has paged all the available data from the DataSource, BoundaryCallback.onItemAtEndLoaded() is called. The BoundaryCallback requests data from the network and inserts the response data in the database. The UI then gets re-populated based on the newly loaded data.
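The whole flow can be approximated with a small plain-Kotlin simulation. `FakeNetwork` and `PagedFlow` are hypothetical classes, not Paging library components, and the model is deliberately synchronous: it ignores threading, LiveData, and the re-creation of PagedList objects, and only shows when each boundary event fires.

```kotlin
// Hypothetical network stub that serves `total` repository names in pages.
class FakeNetwork(private val total: Int) {
    // Returns one page of names; `page` is 1-based.
    fun fetch(page: Int, perPage: Int): List<String> {
        val start = (page - 1) * perPage
        return (start until minOf(start + perPage, total)).map { "repo-$it" }
    }
}

// Hypothetical, synchronous model of the database-plus-network paging flow.
class PagedFlow(
    private val network: FakeNetwork,
    private val networkPageSize: Int,
    private val databasePageSize: Int
) {
    private val database = mutableListOf<String>()   // source of truth
    private var lastRequestedPage = 1

    val displayed = mutableListOf<String>()          // what the UI currently shows
    val events = mutableListOf<String>()             // boundary events, in order

    // Simulates the user reaching the end of the displayed list once.
    fun scrollToEnd() {
        if (displayed.size == database.size) {
            // Everything in the database is already displayed: a boundary is hit.
            val event = if (database.isEmpty()) "onZeroItemsLoaded" else "onItemAtEndLoaded"
            events += event

            val page = network.fetch(lastRequestedPage, networkPageSize)
            if (page.isEmpty()) return               // the network has no more data
            database += page
            lastRequestedPage++
        }
        // Page the next chunk out of the database into the UI.
        val end = minOf(displayed.size + databasePageSize, database.size)
        displayed += database.subList(displayed.size, end).toList()
    }
}
```

With 70 items on the network, a network page size of 50, and a database page size of 20, five calls to `scrollToEnd()` display all 70 items and record one `onZeroItemsLoaded` event followed by two `onItemAtEndLoaded` events; the last event finds no more network data.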