Offset and keyset pagination with raw SQL queries

So far, we have returned the full content of our tables. However, as our database grows, this might prove not to be the best approach in terms of performance. A popular solution is to serve the data in chunks by presenting multiple pages or implementing infinite scrolling. In this article, we implement its back-end aspect using NestJS and PostgreSQL. We also compare various approaches to achieving it and point out their advantages and disadvantages.

You can find the code from this article in this repository.

Offset and limit

Let’s start by investigating this simple query:

    SELECT id, title FROM posts

It returns all of the records from the posts table.

A significant thing to acknowledge is that the order of the above rows is not guaranteed. However, when implementing pagination, we depend on the order of rows to be predictable. Therefore, we should use the ORDER BY clause.

    SELECT id, title FROM posts
    ORDER BY id ASC

To start paginating our data, we need to limit the number of rows in our query. To do that, we need the LIMIT clause.

    SELECT id, title FROM posts
    ORDER BY id ASC
    LIMIT 10

Thanks to the above, we now get the first ten items instead of all of them. This allows us to present the user with the first page of results.

To serve the second page of the data, we need to specify the starting point of our query. We can use the OFFSET keyword to specify how many rows we want to skip.

    SELECT id, title FROM posts
    ORDER BY id ASC
    OFFSET 10
    LIMIT 10

Above, we skip the first ten posts and get the next ten posts in the results. In our case, it gives us entities with ids from 11 to 20. This is where the order of our data plays a significant role. We can easily modify it by changing the ORDER BY clause, but keeping some order is important.

Counting the number of rows

It is a common approach to display the number of data pages to the user. For example, if we have fifty rows and display ten per page, we have five data pages.
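The arithmetic behind this example can be sketched as a pair of small helpers. The function names `countPages` and `offsetForPage` are hypothetical and not part of the article’s repository, and the sketch assumes that page numbers start at 1.

```typescript
// Hypothetical helpers for illustration; not part of the article's repository.

// Number of pages needed to show `totalRows` rows, `rowsPerPage` at a time.
function countPages(totalRows: number, rowsPerPage: number): number {
  if (!Number.isInteger(rowsPerPage) || rowsPerPage < 1) {
    throw new RangeError('rowsPerPage must be a positive integer');
  }
  // Round up so that a partially filled last page still counts as a page.
  return Math.ceil(totalRows / rowsPerPage);
}

// OFFSET value for a given page (assumption: page numbers are 1-based).
function offsetForPage(pageNumber: number, rowsPerPage: number): number {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError('pageNumber must be a positive integer');
  }
  // Skip every row that belongs to the preceding pages.
  return (pageNumber - 1) * rowsPerPage;
}
```

With fifty rows and ten rows per page, `countPages(50, 10)` gives five pages, and `offsetForPage(2, 10)` gives the offset of 10 used in the query for the second page.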
To do the above, we need to know the number of rows of data in our table. To do that, we can use the COUNT function.

    SELECT COUNT(*) AS all_posts_count FROM posts

We can also use COUNT while selecting data from other columns. When doing that, we need to specify the section of the data we are counting by partitioning it. For example, we can count the number of posts by a certain author.

    SELECT author_id, COUNT(*) OVER (PARTITION BY author_id) AS author_posts_count
    FROM posts

To present the results in a readable way, let’s only display one row per author using the DISTINCT keyword.

    SELECT DISTINCT author_id, COUNT(*) OVER (PARTITION BY author_id) AS author_posts_count
    FROM posts

In the results, we can see that the author with id 3 wrote two posts, and the author with id 2 wrote forty posts.

In our case, we want to count the total number of posts. Because we select other columns alongside the count, we still need the OVER clause. Leaving it empty makes the window span all of the rows.

    SELECT id, title, COUNT(*) OVER() AS total_posts_count
    FROM posts
    ORDER BY id ASC
    OFFSET 10
    LIMIT 10

Grouping and partitioning data with the OVER clause is a good topic for a separate article.

The whole idea is to count the number of rows and fetch their details in the same transaction to keep the integrity of the data. When we run a single query, PostgreSQL wraps it in a transaction out of the box. We can define a transaction explicitly if we want to count the posts in a separate SELECT statement.

    BEGIN;
    SELECT id, title FROM posts ORDER BY id ASC OFFSET 10 LIMIT 10;
    SELECT COUNT(*) AS total_posts_count FROM posts;
    COMMIT;

Keep in mind that under PostgreSQL’s default READ COMMITTED isolation level, each statement in a transaction sees its own snapshot of the data. The two SELECT statements above can therefore still disagree if another session commits a change between them. Using the REPEATABLE READ isolation level avoids that.

If you want to know more about transactions, check out API with NestJS #76. Working with transactions using raw SQL queries.

It is also important to notice that PostgreSQL returns the result of COUNT as bigint. The maximum value of a regular integer is 2³¹ − 1 (2,147,483,647), and for a big integer, it is 2⁶³ − 1 (9,223,372,036,854,775,807).
Unfortunately, JavaScript does not know how to serialize big integers to JSON out of the box.

    const data = {
      value: BigInt(10)
    }
    JSON.stringify(data);

    Uncaught TypeError: Do not know how to serialize a BigInt

If we don’t expect our table to hold more than 2,147,483,647 elements, we can cast the result of COUNT(*) to a regular integer.

    SELECT COUNT(*) OVER()::int AS total_posts_count FROM posts

Implementing offset pagination with NestJS

When implementing offset pagination with NestJS, we expect the user to provide the offset and limit as query parameters. To handle that, we can create a designated class.

paginationParams.ts

    import { IsNumber, Min, IsOptional } from 'class-validator';
    import { Type } from 'class-transformer';

    class PaginationParams {
      @IsOptional()
      @Type(() => Number)
      @IsNumber()
      @Min(0)
      offset?: number;

      @IsOptional()
      @Type(() => Number)
      @IsNumber()
      @Min(1)
      limit?: number;
    }

    export default PaginationParams;

We then use it in our controller.

posts.controller.ts

    import {
      ClassSerializerInterceptor,
      Controller,
      Get,
      Query,
      UseInterceptors,
    } from '@nestjs/common';
    import { PostsService } from './posts.service';
    import GetPostsByAuthorQuery from './getPostsByAuthorQuery';
    import PaginationParams from '../utils/paginationParams';

    @Controller('posts')
    @UseInterceptors(ClassSerializerInterceptor)
    export default class PostsController {
      constructor(private readonly postsService: PostsService) {}

      @Get()
      getPosts(
        @Query() { authorId }: GetPostsByAuthorQuery,
        @Query() { offset, limit }: PaginationParams,
      ) {
        return this.postsService.getPosts(authorId, offset, limit);
      }

      // ...
    }

The last step is to implement the logic in our PostsRepository class.
posts.repository.ts

    import { Injectable } from '@nestjs/common';
    import DatabaseService from '../database/database.service';
    import PostModel from './post.model';

    @Injectable()
    class PostsRepository {
      constructor(private readonly databaseService: DatabaseService) {}

      async get(offset = 0, limit: number | null = null) {
        // LIMIT NULL behaves the same as omitting the LIMIT clause
        const databaseResponse = await this.databaseService.runQuery(
          `
          SELECT id, title, COUNT(*) OVER()::int AS total_posts_count
          FROM posts
          ORDER BY id ASC
          OFFSET $1
          LIMIT $2
        `,
          [offset, limit],
        );
        const items = databaseResponse.rows.map(
          (databaseRow) => new PostModel(databaseRow),
        );
        // Every row carries the same total; an empty page yields no rows,
        // so the count falls back to 0
        const count = databaseResponse.rows[0]?.total_posts_count || 0;
        return {
          items,
          count,
        };
      }

      // ...
    }

    export default PostsRepository;

A significant thing above is that we provide default values for offset and limit. Providing 0 for offset means that we don’t intend to skip any rows. By setting the limit to null, we state that we don’t want to limit the results.

Doing all of the above, we end up with fully functional offset pagination.

Disadvantages

The offset and limit approach to pagination is widely used. Unfortunately, it has some significant disadvantages.

The most important caveat is that the database needs to compute all of the rows skipped by the OFFSET keyword. This can take a toll on the performance. First, the database sorts all of the rows as specified in the ORDER BY clause. Then, PostgreSQL drops the number of rows specified in the OFFSET.

Aside from the above issue, we can run into a problem with consistency. First, one user fetches page number one with posts. Then, a second user creates a new post that ends up on page number one. Finally, the first user fetches the second page.

Unfortunately, the above operations cause the first user to see the last element of the first page again on the second page. Besides that, the user misses the element added to the first page.

Advantages

The offset approach is very common and straightforward to implement.
It is also very easy to change the column we use for sorting, including sorting by multiple columns. This makes it an acceptable solution in many cases, especially if the offset is not expected to be big and the data inconsistencies are acceptable.

Keyset pagination

We can take another approach to pagination by filtering out the data we’ve already seen using the WHERE keyword instead of OFFSET. First, let’s run the following query:

    SELECT id, title FROM posts
    ORDER BY id ASC
    LIMIT 10

In the results, we can see that the last post has an id of 10. We can now use this knowledge to request posts with an id bigger than 10.

    SELECT id, title FROM posts
    WHERE id > 10
    ORDER BY id ASC
    LIMIT 10

To get the next page of results, we need to inspect the above results and notice that the id of the last row is 20. We can use that to modify our WHERE clause.

    SELECT id, title FROM posts
    WHERE id > 20
    ORDER BY id ASC
    LIMIT 10

Unfortunately, this exposes the most significant disadvantages of keyset pagination. First, to get a chunk of data, we need to know the id of the last element of the previous chunk, which makes traversing more than one page at once impossible. Second, to change the column by which we order our elements, we need to modify both the ORDER BY and WHERE clauses.

Counting the number of rows

It is crucial to notice that the WHERE clause affects the rows counted with COUNT(*): only the rows that match the condition are counted. To deal with this issue, we need to count the rows separately. We can create an explicit transaction or use a Common Table Expression with the WITH statement.

    WITH selected_posts AS (
      SELECT id, title FROM posts
      WHERE id > 10
      ORDER BY id ASC
      LIMIT 10
    ),
    total_posts_count_response AS (
      SELECT COUNT(*) AS total_posts_count FROM posts
    )
    SELECT * FROM selected_posts, total_posts_count_response

Implementing keyset pagination with NestJS

First, let’s modify our PaginationParams class to accept an additional query parameter.
paginationParams.ts

    import { IsNumber, Min, IsOptional } from 'class-validator';
    import { Type } from 'class-transformer';

    class PaginationParams {
      @IsOptional()
      @Type(() => Number)
      @IsNumber()
      @Min(0)
      offset?
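As a rough sketch of how the keyset query with the Common Table Expression could be parameterized, the hypothetical helper below builds the SQL text together with its values. The names `buildKeysetQuery`, `idsToSkip`, and `nextKey` are assumptions made for illustration and are not taken from the article’s code. The text and values would be handed to a query runner such as the `runQuery` method used in the repository earlier.

```typescript
// Hypothetical helpers for illustration; not taken from the article's repository.
interface KeysetQuery {
  text: string;
  values: [number, number | null];
}

// Builds the keyset query: rows with an id greater than `idsToSkip`,
// plus the total number of posts counted separately from the WHERE filter.
function buildKeysetQuery(
  idsToSkip = 0,
  limit: number | null = null,
): KeysetQuery {
  if (!Number.isInteger(idsToSkip) || idsToSkip < 0) {
    throw new RangeError('idsToSkip must be a non-negative integer');
  }
  const text = `
    WITH selected_posts AS (
      SELECT id, title FROM posts
      WHERE id > $1
      ORDER BY id ASC
      LIMIT $2
    ),
    total_posts_count_response AS (
      SELECT COUNT(*)::int AS total_posts_count FROM posts
    )
    SELECT * FROM selected_posts, total_posts_count_response
  `;
  return { text, values: [idsToSkip, limit] };
}

// The id of the last row on a page becomes the key for the next page.
// An empty page keeps the current key.
function nextKey(rows: Array<{ id: number }>, currentKey: number): number {
  return rows.length > 0 ? rows[rows.length - 1].id : currentKey;
}
```

One design caveat of this shape: when the selected page is empty, the cross join in the final SELECT returns no rows at all, so the total count is not available for that request.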
