Travis
Nov 07 2021 at 12:53 GMT
I have a blogging app which uses PostgreSQL as database and Prisma as ORM.
I want to implement full text search so that a user can type something into a search bar, and relevant posts should be shown in the search results based on the post title and the post body.
Ideally, the search results are ranked by the number of occurrences of the search keywords and more importance is given to the titles than to the bodies.
Here is my Post
model with the relevant fields from the Prisma schema:
model Post {
id String @id @default(cuid())
title String @db.VarChar(255)
markdownBody String
plainTextBody String
excerpt String @db.VarChar(255)
deletedAt DateTime?
}
The search results should show the title and excerpt of the matching posts.
Also, the results should not include deleted posts, i.e., posts that have a non-nulldeletedAt
column.
Will
Nov 07 2021 at 13:47 GMT
PostgreSQL has built-in support for full text search, which can be used to achieve what you want. However, Prisma has only a limited support for PostgreSQL full text search, which is not enough to achieve what you want.
To be able to search through multiple columns at the same time (post title and body) and to rank the results, we will need to use PostgreSQL's tsvector
, tsquery
, ts_rank
, etc. This should not be an issue since Prisma allows adding unsupported database features.
Basically, we will need to write some raw SQL.
First, we need to add a textSearch
column with type TSVECTOR
to the Post
model, which will be used for a weighed full text search in the title and body:
model Post {
...
textSearch Unsupported("TSVECTOR")?
}
Notice that the TSVECTOR
type needs to be wrapped with Unsupported
and marked as optional.
We'll also need to add an index to it:
model Post {
...
textSearch Unsupported("TSVECTOR")?
@@index([textSearch])
}
Next, let's run prisma migrate
with the --create-only
flag so that we can tweak the generated migration file to fit our needs:
npx prisma migrate dev --create-only --name add-full-text-search-to-post
The migration file that Prisma generated by default will look like this:
-- AlterTable
ALTER TABLE "Post" ADD COLUMN "textSearch" TSVECTOR;
-- CreateIndex
CREATE INDEX "Post_textSearch_idx" ON "Post"("textSearch");
We first need to replace the ALTER TABLE
command with this:
ALTER TABLE "Post" ADD COLUMN "textSearch" TSVECTOR
GENERATED ALWAYS AS
(setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce("plainTextBody", '')), 'B'))
STORED;
Here's a very simple explanation of what this does:
This effectively makes the textSearch
column have an automatically generated TSVECTOR
that stores word occurrences from the post title
and plainTextBody
, giving more weight to the title
(weight=A
) than to the plainTextBody
(weight=B
).
We also need to replace the CREATE INDEX
command to create a GIN
index:
CREATE INDEX "Post_textSearch_idx" ON "Post" USING GIN ("textSearch");
The modified migration file should look like this:
-- AlterTable
ALTER TABLE "Post" ADD COLUMN "textSearch" TSVECTOR
GENERATED ALWAYS AS
(setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
setweight(to_tsvector('english', coalesce("plainTextBody", '')), 'B'))
STORED;
-- CreateIndex
CREATE INDEX "Post_textSearch_idx" ON "Post" USING GIN ("textSearch");
Now we can execute this migration with
npx prisma migrate dev
We are done with the full text search setup.
Let's now see how to make a full text search query.
First, we should transform the search phrase entered by the user into a tsquery
. For example, if the user typed a vegetarian salad recipe, we should transform this into the following string:
"a | vegetarian | salad | recipe"
The pipe (|
) is acting as an OR.
The resulting tsquery
will match posts that have at least one of the specified keywords. This should not be an issue since the matching posts will be ranked, so the ones with the most keyword occurrences will be on the top of the results.
However, if you want a more strict condition, you can use &
(AND) instead of |
(OR). So, if a post has the keywords vegetarian and salad, but not recipe, it will not match. Only a post that has all the 3 keywords will match (a is considered a stop word and it is ignored in the context of full text search).
The entered search phrase should also be sanitized so that special characters are stripped. Here's a simple version of this (you should probably come up with something more robust):
const tsquerySpecialChars = /[()|&:*!]/g;
const getQueryFromSearchPhrase = (searchPhrase: string) =>
searchPhrase
.replace(tsquerySpecialChars, " ")
.trim()
.split(/\s+/)
.join(" | ");
Finally, let's write a searchPosts
function that takes the searchPhrase
and returns the title and excerpt of the most relevant posts (up to the first 10):
async function searchPosts(searchPhrase: string) {
const query = getQueryFromSearchPhrase(searchPhrase);
const results = await prisma.$queryRaw`
SELECT title, excerpt FROM "Post"
WHERE
"deletedAt" IS NULL AND
"textSearch" @@ to_tsquery('english', ${query})
ORDER BY ts_rank("textSearch", to_tsquery('english', ${query})) DESC
LIMIT 10;
`;
return results as Pick<Post, "title" | "excerpt">[];
}
That's it!
This is a simple implementation of full text search that shows how it can be done in Prisma with PostgreSQL.
If you want to write a more advanced implementation, refer to the PostgreSQL full text search documentation.