0x002 - Search as a Service 🔍

Hi, y’all! This is the second issue of unzip.dev, a newsletter dedicated to developer trends, where we unpack trending dev concepts. My name is Agam More, and I’m a developer generalist who loves learning & sharing. Join the ride and have fun!

I wanted to thank @TomGranot and @KtzAndy for their great insights on this issue.

Search as a Service

Heads up! Please note that for this newsletter issue, we’re adopting the term SeaaS to mean “Search as a Service” 🏴‍☠️

TL;DR:

  • Problem: Developing robust search functionality for your application is a challenging task.
  • Solution: Outsource the search function to an external (mostly hosted) search service, which handles indexing and sometimes ships a pre-built UI, so you don’t have to build them.
  • In sum: Several websites already use SeaaS quite heavily, but pricing could be a dealbreaker.

How does it work? 💡

  1. You push any piece of data you want to be indexed to the SeaaS.
  2. The SeaaS indexes the data #magic.
  3. You integrate the SeaaS in your application, in one of two ways:
    a. Using the pre-built search UI they provide.
    b. Connecting your own frontend with the SeaaS API.
  4. You configure your database to sync changes over time with the SeaaS, so your index is continuously up-to-date.
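
To make the four steps above concrete, here is a minimal sketch in Python. It does not use any real provider's API: `ToySearchService` and `sync` are hypothetical names, and the "service" is just an in-memory inverted index standing in for the hosted one, so treat it as an illustration of the flow (push, index, query, sync) rather than how any particular SeaaS works.

```python
from collections import defaultdict


class ToySearchService:
    """Hypothetical in-memory stand-in for a hosted search service."""

    def __init__(self):
        self.docs = {}                   # doc id -> record
        self.index = defaultdict(set)    # token -> ids of docs containing it

    def _tokens(self, record):
        # Naive tokenizer: lowercase every string field and split on whitespace.
        text = " ".join(str(v) for v in record.values() if isinstance(v, str))
        return set(text.lower().split())

    def upsert(self, doc_id, record):
        # Steps 1 and 2: accept a pushed record and (re)index it.
        self.delete(doc_id)
        self.docs[doc_id] = record
        for token in self._tokens(record):
            self.index[token].add(doc_id)

    def delete(self, doc_id):
        # Drop the record and remove its id from every token it appeared under.
        record = self.docs.pop(doc_id, None)
        if record is not None:
            for token in self._tokens(record):
                self.index[token].discard(doc_id)

    def search(self, query):
        # Step 3b: what your own frontend would call through the service API.
        # Returns the ids of documents that contain every query word.
        words = query.lower().split()
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)


def sync(service, changes):
    # Step 4: replay database changes so the index stays up to date.
    for op, doc_id, record in changes:
        if op == "delete":
            service.delete(doc_id)
        else:
            service.upsert(doc_id, record)


service = ToySearchService()
service.upsert(1, {"title": "Intro to Python"})
service.upsert(2, {"title": "Python for data science"})
print(service.search("python"))   # [1, 2]

sync(service, [("update", 1, {"title": "Intro to Rust"}), ("delete", 2, None)])
print(service.search("python"))   # []
print(service.search("rust"))     # [1]
```

With a real SeaaS, the class above is the part you would not write: your code would only contain the push, query and sync calls, made against the provider's client library or HTTP API.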

Use cases ✅

  • You need to add search functionality without wasting time on maintaining self-hosted Solr/Elastic services, or re-inventing the wheel.
  • Your regex and greps aren’t working well enough as a search mechanism - you want to improve your search.
  • You want advanced features like personalization or voice search.
  • Your data is spread across different data sources. Via a consistent API, you can aggregate the data so it is searchable in one place (this functionality is sometimes referred to as “Federated Search”).
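
As a rough illustration of the federated search idea in the last bullet, the sketch below fans one query out to several sources and tags each hit with where it came from. Everything here is an assumption made for the example: the source names, `make_source` and `federated_search` are hypothetical, and a real service may instead merge the sources at indexing time rather than at query time.

```python
from typing import Callable, Dict, List

# Each source is a function that takes a query and returns matching titles.
Source = Callable[[str], List[str]]


def make_source(records: List[str]) -> Source:
    # Hypothetical source: case-insensitive substring match over a list.
    def search(query: str) -> List[str]:
        q = query.lower()
        return [r for r in records if q in r.lower()]
    return search


def federated_search(sources: Dict[str, Source], query: str) -> List[dict]:
    # Fan the query out to every source and tag each hit with its origin.
    results = []
    for name, search in sources.items():
        for hit in search(query):
            results.append({"source": name, "hit": hit})
    return results


sources = {
    "docs": make_source(["Install guide", "Search API reference"]),
    "blog": make_source(["Why search is hard", "Release notes"]),
    "tickets": make_source(["Login page is broken"]),
}

for row in federated_search(sources, "search"):
    print(row)
```

The point of the consistent `Source` signature is that the caller never needs to know whether a hit came from a database, a CMS or a ticketing system.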

Why? 🤔

  • Less work: No need to implement and maintain the search infrastructure; it's all managed. Focus on your core business.
    • Scaling is done by the SeaaS - if cost isn't a major concern, scaling is solved.
  • Enterprise-grade search: Building a robust search engine requires a lot of know-how and work, so why not delegate it? (Full-text search is hard, and no, regex won’t scale well 🧙‍♂️ WiseWords™)
  • Advanced features out-of-the-box:
    • Personalization based on user intent: “python” might mean a snake for one user and a programming language for another (see Algolia’s personalization).
    • Federated search (searching multiple data sources).
    • Voice search, mobile search, geo search, auto-completion.
  • Out-of-the-box analytics: Use pre-built dashboards and analytics that can give you insights generated from your users’ searches.
  • Compliance: Less manual GDPR and CCPA work for the data the service holds - you save time on deletion requests, information inquiries and similar chores.
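
The “python” example from the personalization bullet can be sketched as a simple re-ranking step. This is only a toy under stated assumptions: the `tags` field, the interest lists and `personalized_search` are hypothetical, and it is not how Algolia or any other provider actually computes personalization.

```python
def personalized_search(results, user_interests):
    # Rank results that share more tags with the user's interests first.
    # sorted() is stable, so ties keep their original order.
    def overlap(result):
        return len(set(result["tags"]) & set(user_interests))
    return sorted(results, key=overlap, reverse=True)


# The same raw results for the query "python", before personalization.
results = [
    {"title": "Python (snake)", "tags": ["animals", "reptiles"]},
    {"title": "Python (programming language)", "tags": ["programming", "software"]},
]

developer = ["programming", "devops"]
zoologist = ["reptiles", "biology"]

print(personalized_search(results, developer)[0]["title"])
print(personalized_search(results, zoologist)[0]["title"])
```

A hosted service would typically derive the interest signal from past clicks and conversions instead of a hand-written list, which is exactly the part that is tedious to build yourself.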

Why not? 🙅

  • Special cases: General-purpose solutions might not fit all complex/specific use cases.
  • Expensive: Gets expensive at scale (see comments here and here on Algolia’s pricing)
  • Sensitive content: You have data that you don’t want 3rd parties to process.
  • On-premises: If running on-premises is a requirement, hosted services are out, unless you go the open-source path.

Tools & players 🛠️

  • Algolia - The OG SeaaS, closed-source and managed.
  • Typesense - Open-source Algolia alternative, also provides a hosted service.
  • Swiftype - Elastic’s take on Algolia, simplifying Elasticsearch.
  • Yext - Self-described as “AI search” as a Service.
  • AWS CloudSearch - Using Solr behind the scenes.
  • Azure Cognitive Search - Azure’s take on SeaaS.
  • websolr - Hosted SeaaS based on Solr.
  • MeiliSearch - Open-source, Rust-based SeaaS.
  • SeekStorm - Seems to be a cheaper competitor to other hosted solutions.
  • search.io - Yet another SeaaS.
  • appbase.io - SeaaS on top of Elasticsearch with a search-tweaking interface.
🤠 My opinion: I would probably go self-hosted with Typesense. Price is a big issue with most of the hosted versions. If I ever need the maintenance wizardry that scale requires, I would simply upgrade to one of the hosted/cloud solutions (based on the most competitive price at that point in time).

Note: I will indicate any kind of ad or sponsorship (none here), and no, it won’t affect my choices unfairly.

Forecast 🧞

  • Pricing changes: Current SeaaS pricing seems to be a hard sell for mass-market adoption, so I expect prices to come down (this is what SeekStorm is building its pricing model around). Until then, self-hosted solutions or less feature-rich solutions like the cloud providers’ services will be the reasonable way to go.

Who uses it? 🎡

Extra

  • For search UI components, you might want to check Reactive Search (React and Vue supported).
  • If you are interested in the system design behind search engines, check out this article and also its follow-up article.
  • You might want to check Benjamin Read’s summary of players.
  • Fireship, one of my favorite YouTube channels, just released a video about how to use the Redis search module.

As a developer, it’s really hard to gauge the quality of my writing. It feels like using those old room-sized, punched-card computers where you had to wait all weekend for the results and only then get feedback 😪 So your comments are, truly, most appreciated:

Tweet at me @agammore or simply reply to this e-mail.

Why not help your friends by sharing this with them? ❤️