$ emrebener

DynamicFacetSearch

type: study

DynamicFacetSearch is a small Node.js + MongoDB product catalogue built to explore one specific question: what does a faceted-search UI look like when the facets aren’t predefined? In a typical e-commerce stack, every category has its own schema — laptops have CPU and RAM, books have author and genre, bicycles have frame size and gear count — and the filter sidebar is hand-coded per category. This app inverts that: products carry a free-form attributes object, and the facet panel is derived at request time from whatever’s actually stored.

It’s a learning project rather than a production catalogue. The interesting bit isn’t the size — it’s the design choice that lets the whole thing fit in ~770 lines of JavaScript without a single per-category code path.

Schema-on-read facets

Each product is stored as a single MongoDB document with a fixed set of top-level fields (name, category, price, condition) and a flexible attributes subdocument that can carry whatever makes sense for that category:

Category     Attribute keys
Laptops      brand, cpu, ramGb, storageGb, screenSizeInches, ports
Headphones   brand, noiseCancelling, batteryLifeHours, connection, colour
Books        author, genre, format, pageCount, language
Bicycles     brand, bikeType, frameSize, wheelSizeInches, gearCount

There is no schema declaration anywhere that names these keys. The seed data carries them, the user adds new ones via a JSON textarea on the new-product form, and the database stores them as-is. Different products in the same category can even have different keys — attributes is whatever the document says it is.
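As a rough illustration, two documents in the same category might look like the sketch below. The field names come from the description above; the product names, prices, and most attribute values are invented for the example and are not taken from the seed data.

```javascript
// Hypothetical documents: values are made up for illustration only.
const laptopA = {
  name: "Example Laptop A",
  category: "Laptops",
  price: 1499,
  condition: "new",
  attributes: { brand: "ExampleCo", cpu: "M3 Pro", ramGb: 16, ports: ["USB-C", "HDMI"] },
};

// Same category, different attribute keys: no cpu or ports, but a screen size.
const laptopB = {
  name: "Example Laptop B",
  category: "Laptops",
  price: 899,
  condition: "used",
  attributes: { brand: "OtherCo", ramGb: 8, screenSizeInches: 14 },
};

// Keys that appear on one document but not the other.
const keysA = Object.keys(laptopA.attributes);
const keysB = Object.keys(laptopB.attributes);
const onlyInA = keysA.filter((key) => !keysB.includes(key)); // cpu, ports
const onlyInB = keysB.filter((key) => !keysA.includes(key)); // screenSizeInches
```

Nothing outside the documents themselves records that cpu or screenSizeInches exist, which is the property the rest of the design relies on.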

The facet panel is built by walking every product in the currently-selected category and unioning their attributes:

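// Map from attribute key to the Set of stringified values seen in the category.
const facetValuesByKey = new Map();

for (const product of products) {
  // Products without an attributes object contribute no facets.
  const attributes = product.attributes || {};
  for (const [key, rawValue] of Object.entries(attributes)) {
    if (!facetValuesByKey.has(key)) {
      facetValuesByKey.set(key, new Set());
    }
    // Wrap scalars in an array so scalar and array values share one code path.
    const values = Array.isArray(rawValue) ? rawValue : [rawValue];
    for (const value of values) {
      // stringifyFacetValue is the app's own helper (not shown here).
      facetValuesByKey.get(key).add(stringifyFacetValue(value));
    }
  }
}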

A few details that matter:

  • Array values flatten. If a product’s ports is ["USB-C", "HDMI"], both values become facet options under the same ports key. This matches how MongoDB already queries arrays: an equality match on an array field succeeds if any element equals the value.
  • All facet values are stringified for the UI but parsed back when filtering: "true" → true, "16" → 16, anything else stays as a string. Booleans, numbers, and string facets all work without a per-type code path.
  • Facets come from the category, not the filtered set. The facet panel reflects what’s available in the category as a whole, so it doesn’t shrink to “no filters available” the moment you select one. That is the behaviour users expect from a faceted-search UI.
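The stringify-then-parse round trip could be sketched as follows. This is a minimal guess at the shape, not the project's actual code: the body of stringifyFacetValue, the name parseFacetValue, the handling of "false", and the exact number pattern are all assumptions.

```javascript
// Assumed implementation: the real stringifyFacetValue may differ.
function stringifyFacetValue(value) {
  return String(value);
}

// Hypothetical inverse: turn a facet string from the UI back into a primitive.
function parseFacetValue(text) {
  if (text === "true") return true;
  if (text === "false") return false; // assumption: "false" mirrors "true"
  // Only plain decimal numbers are converted; everything else stays a string.
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text);
  return text;
}

const roundTripped = [true, 16, 14.5, "M3 Pro"].map((value) =>
  parseFacetValue(stringifyFacetValue(value))
);
// roundTripped is [true, 16, 14.5, "M3 Pro"], with the original types restored
```

One trade-off of any scheme like this is that a string attribute whose text happens to look like a number would be parsed as a number, so it would no longer match the stored string.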

The query construction is the inverse: each user-selected facet (key, value) pair becomes an attributes.${key} Mongo query path with the parsed primitive on the right-hand side. Because MongoDB stores documents as BSON and enforces no schema unless validation rules are configured, a query like {"attributes.cpu": "M3 Pro"} works regardless of which products carry a cpu key: documents without the key simply don’t match.
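A sketch of that translation is below. The function name buildFacetFilter and the [key, value] input shape are hypothetical, and the sketch assumes at most one selected value per key and values that have already been parsed back to primitives.

```javascript
// Hypothetical helper: build a MongoDB filter object from selected facets.
// selectedFacets is assumed to be an array of [key, parsedValue] pairs.
function buildFacetFilter(category, selectedFacets) {
  const filter = { category };
  for (const [key, value] of selectedFacets) {
    // Dot notation addresses a field inside the attributes subdocument.
    filter[`attributes.${key}`] = value;
  }
  return filter;
}

const filter = buildFacetFilter("Laptops", [
  ["cpu", "M3 Pro"],
  ["ramGb", 16],
]);
console.log(JSON.stringify(filter));
// {"category":"Laptops","attributes.cpu":"M3 Pro","attributes.ramGb":16}
```

The resulting object can be passed as the filter argument to a find call. Supporting several selected values for one key would need something more, for example MongoDB's $in operator, which this sketch leaves out.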

In a relational database this design would either become an EAV anti-pattern (one row per attribute) or a per-category schema the application has to know about. Mongo’s document model is what makes the schema-on-read approach feasible at this size.

Operational packaging

The whole thing runs as two containers from a single docker compose up:

  • App container — Node.js 22, Express 5, EJS templates. Listens on port 3010 host-side.
  • MongoDB container — internal-only on the Compose network, never published to the host.

A separate Vagrant path provisions an Ubuntu 22.04 VM (libvirt or VirtualBox) with Docker preinstalled, runs the same Compose file inside the VM, and forwards 127.0.0.1:3010 to the guest. Both paths exist deliberately: the project was built to demonstrate the same image running on the host and inside an isolated VM, with no app-level changes between them.

The seed script is idempotent: it clears the demo dataset ({ demoSeed: true }) before inserting twelve fresh products across four categories, so re-running it produces a clean state without nuking the entire collection.
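The clear-then-insert pattern might look roughly like this. The names reseed and demoProducts are hypothetical, and the sketch assumes that collection is a MongoDB Node.js driver Collection (deleteMany and insertMany are real driver methods) and that the script itself tags each inserted document with demoSeed.

```javascript
// Sketch of an idempotent seed step, not the project's actual script.
async function reseed(collection, demoProducts) {
  // Remove only documents flagged as demo data; anything a user added stays.
  await collection.deleteMany({ demoSeed: true });
  // Tag every inserted document so the next run can find and clear it.
  const docs = demoProducts.map((product) => ({ ...product, demoSeed: true }));
  await collection.insertMany(docs);
  return docs.length;
}
```

Running reseed twice in a row leaves the same demo documents in place as running it once, which is what makes the step safe to repeat.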

CI/CD with GitFlow

The Jenkins pipeline runs on every push and follows a strict staircase:

  1. Checkout
  2. npm ci
  3. npm test — Jest unit tests
  4. docker compose build
  5. Conditional deploy. If the branch is develop, run scripts/deploy-test.sh to deploy the app + Mongo stack to the test environment. Other branches stop after the build stage.

The conditional deploy is what makes this GitFlow-aware. feature/* and hotfix/* branches build and test but don’t touch any environment. develop integrates into the test environment automatically. main is treated as stable and isn’t deployed by CI — promotion to anything beyond test is a manual decision, not a webhook side-effect.
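The branch rule is small enough to state as a single function. The version below is a plain JavaScript restatement for illustration, not Jenkinsfile syntax and not code from the project; the name ciActionFor and the returned labels are made up.

```javascript
// Hypothetical restatement of the deploy rule described above.
function ciActionFor(branch) {
  // Only develop is deployed automatically (via scripts/deploy-test.sh).
  if (branch === "develop") return "deploy-test";
  // feature/*, hotfix/*, main, and everything else stop after the build stage.
  return "build-only";
}

const actions = ["develop", "main", "feature/facets", "hotfix/seed"].map(ciActionFor);
// actions is ["deploy-test", "build-only", "build-only", "build-only"]
```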