polymech-astro/packages/imagetools_3/docs/caching.md
2025-08-27 18:44:10 +02:00

Solving Caching Conflicts with a Persistent Asset Manifest

When using imagetools with other build-caching tools that operate at a page level, such as @domain-expansion/astro, a conflict can arise that leads to missing images in the final production build. This document explains the problem and details a robust solution using a persistent asset manifest.

The Problem: Missing Images in Production Builds

  • How Page Caching Works: Tools like @domain-expansion/astro can cache the entire HTML output of a page. On subsequent builds, if the page content has not changed, the tool serves the cached HTML directly, skipping Astro's rendering process for that page.

  • The Conflict: imagetools relies on its components (<Img>, <Picture>, etc.) being rendered during the Astro build. When a component renders, it triggers an image import, which is intercepted by the imagetools Vite plugin's load hook. This hook is responsible for two critical tasks:

    1. Processing the image.
    2. Adding the image to an in-memory store of assets to be included in the final build.
  • The Result: When a page is served from a cache, its imagetools components do not render. This means the load hook is never called for the images on that page, and they are never added to the store. Consequently, the astro:build:done hook, which reads from this store, is unaware of these images, and they will be missing from the final production output (dist folder).
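The failure mode can be reproduced with a small simulation. This is only a sketch: `simulateBuild`, its local `load` function, and the page and asset names are stand-ins invented for illustration, not part of the imagetools or @domain-expansion/astro APIs.

```javascript
// Simulation only: `store` and `load` are stand-ins, not the real imagetools internals.
function simulateBuild(pages, cachedPages) {
  const store = new Map(); // in-memory store, recreated for every build

  // Stand-in for the Vite `load` hook: registers the asset in the store.
  const load = (assetPath) => store.set(assetPath, { hash: `hash-of-${assetPath}` });

  for (const [page, images] of Object.entries(pages)) {
    if (cachedPages.has(page)) continue; // cached HTML is reused; components never render
    images.forEach(load);
  }

  // Stand-in for `astro:build:done`: only assets present in the store are emitted.
  return [...store.keys()].sort();
}

const pages = {
  "/index": ["/assets/hero-800.webp"],
  "/about": ["/assets/team-400.webp"],
};

const coldBuild = simulateBuild(pages, new Set());
const warmBuild = simulateBuild(pages, new Set(["/about"]));

console.log(coldBuild); // both assets are emitted
console.log(warmBuild); // the asset used only by the cached page is missing
```

In the second run the `/about` page is skipped, so its image never reaches the store and is absent from the emitted list.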

Solution: A Persistent Asset Manifest

To solve this, imagetools needs to remember the assets it has processed across multiple builds, independently of the in-memory store, which is reset for each build. This can be achieved by introducing a persistent asset manifest.

Proposed Architecture Change

The solution involves modifying the Vite plugin and the Astro integration to use a key-value store that persists between builds. We will use keyv-file, a lightweight file-based adapter for the Keyv store, which is simpler to set up than a full database.

1. Introduce a File-Based Manifest Store

A JSON file, .astro/imagetools-manifest.json, resolved against the project's current working directory (cwd), will be used to track every asset imagetools needs for the final build. Resolving it from the cwd ensures it is consistently located.
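Because the manifest entries carry image buffers, they have to survive a round trip through a file. The sketch below shows one way that can work with plain JSON. It is an illustration only: `serializeManifest` and `deserializeManifest` are hypothetical helpers, and keyv-file applies its own serialization rather than this one.

```javascript
// Sketch only: plain JSON is used for illustration; keyv-file handles serialization itself.
function serializeManifest(manifest) {
  // During JSON.stringify, Buffer#toJSON turns a Buffer into { type: "Buffer", data: [...] }.
  return JSON.stringify(Object.fromEntries(manifest));
}

function deserializeManifest(json) {
  // Revive the { type: "Buffer", data: [...] } shape back into a real Buffer.
  const revive = (_key, value) =>
    value && value.type === "Buffer" && Array.isArray(value.data)
      ? Buffer.from(value.data)
      : value;
  return new Map(Object.entries(JSON.parse(json, revive)));
}

const original = new Map([
  ["/assets/image-800.webp", { hash: "abc123", buffer: Buffer.from([1, 2, 3]) }],
]);
const restored = deserializeManifest(serializeManifest(original));

console.log(restored.get("/assets/image-800.webp").buffer); // a Buffer again, not a plain object
```

Any text encoding of binary data is larger than the raw bytes, so a manifest that stores full image buffers can grow considerably on image-heavy sites.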

2. The Role of the load.js Hook

The Vite plugin's load hook, defined as export default async function load(id), is the starting point for this entire process.

  • What is id?: The id parameter is a string provided by Vite, representing the fully resolved path to the module being imported, including any query string. For our purposes, it looks something like /path/to/your/project/src/assets/image.jpg?w=800&format=webp. The hook prepends file:// to turn it into a parseable URL.
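The parsing step can be tried in isolation. The sketch below assumes, as the hook's `file://${id}` prefix suggests, that id arrives as an absolute path with a query string. `parseImportId` is a hypothetical helper written for this example; the real hook inlines the logic.

```javascript
// Hypothetical helper for illustration; not part of imagetools.
function parseImportId(id) {
  let fileURL;
  try {
    fileURL = new URL(`file://${id}`);
  } catch {
    return null; // Not parseable as a file URL, so not an image import.
  }
  return {
    src: decodeURIComponent(fileURL.pathname), // file path without the query string
    config: Object.fromEntries(fileURL.searchParams), // transformation parameters
  };
}

const parsed = parseImportId("/project/src/assets/image.jpg?w=800&format=webp");
console.log(parsed.src);
console.log(parsed.config);
```

Note that every value in `config` is a string (for example `"800"`, not `800`), so numeric parameters need converting before they are passed to an image processor.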

The hook's primary responsibilities are to:

  1. Parse the Import: It uses new URL() to parse the id, separating the file path from the image transformation parameters in the query string (e.g., w=800, format=webp).
  2. Check Transformation Cache: It checks its own persistent cache (powered by @polymech/cache) to see if this specific image transformation has already been processed in a previous build.
  3. Process or Retrieve Image: If the image is not in the cache, it's transformed using Sharp. The resulting buffer and metadata are then cached for future builds.
  4. Populate In-Memory Store: Crucially, it takes the final imageObject (containing the buffer, hash, etc.) and places it into the in-memory store, using the final asset path (e.g., /assets/image-800.webp) as the key.

This in-memory store now represents all assets that have been actively processed in the current build. This is the data that will be used to update the persistent manifest.

// imagetools_3/plugin/hooks/load.js
// ...
export default async function load(id) {
  // 1. Parse the import `id`
  let fileURL;
  try {
    fileURL = new URL(`file://${id}`);
  } catch (error) {
    return null; // Not a file-based import, ignore.
  }

  const { searchParams } = fileURL;
  const config = Object.fromEntries(searchParams);

  // ... (logic to get assetPath and cacheKey)

  // 2, 3. Check cache or transform image to get `imageObject`
  let imageObject = await get_cached_object(cacheKey, 'imagetools-plugin');
  if (!imageObject) {
    // ... transform logic ...
    imageObject = { hash, type, image, buffer };
    await set_cached_object(cacheKey, 'imagetools-plugin', imageObject);
  }

  // 4. Populate the in-memory store, which feeds the persistent manifest.
  store.set(assetPath, imageObject);

  // ... (return path to Vite)
}

3. Modify the astro:build:done Hook

The core of the solution is to update the astro:build:done hook in integration/index.js to manage the manifest.

Before (Current Implementation)
// imagetools_3/integration/index.js

"astro:build:done": async function closeBundle() {
  // ...
  if (mode === "production") {
    // Reads ONLY from the in-memory store, which is empty for cached pages.
    const assetPaths = [...store.entries()].filter(
      ([, { hash = null } = {}]) => hash
    );

    await pMap(
      assetPaths,
      async ([assetPath, { hash, image, buffer }]) => {
        await saveAndCopyAsset(
          hash,
          image,
          buffer,
          outDir,
          assetsDir,
          assetPath,
          isSsrBuild
        );
      }
    );
  }
}
After (Proposed Implementation with keyv-file)
// imagetools_3/integration/index.js
import path from "node:path";
import Keyv from "keyv";
import { KeyvFile } from "keyv-file";

"astro:build:done": async function closeBundle({ logger }) {
  // ...
  if (mode === "production") {
    // 1. Initialize the persistent manifest store under .astro/ in the current working directory.
    const manifestPath = path.resolve(process.cwd(), ".astro/imagetools-manifest.json");
    const manifestStore = new Keyv({
      store: new KeyvFile({ filename: manifestPath }),
      namespace: "assets",
    });

    // 2. Get assets from the current build's in-memory store.
    const currentAssets = new Map(
      [...store.entries()].filter(([, { hash = null } = {}]) => hash)
    );

    // 3. Save/update assets from the current build into the persistent store.
    for (const [assetPath, assetData] of currentAssets.entries()) {
      await manifestStore.set(assetPath, assetData);
    }

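    // 4. Collect all assets from the persistent store for processing.
    // Note: Keyv only exposes iterator() for adapters that support iteration,
    // so confirm this works with keyv-file before relying on it.
    const allAssets = [];
    try {
      for await (const [key, value] of manifestStore.iterator()) {
        allAssets.push([key, value]);
      }
    } catch (error) {
      // Astro's integration logger takes a single message string.
      logger.error(`Failed to iterate over the asset manifest store: ${error.message}`);
    }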

    // 5. Process the entire list of assets from the manifest.
    await pMap(
      allAssets,
      async ([assetPath, { hash, image, buffer }]) => {
        await saveAndCopyAsset(
          hash,
          image,
          buffer,
          outDir,
          assetsDir,
          assetPath,
          isSsrBuild
        );
      }
    );
  }
}

This approach makes imagetools resilient to page-level caching. Even if a page is cached, its required image assets are remembered from the last time it was rendered, ensuring they are always present in the build output.
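The merge behaviour can be checked without Astro or keyv-file. In the simulation below, a plain Map stands in for the persistent manifest and short strings stand in for image data; `finishBuild` and the asset names are hypothetical and exist only for this example.

```javascript
// Simulation only: a Map stands in for the keyv-file manifest.
function finishBuild(manifest, store) {
  // Keep only entries that carry a hash, mirroring the filter in the hook.
  const currentAssets = [...store.entries()].filter(([, { hash = null } = {}]) => hash);

  // Merge this build's assets into the persistent manifest.
  for (const [assetPath, assetData] of currentAssets) {
    manifest.set(assetPath, assetData);
  }

  // Emit everything the manifest knows about, not just this build's assets.
  return [...manifest.keys()].sort();
}

const persistentManifest = new Map(); // survives across builds

// Build 1: nothing is cached, so both pages render and register their images.
const firstOutput = finishBuild(persistentManifest, new Map([
  ["/assets/hero-800.webp", { hash: "h1" }],
  ["/assets/team-400.webp", { hash: "h2" }],
]));

// Build 2: the page using team-400.webp is served from the page cache.
const secondOutput = finishBuild(persistentManifest, new Map([
  ["/assets/hero-800.webp", { hash: "h1" }],
]));

console.log(firstOutput);
console.log(secondOutput); // still lists both assets
```

The second build registers only one asset in its in-memory store, yet the emitted list still contains both because the manifest remembers the first build.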

Cache Invalidation

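  • Stale Assets: This persistent manifest strategy means that if an image is removed from your project, it will not be automatically removed from the manifest. Because the astro:build:done hook processes every manifest entry, the removed image will keep being written to the build output until the manifest is cleared.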
  • Recommendation: For a full, clean rebuild that clears all caches, you can delete the cwd/.astro/imagetools-manifest.json file and the dist directory. This will force imagetools and @domain-expansion/astro to re-process everything from scratch.
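A less drastic option than deleting the whole manifest would be to prune individual entries. The sketch below is speculative: it assumes each manifest entry records a `source` field naming the file it was derived from, which the manifest described in this document does not store. `pruneManifest` and the `sourceExists` callback are hypothetical names.

```javascript
// Hypothetical sketch: assumes entries carry a `source` path, which the
// manifest described above does not currently record.
function pruneManifest(manifest, sourceExists) {
  const removed = [];
  for (const [assetPath, { source }] of manifest) {
    if (!sourceExists(source)) {
      manifest.delete(assetPath); // deleting during Map iteration is safe in JavaScript
      removed.push(assetPath);
    }
  }
  return removed;
}

const staleManifest = new Map([
  ["/assets/hero-800.webp", { hash: "h1", source: "src/assets/hero.jpg" }],
  ["/assets/old-400.webp", { hash: "h2", source: "src/assets/old.jpg" }],
]);

// A Set of known files stands in for a real file-system check.
const filesOnDisk = new Set(["src/assets/hero.jpg"]);
const removedAssets = pruneManifest(staleManifest, (file) => filesOnDisk.has(file));

console.log(removedAssets); // entries whose source file no longer exists
```

In a real integration the callback could be a file-system check such as Node's `fs.existsSync`; it is injected here so the example runs without touching the disk.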

Implementation Plan

Here is the step-by-step plan to implement the persistent caching solution.

  • Add keyv and keyv-file dependencies to package.json.
  • Implement the persistent asset manifest in integration/index.js using keyv-file.
  • Verify the load.js hook correctly populates the in-memory store.
  • Review and finalize caching documentation.