polymech-astro/packages/imagetools_3/docs/caching.md
2025-08-27 18:44:10 +02:00

# Solving Caching Conflicts with a Persistent Asset Manifest
When using `imagetools` with other build-caching tools that operate at a page level, such as `@domain-expansion/astro`, a conflict can arise that leads to missing images in the final production build. This document explains the problem and details a robust solution using a persistent asset manifest.
## The Problem: Missing Images in Production Builds
- **How Page Caching Works**: Tools like `@domain-expansion/astro` can cache the entire HTML output of a page. On subsequent builds, if the page content has not changed, the tool serves the cached HTML directly, skipping Astro's rendering process for that page.
- **The Conflict**: `imagetools` relies on its components (`<Img>`, `<Picture>`, etc.) being rendered during the Astro build. When a component renders, it triggers an image import, which is intercepted by the `imagetools` Vite plugin's `load` hook. This hook is responsible for two critical tasks:
  1. Processing the image.
  2. Adding the image to an in-memory `store` of assets to be included in the final build.
- **The Result**: When a page is served from a cache, its `imagetools` components do not render. This means the `load` hook is never called for the images on that page, and they are never added to the `store`. Consequently, the `astro:build:done` hook, which reads from this `store`, is unaware of these images, and they will be missing from the final production output (`dist` folder).
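
The failure mode can be reproduced in miniature. The sketch below is a simplified model, not the actual `imagetools` code: `renderPage`, `build`, and the page objects are hypothetical stand-ins, and a plain `Map` plays the role of the in-memory `store`.

```javascript
// Simplified model of the conflict; every name here is illustrative.
function renderPage(page, store) {
  // Rendering a page is what triggers the `load` hook for each of its images.
  for (const assetPath of page.images) {
    store.set(assetPath, { hash: `hash-of-${assetPath}` });
  }
}

function build(pages, cachedPageNames) {
  const store = new Map(); // the in-memory store is reset on every build
  for (const page of pages) {
    if (cachedPageNames.has(page.name)) continue; // page cache hit: no render
    renderPage(page, store);
  }
  return [...store.keys()]; // the assets `astro:build:done` would copy to `dist`
}

const pages = [
  { name: "index", images: ["/assets/hero-800.webp"] },
  { name: "about", images: ["/assets/team-800.webp"] },
];

const firstBuild = build(pages, new Set());            // nothing cached yet
const secondBuild = build(pages, new Set(["about"]));  // "about" served from cache

console.log(firstBuild);  // [ '/assets/hero-800.webp', '/assets/team-800.webp' ]
console.log(secondBuild); // [ '/assets/hero-800.webp' ]
```

In the second build the `about` page is never rendered, so its image never reaches the store and would be absent from the output.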
## Solution: A Persistent Asset Manifest
To solve this, `imagetools` needs to remember the assets it has processed across multiple builds, independent of the in-memory `store`, which is reset for each build. This can be achieved by introducing a persistent **asset manifest**.
### Proposed Architecture Change
The solution involves modifying the Vite plugin and the Astro integration to use a key-value store that persists between builds. We will use `keyv-file`, a lightweight file-based adapter for the `Keyv` store, which is simpler than a full database.
#### 1. Introduce a File-Based Manifest Store
A JSON file, `.astro/imagetools-manifest.json`, resolved against the project's current working directory (`cwd`), will be used to track every asset `imagetools` needs for the final build. Resolving the path against `cwd` keeps its location consistent across builds.
#### 2. The Role of the `load.js` Hook
The Vite plugin's `load` hook, defined as `export default async function load(id)`, is the starting point for this entire process.
- **What is `id`?**: The `id` parameter is a string provided by Vite, representing the resolved path to the module being imported, including any query string. Because the hook prepends `file://` before parsing, the `id` is expected to be an absolute path such as `/path/to/your/project/src/assets/image.jpg?w=800&format=webp`, not a `file://` URL.
The hook's primary responsibilities are to:
1. **Parse the Import**: It uses `new URL()` to parse the `id`, separating the file path from the image transformation parameters in the query string (e.g., `w=800`, `format=webp`).
2. **Check Transformation Cache**: It checks its own persistent cache (powered by `@polymech/cache`) to see if this specific image transformation has already been processed in a previous build.
3. **Process or Retrieve Image**: If the image is not in the cache, it's transformed using Sharp. The resulting buffer and metadata are then cached for future builds.
4. **Populate In-Memory Store**: Crucially, it takes the final `imageObject` (containing the buffer, hash, etc.) and places it into the in-memory `store`, using the final asset path (e.g., `/assets/image-800.webp`) as the key.
This in-memory `store` now represents all assets that have been actively processed *in the current build*. This is the data that will be used to update the persistent manifest.
```javascript
// imagetools_3/plugin/hooks/load.js
// ...
export default async function load(id) {
  // 1. Parse the import `id`
  let fileURL;
  try {
    fileURL = new URL(`file://${id}`);
  } catch (error) {
    return null; // Not a file-based import, ignore.
  }
  const { searchParams } = fileURL;
  const config = Object.fromEntries(searchParams);
  // ... (logic to get assetPath)

  // 2, 3. Check cache or transform image to get `imageObject`
  let imageObject = await get_cached_object(cacheKey, 'imagetools-plugin');
  if (!imageObject) {
    // ... transform logic ...
    imageObject = { hash, type, image, buffer };
    await set_cached_object(cacheKey, 'imagetools-plugin', imageObject);
  }

  // 4. Populate the in-memory store, which feeds the persistent manifest.
  store.set(assetPath, imageObject);
  // ... (return path to Vite)
}
```
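
To make the parsing step concrete, here is a small standalone sketch of what `new URL()` and `Object.fromEntries()` produce for a typical import. The example `id` is an assumed value shaped like the one described above; only the standard `URL` API is used.

```javascript
// An `id` shaped the way the hook expects: absolute path plus query string.
const id = "/path/to/your/project/src/assets/image.jpg?w=800&format=webp";

// Prepending `file://` turns the path into a parseable URL.
const fileURL = new URL(`file://${id}`);

// The query string becomes a plain object of transformation parameters.
const config = Object.fromEntries(fileURL.searchParams);

console.log(fileURL.pathname); // "/path/to/your/project/src/assets/image.jpg"
console.log(config);           // { w: '800', format: 'webp' }

// Query values are always strings, so convert them before numeric use.
const width = Number(config.w);
console.log(width);            // 800
```

Note that every value in `config` is a string, so numeric parameters such as `w` need an explicit conversion before they are passed on as numbers.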
#### 3. Modify the `astro:build:done` Hook
The core of the solution is to update the `astro:build:done` hook in `integration/index.js` to manage the manifest.
##### Before (Current Implementation)
```javascript
// imagetools_3/integration/index.js
"astro:build:done": async function closeBundle() {
  // ...
  if (mode === "production") {
    // Reads ONLY from the in-memory store, which is empty for cached pages.
    const assetPaths = [...store.entries()].filter(
      ([, { hash = null } = {}]) => hash
    );
    await pMap(
      assetPaths,
      async ([assetPath, { hash, image, buffer }]) => {
        await saveAndCopyAsset(
          hash,
          image,
          buffer,
          outDir,
          assetsDir,
          assetPath,
          isSsrBuild
        );
      }
    );
  }
}
```
##### After (Proposed Implementation with keyv-file)
```javascript
// imagetools_3/integration/index.js
import path from "node:path";
import Keyv from "keyv";
import { KeyvFile } from "keyv-file";

// ... inside the integration's `hooks` object:
"astro:build:done": async function closeBundle({ logger }) {
  // ...
  if (mode === "production") {
    // 1. Initialize the persistent manifest store under `.astro/` in the current working directory.
    const manifestPath = path.resolve(process.cwd(), ".astro/imagetools-manifest.json");
    const manifestStore = new Keyv({
      store: new KeyvFile({ filename: manifestPath }),
      namespace: "assets",
    });

    // 2. Get assets from the current build's in-memory store.
    const currentAssets = new Map(
      [...store.entries()].filter(([, { hash = null } = {}]) => hash)
    );

    // 3. Save/update assets from the current build into the persistent store.
    for (const [assetPath, assetData] of currentAssets.entries()) {
      await manifestStore.set(assetPath, assetData);
    }

    // 4. Collect all assets from the persistent store for processing.
    // Note: Keyv only exposes `iterator()` when the storage adapter supports
    // iteration, so confirm that this works with `keyv-file`.
    const allAssets = [];
    try {
      for await (const [key, value] of manifestStore.iterator()) {
        allAssets.push([key, value]);
      }
    } catch (error) {
      // Astro's integration logger takes a single message string.
      logger.error(`Failed to iterate over the asset manifest store: ${error.message}`);
    }

    // 5. Process the entire list of assets from the manifest.
    await pMap(
      allAssets,
      async ([assetPath, { hash, image, buffer }]) => {
        await saveAndCopyAsset(
          hash,
          image,
          buffer,
          outDir,
          assetsDir,
          assetPath,
          isSsrBuild
        );
      }
    );
  }
}
```
This approach makes `imagetools` resilient to page-level caching. Even if a page is cached, its required image assets are remembered from the last time it was rendered, ensuring they are always present in the build output.
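
The effect of the manifest across two builds can be illustrated with a simplified model. This is a sketch only: `buildWithManifest` is a hypothetical helper, and a plain `Map` that outlives each call stands in for the file-backed store.

```javascript
// A plain Map stands in for the file-backed manifest; it survives between builds.
const manifest = new Map();

function buildWithManifest(renderedAssets) {
  const store = new Map(renderedAssets); // in-memory store, fresh for each build
  for (const [assetPath, data] of store) {
    manifest.set(assetPath, data); // save or update this build's assets
  }
  return [...manifest.keys()].sort(); // everything that gets copied to `dist`
}

// Build 1: both pages render, so both images reach the store.
const build1 = buildWithManifest([
  ["/assets/hero-800.webp", { hash: "a1" }],
  ["/assets/team-800.webp", { hash: "b2" }],
]);

// Build 2: the page using team-800.webp is served from the page cache,
// so only the hero image is rendered this time.
const build2 = buildWithManifest([
  ["/assets/hero-800.webp", { hash: "a1" }],
]);

console.log(build1); // [ '/assets/hero-800.webp', '/assets/team-800.webp' ]
console.log(build2); // [ '/assets/hero-800.webp', '/assets/team-800.webp' ]
```

The second build still emits both assets because the manifest, unlike the in-memory store, carries the first build's entries forward.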
### Cache Invalidation
- **Stale Assets**: Entries are only ever added to or updated in the manifest, never removed. If an image is removed from your project, its entry stays in the manifest and the asset continues to be copied into every subsequent build.
- **Recommendation**: For a full, clean rebuild that clears all caches, delete the `.astro/imagetools-manifest.json` file (relative to the project's `cwd`) and the `dist` directory. This will force `imagetools` and `@domain-expansion/astro` to re-process everything from scratch.
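
As an alternative to deleting the whole manifest, stale entries could in principle be pruned selectively. The sketch below is one possible approach, not part of the proposed implementation: it assumes each manifest entry carries a hypothetical `src` field naming its source image (the entries shown in this document only hold `hash`, `image`, and `buffer`), and `pruneManifest` and `sourceExists` are illustrative names.

```javascript
// Hypothetical: each entry records the source image it was generated from (`src`).
function pruneManifest(manifest, sourceExists) {
  const removed = [];
  for (const [assetPath, entry] of manifest) {
    if (!sourceExists(entry.src)) {
      manifest.delete(assetPath); // deleting during Map iteration is safe in JS
      removed.push(assetPath);
    }
  }
  return removed;
}

// Example data: the source for team-800.webp has been deleted from the project.
const staleManifest = new Map([
  ["/assets/hero-800.webp", { hash: "a1", src: "src/assets/hero.jpg" }],
  ["/assets/team-800.webp", { hash: "b2", src: "src/assets/team.jpg" }],
]);
const filesOnDisk = new Set(["src/assets/hero.jpg"]);

const removedAssets = pruneManifest(staleManifest, (src) => filesOnDisk.has(src));

console.log(removedAssets);             // [ '/assets/team-800.webp' ]
console.log([...staleManifest.keys()]); // [ '/assets/hero-800.webp' ]
```

In a real project the `sourceExists` predicate could wrap `fs.existsSync`. Pruning by source file rather than by "seen in this build" matters here, because assets belonging to cached pages are by definition not seen in the current build.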
## Implementation Plan
Here is the step-by-step plan to implement the persistent caching solution.
- [x] Add `keyv` and `keyv-file` dependencies to `package.json`.
- [x] Implement the persistent asset manifest in `integration/index.js` using `keyv-file`.
- [ ] Verify the `load.js` hook correctly populates the in-memory `store`.
- [ ] Review and finalize caching documentation.