Overview
Data enrichment is the process of taking raw product records and progressively improving their quality through the pipeline stages. This guide covers the full enrichment flow (from initial import through Bronze, Silver normalization, and Gold scoring) plus advanced enrichment via Bright Data web scraping.
Step 1 — Import (Bronze)
The first step is getting your data into the system. All imports land in the Bronze layer: raw, unmodified, and idempotent.
Import methods
| Method | Best for |
|---|---|
| CSV/Excel upload | Existing supplier spreadsheets |
| URL import | Scraping product pages directly from the web |
| API (POST /api/workspace/{workspaceId}/catalogs/{catalogId}/products) | Programmatic feeds |
| MCP feed | Real-time catalog sync from external systems |
Every imported product is stored with pipeline_stage: "bronze".
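As a rough illustration of the API method in the table above, the sketch below builds (but does not send) a POST request for the documented path. Only the path template comes from this guide; the host, the bearer-token header, and the product fields are assumptions made for the example.

```python
import json
from urllib.request import Request

# Path template taken from the import methods table.
PRODUCTS_PATH = "/api/workspace/{workspaceId}/catalogs/{catalogId}/products"


def build_import_request(base_url, workspace_id, catalog_id, product, token):
    """Build a POST request for one product record without sending it.

    base_url, token, and the shape of `product` are hypothetical.
    """
    path = PRODUCTS_PATH.format(workspaceId=workspace_id, catalogId=catalog_id)
    return Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(product).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_import_request(
    "https://alana.example.com",  # placeholder host, not a real endpoint
    "ws_123",
    "cat_456",
    {"title": "Trail Running Shoe", "brand": "Acme"},  # illustrative fields
    token="YOUR_TOKEN",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; check the API reference for the actual authentication scheme and payload schema before doing so.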
Step 2 — Silver (Normalize)
Silver cleans and normalizes your raw data: standardizes casing, validates image URLs, detects duplicates, and maps fields to the Alana schema.
Run Silver via UI
- Open your catalog
- Select the products you want to normalize (or use Select All)
- Click Batch Actions → Normalize
- A progress indicator shows normalized / total
- When complete, review the results panel: fields mapped, duplicates found, broken URLs
Run Silver via API
Silver results
After Silver runs, each product record contains:
- fieldsNormalized: count of fields that were transformed
- duplicateOf: product ID if a duplicate was detected
- urlsValidated: count of image/media URLs checked
- pipeline_stage: "silver"
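The Silver result fields can be modeled as a small record type, which makes it easy to separate duplicates from unique products after a batch. This is a minimal sketch: the `product_id` field and the convention that `duplicateOf` is `None` when no duplicate was found are assumptions, not confirmed behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SilverResult:
    product_id: str              # hypothetical identifier field
    fieldsNormalized: int        # count of fields that were transformed
    duplicateOf: Optional[str]   # assumed to be None when no duplicate exists
    urlsValidated: int           # count of image/media URLs checked
    pipeline_stage: str = "silver"


def split_duplicates(results):
    """Return (unique, duplicates) based on whether duplicateOf is set."""
    unique = [r for r in results if r.duplicateOf is None]
    duplicates = [r for r in results if r.duplicateOf is not None]
    return unique, duplicates


results = [
    SilverResult("p1", fieldsNormalized=4, duplicateOf=None, urlsValidated=3),
    SilverResult("p2", fieldsNormalized=2, duplicateOf="p1", urlsValidated=1),
]
unique, duplicates = split_duplicates(results)
print([r.product_id for r in unique], [r.product_id for r in duplicates])
```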
Step 3 — Gold (Score & Analyze)
Gold produces an optimization score (0–100) and a gap list: the fields that, if filled, would most increase the score.
Run Gold via UI
- Select products in the catalog
- Click Batch Actions → Analyze
- A progress indicator shows analyzed / total
- When complete, each product shows its score badge and gap highlights
Run Gold via API
Step 4 — Review scores in Canvas
After Gold, open the Canvas to review and act on enrichment results:
- Sort products by score (ascending) to find the lowest-quality items
- For each product, the Gaps panel shows exactly which fields are missing
- Use the inline editor to fill gaps directly in Canvas
- Re-run Gold on edited products to update the score
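The same review order can be reproduced outside Canvas on exported data. The sketch below sorts products by score ascending and returns the lowest-scoring ones; the dictionary keys `id`, `score`, and `gaps` are hypothetical names chosen for the example.

```python
def lowest_scoring(products, limit=10):
    """Return up to `limit` products, lowest Gold score first."""
    return sorted(products, key=lambda p: p["score"])[:limit]


catalog = [
    {"id": "p1", "score": 82, "gaps": ["meta_description"]},
    {"id": "p2", "score": 35, "gaps": ["brand", "category", "images"]},
    {"id": "p3", "score": 58, "gaps": ["description", "gtin"]},
]

worklist = lowest_scoring(catalog, limit=2)
for product in worklist:
    print(product["id"], product["score"], product["gaps"])
```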
Step 5 — Bright Data enrichment
For products with incomplete data, Alana integrates with Bright Data to enrich via web scraping, SERP analysis, and dataset presets.
URL import with web scraping
When you import via a product URL, Bright Data’s Web Scraper extracts structured data directly from the product page.
SERP analysis for SEO
Use Bright Data’s SERP API to discover high-value keywords for your products. Results are used to optimize titles, descriptions, and meta fields during Gold scoring.
Dataset presets for competitive analysis
Bright Data dataset presets pull structured competitor data for categories like electronics, apparel, and home goods, which is useful for benchmarking your catalog against market leaders. Available presets:
- electronics_specs: technical specifications from major retailers
- apparel_sizing: size charts and fit data
- grocery_nutrition: nutritional information and ingredients
- home_goods_dimensions: physical dimensions and materials
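If you choose presets programmatically, a simple lookup table keeps the choice explicit. In this sketch the preset names come from the list above, while the category keys are assumptions; substitute whatever category values your catalog actually uses.

```python
from typing import Optional

# Preset names are from this guide; the category keys are hypothetical.
PRESET_BY_CATEGORY = {
    "electronics": "electronics_specs",
    "apparel": "apparel_sizing",
    "grocery": "grocery_nutrition",
    "home goods": "home_goods_dimensions",
}


def preset_for(category: str) -> Optional[str]:
    """Return the preset name for a category, or None if none applies."""
    return PRESET_BY_CATEGORY.get(category.strip().lower())


print(preset_for("Apparel"))
print(preset_for("toys"))
```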
Full enrichment flow
Best practices
Always run Silver before Gold
Gold scoring relies on normalized data. Running Gold on raw Bronze data produces artificially low scores because fields like brand and category are not linked yet.
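A small guard can enforce this ordering before a batch is submitted for scoring. The sketch assumes each product is a dictionary exposing the `pipeline_stage` value described in this guide; the `id` key is hypothetical.

```python
def partition_for_gold(products):
    """Split products into (ready, needs_silver).

    A product still at pipeline_stage "bronze" has not been normalized,
    so it is held back for a Silver run first.
    """
    ready, needs_silver = [], []
    for product in products:
        if product.get("pipeline_stage") == "bronze":
            needs_silver.append(product)
        else:
            ready.append(product)
    return ready, needs_silver


batch = [
    {"id": "p1", "pipeline_stage": "silver"},
    {"id": "p2", "pipeline_stage": "bronze"},
]
ready, needs_silver = partition_for_gold(batch)
print([p["id"] for p in ready], [p["id"] for p in needs_silver])
```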
Use SERP enrichment for top-priority products
SERP analysis has a cost per query. Run it on your highest-traffic SKUs or new product launches, not bulk catalogs.
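One way to apply this is to rank products by traffic and stop when the query budget runs out. The sketch below assumes one SERP query per product and uses integer cents to avoid rounding issues; the `monthly_views` field and the prices are made up for illustration.

```python
def pick_serp_candidates(products, budget_cents, cost_per_query_cents):
    """Return the highest-traffic products that fit within the budget.

    Assumes exactly one SERP query per product.
    """
    max_queries = budget_cents // cost_per_query_cents
    ranked = sorted(products, key=lambda p: p["monthly_views"], reverse=True)
    return ranked[:max_queries]


skus = [
    {"id": "p1", "monthly_views": 1200},
    {"id": "p2", "monthly_views": 15000},
    {"id": "p3", "monthly_views": 4300},
]

selected = pick_serp_candidates(skus, budget_cents=100, cost_per_query_cents=40)
print([p["id"] for p in selected])
```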
Prioritize gaps by score impact
The gaps list is ordered by impact. Filling the first gap will increase the score more than filling the last. Focus on the top 2–3 gaps per product.
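Because the list is already ordered by impact, taking the top gaps is a simple slice. The sketch builds a per-product worklist limited to the first three gaps; the `id` and `gaps` keys are hypothetical.

```python
def top_gaps(gaps, n=3):
    """Return the first n gaps; the input is assumed ordered by impact."""
    return gaps[:n]


def build_worklist(products, n=3):
    """Map each product ID to its highest-impact gaps."""
    return {p["id"]: top_gaps(p["gaps"], n) for p in products}


scored = [
    {"id": "p1", "gaps": ["brand", "category", "images", "gtin", "material"]},
    {"id": "p2", "gaps": ["description"]},
]
gap_worklist = build_worklist(scored)
print(gap_worklist)
```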
Re-run Gold after every edit session
Scores are not live — they reflect the last time Gold ran. Re-run Gold after filling gaps to get current scores.
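If your records carry timestamps, stale scores can be flagged by comparing the last edit time with the last Gold run. Both timestamp parameters below are hypothetical; this guide does not say whether such fields are exposed.

```python
from datetime import datetime, timezone


def is_score_stale(last_edited_at, last_gold_run_at):
    """Return True when the product changed after Gold last ran.

    A product that has never been scored (None) is treated as stale.
    """
    if last_gold_run_at is None:
        return True
    return last_edited_at > last_gold_run_at


edited = datetime(2024, 5, 2, 10, 0, tzinfo=timezone.utc)
scored_at = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
print(is_score_stale(edited, scored_at))
```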