
Overview

Data enrichment takes raw product records and progressively improves their quality as they move through the pipeline stages. This guide covers the full enrichment flow, from initial import (Bronze) through Silver normalization and Gold scoring, plus advanced enrichment via Bright Data web scraping.

Step 1 — Import (Bronze)

The first step is getting your data into the system. All imports land in the Bronze layer, where records are stored raw and unmodified. Imports are idempotent, so repeating the same import does not change the result.

Import methods

| Method | Best for |
| --- | --- |
| CSV/Excel upload | Existing supplier spreadsheets |
| URL import | Scraping product pages directly from the web |
| API (POST /api/workspace/{workspaceId}/catalogs/{catalogId}/products) | Programmatic feeds |
| MCP feed | Real-time catalog sync from external systems |

After import, products are visible in your catalog with pipeline_stage: "bronze".
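
For programmatic feeds, the API method can be sketched in Python as below. This is a minimal sketch under assumptions: the host and Bearer header are borrowed from the curl examples later in this guide, and the payload field names (title, sku) are hypothetical placeholders, not the documented product schema. The function only builds the request; it does not send it.

```python
import json
import urllib.request

# Host taken from the curl examples in this guide.
BASE_URL = "https://app.alana.shopping"


def build_product_import_request(workspace_id, catalog_id, api_key, product):
    """Build (but do not send) a POST request for the products endpoint."""
    url = f"{BASE_URL}/api/workspace/{workspace_id}/catalogs/{catalog_id}/products"
    return urllib.request.Request(
        url,
        data=json.dumps(product).encode("utf-8"),
        headers={
            # Assumption: same Bearer auth as the batch endpoints.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical payload: these field names are placeholders only.
request = build_product_import_request(
    "WORKSPACE_ID",
    "CATALOG_ID",
    "YOUR_API_KEY",
    {"title": "Blue Running Shoes", "sku": "SKU-001"},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect or log what a feed would submit before it touches the Bronze layer.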

Step 2 — Silver (Normalize)

Silver cleans and normalizes your raw data: standardizes casing, validates image URLs, detects duplicates, and maps fields to the Alana schema.

Run Silver via UI

  1. Open your catalog
  2. Select the products you want to normalize (or use Select All)
  3. Click Batch Actions → Normalize
  4. A progress indicator shows normalized / total
  5. When complete, review the results panel: fields mapped, duplicates found, broken URLs

Run Silver via API

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/batch/silver" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scope": "all"}'

Silver results

After Silver runs, each product record contains:
  • fieldsNormalized — count of fields that were transformed
  • duplicateOf — product ID if a duplicate was detected
  • urlsValidated — count of image/media URLs checked
  • pipeline_stage: "silver"
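
If you export or fetch product records after a Silver run, the fields above can be rolled up into a quick summary. The sketch below assumes records arrive as plain dictionaries with an id key, and that duplicateOf is either missing or null for non-duplicates; both are assumptions, and the sample records are invented for illustration.

```python
def summarize_silver(records):
    """Aggregate the per-product Silver fields for records at the silver stage."""
    silver = [r for r in records if r.get("pipeline_stage") == "silver"]
    return {
        "products": len(silver),
        "fields_normalized": sum(r.get("fieldsNormalized", 0) for r in silver),
        "urls_validated": sum(r.get("urlsValidated", 0) for r in silver),
        # Map each duplicate to the product it duplicates.
        "duplicates": {
            r["id"]: r["duplicateOf"] for r in silver if r.get("duplicateOf")
        },
    }


# Invented sample records; only the four Silver field names come from this guide.
records = [
    {"id": "prod_123", "pipeline_stage": "silver", "fieldsNormalized": 4,
     "duplicateOf": None, "urlsValidated": 3},
    {"id": "prod_456", "pipeline_stage": "silver", "fieldsNormalized": 2,
     "duplicateOf": "prod_123", "urlsValidated": 1},
    {"id": "prod_789", "pipeline_stage": "bronze"},
]
summary = summarize_silver(records)
print(summary)
```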

Step 3 — Gold (Score & Analyze)

Gold produces an optimization score (0–100) and a gap list — the fields that, if filled, would most increase the score.

Run Gold via UI

  1. Select products in the catalog
  2. Click Batch Actions → Analyze
  3. A progress indicator shows analyzed / total
  4. When complete, each product shows its score badge and gap highlights

Run Gold via API

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/batch/gold" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scope": "all"}'
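
The Silver and Gold batch calls differ only in the last path segment, so one helper can build either. The sketch below mirrors the two curl commands in Python and returns the requests in Silver-then-Gold order. It only builds requests: this guide does not describe a status endpoint or response format, so waiting for Silver to finish before sending the Gold request is left to the caller. The function names are illustrative.

```python
import json
import urllib.request

# Host taken from the curl examples in this guide.
BASE_URL = "https://app.alana.shopping"
BATCH_STAGES = ("silver", "gold")


def build_batch_request(stage, workspace_id, catalog_id, api_key):
    """Build (but do not send) the batch request for one pipeline stage."""
    if stage not in BATCH_STAGES:
        raise ValueError(f"unknown batch stage: {stage!r}")
    url = (
        f"{BASE_URL}/api/workspace/{workspace_id}"
        f"/catalogs/{catalog_id}/batch/{stage}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"scope": "all"}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_pipeline_requests(workspace_id, catalog_id, api_key):
    """Return the Silver request followed by the Gold request."""
    return [
        build_batch_request(stage, workspace_id, catalog_id, api_key)
        for stage in BATCH_STAGES
    ]


for req in build_pipeline_requests("WORKSPACE_ID", "CATALOG_ID", "YOUR_API_KEY"):
    print(req.get_method(), req.full_url)
```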

Step 4 — Review scores in Canvas

After Gold, open the Canvas to review and act on enrichment results:
  1. Sort products by score (ascending) to find the lowest-quality items
  2. For each product, the Gaps panel shows exactly which fields are missing
  3. Use the inline editor to fill gaps directly in Canvas
  4. Re-run Gold on edited products to update the score
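
The same triage can be done outside Canvas on exported data. The sketch below sorts products by score in ascending order and keeps only the first few gaps for each, relying on the gap list already being ordered by impact. The keys id, score, and gaps are hypothetical names for illustration, and the sample data is invented.

```python
def review_queue(products, top_gaps=3):
    """Return (id, score, leading gaps) tuples, lowest score first."""
    ranked = sorted(products, key=lambda p: p["score"])
    return [(p["id"], p["score"], p["gaps"][:top_gaps]) for p in ranked]


# Invented sample data; the key names are assumptions.
products = [
    {"id": "prod_123", "score": 82, "gaps": ["material"]},
    {"id": "prod_456", "score": 41, "gaps": ["brand", "category", "gtin", "color"]},
    {"id": "prod_789", "score": 67, "gaps": ["description", "image_alt"]},
]

for product_id, score, gaps in review_queue(products, top_gaps=2):
    print(product_id, score, gaps)
```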

Step 5 — Bright Data enrichment

For products with incomplete data, Alana integrates with Bright Data to enrich via web scraping, SERP analysis, and dataset presets.

URL import with web scraping

When you import via a product URL, Bright Data’s Web Scraper extracts structured data directly from the product page.

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/url-import" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/blue-running-shoes",
    "catalogId": "CATALOG_ID",
    "method": "web_scraper"
  }'
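
To import several product pages, one option is to prepare one request body per URL. The sketch below assumes the endpoint accepts a single url per request, as in the curl example, and rejects anything that is not an absolute http(s) URL before a request is made. The helper name is illustrative.

```python
from urllib.parse import urlparse


def build_url_import_payloads(urls, catalog_id):
    """Build one url-import request body per product URL."""
    payloads = []
    for url in urls:
        parts = urlparse(url)
        # Catch obviously malformed input before spending a scrape on it.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")
        payloads.append(
            {"url": url, "catalogId": catalog_id, "method": "web_scraper"}
        )
    return payloads


payloads = build_url_import_payloads(
    ["https://example.com/product/blue-running-shoes"], "CATALOG_ID"
)
print(payloads)
```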

SERP analysis for SEO

Use Bright Data’s SERP API to discover high-value keywords for your products. Results are used to optimize titles, descriptions, and meta fields during Gold scoring.

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/enrich/serp" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "productIds": ["prod_123", "prod_456"],
    "locale": "en-US"
  }'
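
Because the request body takes an explicit productIds list, you can limit SERP analysis to a chosen subset of the catalog. The sketch below picks the top products by a hypothetical monthly_views field and builds the body shown above. The traffic field and the sample numbers are assumptions for illustration; use whatever traffic signal you have.

```python
def build_serp_payload(products, limit, locale="en-US"):
    """Build a SERP enrichment body for the `limit` highest-traffic products."""
    ranked = sorted(products, key=lambda p: p["monthly_views"], reverse=True)
    return {
        "productIds": [p["id"] for p in ranked[:limit]],
        "locale": locale,
    }


# Invented traffic data; monthly_views is a hypothetical field.
products = [
    {"id": "prod_123", "monthly_views": 950},
    {"id": "prod_456", "monthly_views": 12000},
    {"id": "prod_789", "monthly_views": 40},
]
payload = build_serp_payload(products, limit=2)
print(payload)
```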

Dataset presets for competitive analysis

Bright Data dataset presets pull structured competitor data for categories like electronics, apparel, and home goods — useful for benchmarking your catalog against market leaders. Available presets:
  • electronics_specs — technical specifications from major retailers
  • apparel_sizing — size charts and fit data
  • grocery_nutrition — nutritional information and ingredients
  • home_goods_dimensions — physical dimensions and materials
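
If you select presets in code, a small lookup keeps the preset names in one place. In the sketch below the preset names and descriptions come from the list above, while the category-to-preset mapping is a hypothetical convention for illustration, not part of the Alana or Bright Data APIs.

```python
# Preset names and descriptions as listed in this guide.
PRESETS = {
    "electronics_specs": "technical specifications from major retailers",
    "apparel_sizing": "size charts and fit data",
    "grocery_nutrition": "nutritional information and ingredients",
    "home_goods_dimensions": "physical dimensions and materials",
}

# Hypothetical mapping from your own category labels to a preset.
CATEGORY_TO_PRESET = {
    "electronics": "electronics_specs",
    "apparel": "apparel_sizing",
    "grocery": "grocery_nutrition",
    "home goods": "home_goods_dimensions",
}


def preset_for(category):
    """Return the preset name for a category label, or None if there is none."""
    return CATEGORY_TO_PRESET.get(category.strip().lower())


print(preset_for("Apparel"), "->", PRESETS[preset_for("Apparel")])
```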

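Full enrichment flow

Import (Bronze) → Silver (Normalize) → Gold (Score & Analyze) → Review scores in Canvas → Bright Data enrichment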


Best practices

  • Gold scoring relies on normalized data. Running Gold on raw Bronze data produces artificially low scores because fields like brand and category are not linked yet.
  • SERP analysis has a cost per query. Run it on your highest-traffic SKUs or new product launches rather than on entire catalogs.
  • The gaps list is ordered by impact: filling the first gap increases the score more than filling the last. Focus on the top 2–3 gaps per product.
  • Scores are not live: they reflect the last time Gold ran. Re-run Gold after filling gaps to get current scores.
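
Since scores only reflect the last Gold run, it can help to track which products were edited after they were scored. The sketch below compares two hypothetical ISO 8601 timestamps per product, edited_at and scored_at, and returns the IDs that need another Gold run. Neither field name comes from the Alana schema; they stand in for whatever edit and scoring timestamps you record.

```python
from datetime import datetime


def needs_rescore(product):
    """True if the product was edited after its last Gold run."""
    edited = datetime.fromisoformat(product["edited_at"])
    scored = datetime.fromisoformat(product["scored_at"])
    return edited > scored


def stale_product_ids(products):
    """IDs of products whose score predates their latest edit."""
    return [p["id"] for p in products if needs_rescore(p)]


# Invented sample data; edited_at and scored_at are hypothetical fields.
products = [
    {"id": "prod_123", "edited_at": "2026-03-10T09:00:00+00:00",
     "scored_at": "2026-03-12T14:30:00+00:00"},
    {"id": "prod_456", "edited_at": "2026-03-15T11:45:00+00:00",
     "scored_at": "2026-03-12T14:30:00+00:00"},
]
stale = stale_product_ids(products)
print(stale)
```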
Last modified on March 18, 2026