Data Enrichment - Alana Shopping B2B

Overview

Data enrichment takes raw product records and progressively improves their quality as they move through the pipeline stages. This guide covers the full flow: import into Bronze, Silver normalization, Gold scoring, and score review in Canvas, plus advanced enrichment with Bright Data (web scraping, SERP analysis, and dataset presets).

Step 1 — Import (Bronze)

The first step is getting your data into the system. All imports land in the Bronze layer, which stores records raw and unmodified. Imports are idempotent, so repeating the same import leaves the Bronze data unchanged.

Import methods

Method	Best for
CSV/Excel upload	Existing supplier spreadsheets
URL import	Scraping product pages directly from the web
API (`POST /api/workspace/{workspaceId}/catalogs/{catalogId}/products`)	Programmatic feeds
MCP feed	Real-time catalog sync from external systems

After import, products are visible in your catalog with pipeline_stage: "bronze".

Step 2 — Silver (Normalize)

Silver cleans and normalizes your raw data: standardizes casing, validates image URLs, detects duplicates, and maps fields to the Alana schema.

Run Silver via UI

1. Open your catalog
2. Select the products you want to normalize (or use Select All)
3. Click Batch Actions → Normalize
4. A progress indicator shows normalized / total
5. When complete, review the results panel: fields mapped, duplicates found, broken URLs

Run Silver via API

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/batch/silver" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scope": "all"}'
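
For scripted runs, the curl call above can be mirrored in Python. The following is a minimal sketch using only the standard library. `build_batch_request` is a hypothetical helper name, while the URL, headers, and body are taken from the curl example. It builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://app.alana.shopping/api"

def build_batch_request(workspace_id, catalog_id, stage, api_key, scope="all"):
    """Build (but do not send) the POST request for a batch pipeline stage.

    Hypothetical helper: only the "silver" and "gold" batch endpoints
    appear in this guide, so other stage names are rejected.
    """
    if stage not in ("silver", "gold"):
        raise ValueError(f"unsupported stage: {stage!r}")
    url = f"{BASE_URL}/workspace/{workspace_id}/catalogs/{catalog_id}/batch/{stage}"
    body = json.dumps({"scope": scope}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

request = build_batch_request("WORKSPACE_ID", "CATALOG_ID", "silver", "YOUR_API_KEY")
# To actually run the batch, pass the request to urllib.request.urlopen(request).
```

Passing "gold" as the stage produces the Gold batch URL used in Step 3.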

Silver results

After Silver runs, each product record contains:

fieldsNormalized — count of fields that were transformed
duplicateOf — product ID if a duplicate was detected
urlsValidated — count of image/media URLs checked
pipeline_stage: "silver"
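
The fields above lend themselves to a quick post-run summary. This is a sketch only: it assumes each product record is available as a Python dict carrying the field names listed above, plus an `id` key, which is an assumption not confirmed by this guide.

```python
def summarize_silver(records):
    """Aggregate Silver result fields across product records.

    Assumes each record is a dict with the Silver fields from this guide
    and a hypothetical "id" key. Records not at the silver stage are skipped.
    """
    summary = {"normalized_fields": 0, "urls_validated": 0, "duplicates": []}
    for record in records:
        if record.get("pipeline_stage") != "silver":
            continue
        summary["normalized_fields"] += record.get("fieldsNormalized", 0)
        summary["urls_validated"] += record.get("urlsValidated", 0)
        if record.get("duplicateOf"):
            # Pair the duplicate with the product it duplicates.
            summary["duplicates"].append((record["id"], record["duplicateOf"]))
    return summary

records = [
    {"id": "prod_123", "pipeline_stage": "silver", "fieldsNormalized": 4,
     "urlsValidated": 2, "duplicateOf": None},
    {"id": "prod_456", "pipeline_stage": "silver", "fieldsNormalized": 1,
     "urlsValidated": 3, "duplicateOf": "prod_123"},
    {"id": "prod_789", "pipeline_stage": "bronze"},
]
summary = summarize_silver(records)
```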

Step 3 — Gold (Score & Analyze)

Gold produces an optimization score (0–100) and a gap list — the fields that, if filled, would most increase the score.

Run Gold via UI

1. Select products in the catalog
2. Click Batch Actions → Analyze
3. A progress indicator shows analyzed / total
4. When complete, each product shows its score badge and gap highlights

Run Gold via API

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/batch/gold" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"scope": "all"}'

Step 4 — Review scores in Canvas

After Gold, open the Canvas to review and act on enrichment results:

1. Sort products by score (ascending) to find the lowest-quality items
2. For each product, check the Gaps panel, which shows exactly which fields are missing
3. Use the inline editor to fill gaps directly in Canvas
4. Re-run Gold on edited products to update the score
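
The same ascending sort can be reproduced outside Canvas, for example on exported data. The sketch below assumes products are dicts with hypothetical `id`, `score`, and `gaps` keys. The export shape is an assumption, not a documented format.

```python
def lowest_scoring(products, limit=10):
    """Return up to `limit` products with the lowest Gold scores.

    Products without a score (Gold has not run) are left out, since
    there is nothing to rank them by. Key names are hypothetical.
    """
    scored = [p for p in products if p.get("score") is not None]
    return sorted(scored, key=lambda p: p["score"])[:limit]

products = [
    {"id": "prod_123", "score": 82, "gaps": ["material"]},
    {"id": "prod_456", "score": 35, "gaps": ["brand", "description"]},
    {"id": "prod_789", "score": None, "gaps": []},
]
worklist = lowest_scoring(products, limit=2)
worklist_ids = [p["id"] for p in worklist]
```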

Step 5 — Bright Data enrichment

For products with incomplete data, Alana integrates with Bright Data to enrich via web scraping, SERP analysis, and dataset presets.

URL import with web scraping

When you import via a product URL, Bright Data’s Web Scraper extracts structured data directly from the product page.

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/url-import" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/blue-running-shoes",
    "catalogId": "CATALOG_ID",
    "method": "web_scraper"
  }'
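
When URLs come from a spreadsheet or feed, it can help to check them before calling the endpoint. The sketch below builds the same JSON body as the curl example. The `url_import_payload` helper and its validation step are illustrative additions, not part of the Alana API.

```python
from urllib.parse import urlparse

def url_import_payload(url, catalog_id, method="web_scraper"):
    """Build the JSON body for a URL import after a basic sanity check.

    Hypothetical helper: the key names mirror the curl example, and the
    http(s) check is a client-side precaution added for this sketch.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return {"url": url, "catalogId": catalog_id, "method": method}

payload = url_import_payload(
    "https://example.com/product/blue-running-shoes", "CATALOG_ID"
)
```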

SERP analysis for SEO

Use Bright Data’s SERP API to discover high-value keywords for your products. Results are used to optimize titles, descriptions, and meta fields during Gold scoring.

curl -X POST "https://app.alana.shopping/api/workspace/WORKSPACE_ID/catalogs/CATALOG_ID/enrich/serp" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "productIds": ["prod_123", "prod_456"],
    "locale": "en-US"
  }'

Dataset presets for competitive analysis

Bright Data dataset presets pull structured competitor data for categories like electronics, apparel, and home goods — useful for benchmarking your catalog against market leaders. Available presets:

electronics_specs — technical specifications from major retailers
apparel_sizing — size charts and fit data
grocery_nutrition — nutritional information and ingredients
home_goods_dimensions — physical dimensions and materials

Full enrichment flow

The steps in this guide run in order: Import (Bronze) → Silver (Normalize) → Gold (Score & Analyze) → Review scores in Canvas → Bright Data enrichment.

Best practices

Always run Silver before Gold

Gold scoring relies on normalized data. Running Gold on raw Bronze data produces artificially low scores because fields like brand and category are not linked yet.
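
A simple guard can enforce this ordering in scripts. The sketch below assumes product records are dicts with a hypothetical `id` key and the `pipeline_stage` field shown earlier. Anything still at "bronze" is routed to Silver first.

```python
def split_for_gold(products):
    """Separate products that can be scored from those still needing Silver.

    Only the "bronze" value is checked, because it is the one stage this
    guide confirms as not yet normalized. The "id" key is an assumption.
    """
    ready, needs_silver = [], []
    for product in products:
        if product.get("pipeline_stage") == "bronze":
            needs_silver.append(product["id"])
        else:
            ready.append(product["id"])
    return ready, needs_silver

ready, needs_silver = split_for_gold([
    {"id": "prod_123", "pipeline_stage": "bronze"},
    {"id": "prod_456", "pipeline_stage": "silver"},
])
```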

Use SERP enrichment for top-priority products

SERP analysis has a cost per query. Run it on your highest-traffic SKUs or new product launches rather than on entire catalogs.
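
One way to keep SERP spend bounded is to cap the number of products per request. This sketch assumes a hypothetical `monthly_views` traffic field on each product. The payload keys match the SERP curl example earlier in this guide.

```python
def serp_payload(products, max_products, locale="en-US"):
    """Build a SERP enrichment body for the highest-traffic products only.

    "monthly_views" is a hypothetical traffic metric; substitute whatever
    traffic signal your catalog actually has.
    """
    ranked = sorted(products, key=lambda p: p.get("monthly_views", 0), reverse=True)
    return {
        "productIds": [p["id"] for p in ranked[:max_products]],
        "locale": locale,
    }

serp_body = serp_payload(
    [
        {"id": "prod_123", "monthly_views": 900},
        {"id": "prod_456", "monthly_views": 12000},
        {"id": "prod_789", "monthly_views": 150},
    ],
    max_products=2,
)
```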

Prioritize gaps by score impact

The gaps list is ordered by impact. Filling the first gap will increase the score more than filling the last. Focus on the top 2–3 gaps per product.
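
Because the list is already ordered by impact, taking the top gaps is a matter of slicing. The sketch below assumes a product dict with a hypothetical `gaps` key holding field names in the order Gold returned them.

```python
def top_gaps(product, limit=3):
    """Return the first `limit` gaps, relying on the impact ordering from Gold.

    The "gaps" key and its list-of-field-names shape are assumptions.
    """
    return product.get("gaps", [])[:limit]

focus = top_gaps(
    {"id": "prod_456", "gaps": ["brand", "category", "description", "material"]}
)
```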

Re-run Gold after every edit session

Scores are not live — they reflect the last time Gold ran. Re-run Gold after filling gaps to get current scores.
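
If your records carry timestamps, stale scores can be detected automatically. The sketch below assumes hypothetical `edited_at` and `scored_at` fields holding ISO 8601 strings. Neither field name is confirmed by this guide.

```python
from datetime import datetime

def needs_rescore(product):
    """Return True when a product was edited after its last Gold run.

    "edited_at" and "scored_at" are hypothetical ISO 8601 timestamp fields.
    A product that has never been scored always needs a Gold run.
    """
    scored_at = product.get("scored_at")
    edited_at = product.get("edited_at")
    if scored_at is None:
        return True
    if edited_at is None:
        return False
    return datetime.fromisoformat(edited_at) > datetime.fromisoformat(scored_at)

stale = needs_rescore({
    "id": "prod_123",
    "scored_at": "2024-05-01T10:00:00+00:00",
    "edited_at": "2024-05-02T09:30:00+00:00",
})
```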
