Test Data Scripts Documentation

This directory contains scripts for creating and managing test data for the Grocery Tracker application.

Scripts Overview

1. create_test_data.py - Comprehensive Test Data Generator

Creates realistic test data including shops, groceries, and shopping events.

Basic Usage

# Create all test data (default: 30 events over 90 days)
python create_test_data.py

# Create with custom parameters
python create_test_data.py --events 50 --days 120

# Verbose output
python create_test_data.py --verbose

# Dry run (see what would be created without creating it)
python create_test_data.py --dry-run

Command Line Options

Option            Description                                           Default
--events N        Number of shopping events to create                   30
--days N          Number of days back to generate events                90
--url URL         API base URL                                          http://localhost:8000
--shops-only      Create only shops                                     False
--groceries-only  Create only groceries                                 False
--events-only     Create only shopping events (requires existing data)  False
--verbose, -v     Verbose output with detailed progress                 False
--dry-run         Show what would be created without creating it        False
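
For orientation, the options above map naturally onto argparse. The following is a minimal sketch, not the actual parser in create_test_data.py: the function name build_parser and the decision to make the three "only" flags mutually exclusive are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Create Grocery Tracker test data")
    parser.add_argument("--events", type=int, default=30, metavar="N",
                        help="number of shopping events to create")
    parser.add_argument("--days", type=int, default=90, metavar="N",
                        help="number of days back to generate events")
    parser.add_argument("--url", default="http://localhost:8000",
                        help="API base URL")
    # Assumption: the three "only" modes cannot be combined.
    only = parser.add_mutually_exclusive_group()
    only.add_argument("--shops-only", action="store_true")
    only.add_argument("--groceries-only", action="store_true")
    only.add_argument("--events-only", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


# Equivalent to: python create_test_data.py --events-only --events 100
args = build_parser().parse_args(["--events-only", "--events", "100"])
print(args.events, args.days, args.events_only, args.dry_run)  # 100 90 True False
```

Options that are not passed keep the defaults listed in the table, which is why --days is still 90 in the example.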

Examples

# Create only shops
python create_test_data.py --shops-only

# Create only groceries
python create_test_data.py --groceries-only

# Create 100 shopping events using existing shops and groceries
python create_test_data.py --events-only --events 100

# Create test data for the past 6 months with verbose output
python create_test_data.py --events 60 --days 180 --verbose

# Preview what would be created without actually creating it
python create_test_data.py --dry-run

# Use a different API URL
python create_test_data.py --url http://localhost:3000

2. cleanup_test_data.py - Data Cleanup Script

Safely removes all test data with confirmation prompts.

python cleanup_test_data.py

Data Structure

Shops (10 total)

  • Whole Foods Market (San Francisco)
  • Safeway (San Francisco)
  • Trader Joe's (Berkeley)
  • Berkeley Bowl (Berkeley)
  • Rainbow Grocery (San Francisco)
  • Mollie Stone's Market (Palo Alto)
  • Costco Wholesale (San Mateo)
  • Target (Mountain View)
  • Sprouts Farmers Market (Sunnyvale)
  • Lucky Supermarket (San Jose)

Groceries (50+ items across 8 categories)

Category        Items     Organic Options
Fruits          10 items  5 organic
Vegetables      10 items  5 organic
Dairy           7 items   4 organic
Meat & Seafood  6 items   3 organic
Pantry          10 items  5 organic
Beverages       6 items   3 organic
Frozen          5 items   2 organic
Snacks          5 items   3 organic

Shopping Events

  • Realistic dates: Distributed over the specified time period (default: 90 days)
  • Smart quantities: Appropriate amounts based on item type
  • Category-based pricing: Realistic price ranges per category
  • Organic premiums: 20-50% higher prices for organic items
  • Random notes: 30% of events include descriptive notes
  • Varied trip sizes: 2-8 items per shopping trip
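
The rules above can be sketched as a single generator function. This is an illustration under stated assumptions, not the code in create_test_data.py: the function name make_event, the PRICE_RANGES values, the dictionary layout, and the fixed note text are all hypothetical. Only the percentages, the item counts, and the date window come from the list above.

```python
import random
from datetime import date, timedelta

# Hypothetical per-category (min, max) base prices; the real ranges may differ.
PRICE_RANGES = {"Fruits": (0.50, 6.00), "Dairy": (1.50, 8.00)}


def make_event(groceries, days=90, rng=random, today=None):
    """Build one shopping event following the documented rules."""
    today = today or date.today()
    # Pick a date somewhere in the last `days` days.
    event_date = today - timedelta(days=rng.randint(0, days))
    # 2-8 items per trip, capped by how many groceries exist.
    count = min(len(groceries), rng.randint(2, 8))
    items = []
    for grocery in rng.sample(groceries, k=count):
        low, high = PRICE_RANGES[grocery["category"]]
        price = rng.uniform(low, high)
        if grocery["organic"]:
            price *= rng.uniform(1.20, 1.50)  # 20-50% organic premium
        items.append({"name": grocery["name"], "price": round(price, 2)})
    # Roughly 30% of events carry a note.
    note = "Weekly shop" if rng.random() < 0.30 else None
    return {"date": event_date.isoformat(), "items": items, "note": note}


groceries = [
    {"name": "Apples", "category": "Fruits", "organic": False},
    {"name": "Organic Milk", "category": "Dairy", "organic": True},
    {"name": "Bananas", "category": "Fruits", "organic": True},
]
event = make_event(groceries, days=90, rng=random.Random(42))
print(event["date"], len(event["items"]), event["note"])
```

Passing a seeded random.Random instance makes a run reproducible, which helps when a test depends on specific generated values.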

Features

Smart Data Generation

  • Realistic pricing: Category-based price ranges with organic premiums
  • Appropriate quantities: Items sold by piece, weight, or volume as appropriate
  • Temporal distribution: Events spread realistically over time
  • Shopping patterns: Varied trip sizes and frequencies

Error Handling

  • Graceful failures: Script continues even if some items fail
  • Network timeouts: Reasonable timeout values for API calls
  • Progress tracking: Clear feedback on creation progress
  • Connection testing: Verifies API availability before starting
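
A rough sketch of how these four behaviours can fit together is shown below. It is not taken from create_test_data.py: the names api_is_up and create_all are hypothetical, and the sketch uses the standard library's urllib instead of requests so that it runs without extra dependencies.

```python
import urllib.error
import urllib.request


def api_is_up(base_url, timeout=5.0):
    """Return True if the server answers at all, False if it cannot be reached."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with a 4xx/5xx status
    except (urllib.error.URLError, OSError):
        return False  # refused connection, DNS failure, or timeout


def create_all(items, create_one, verbose=False):
    """Try every item; record failures instead of aborting the run."""
    created, failed = 0, []
    for index, item in enumerate(items, start=1):
        try:
            create_one(item)
            created += 1
        except Exception as exc:  # keep going on any per-item failure
            failed.append((item, exc))
            if verbose:
                print(f"  failed: {item!r}: {exc}")
        print(f"[{index}/{len(items)}] processed")
    return created, failed


def fake_create(item):
    """Stand-in for a real POST to the API; rejects one item on purpose."""
    if item == "bad":
        raise ValueError("rejected")


created, failed = create_all(["milk", "bad", "eggs"], fake_create)
print(created, len(failed))  # 2 1
```

In a real run, api_is_up would be called once before create_all, so that an unreachable server is reported up front rather than as a long list of per-item failures.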

Flexible Options

  • Partial creation: Create only specific data types
  • Custom parameters: Adjust event count and date range
  • Dry run mode: Preview without creating data
  • Verbose output: Detailed progress information
  • Custom API URLs: Support for different backend configurations

Troubleshooting

Common Issues

  1. Connection Error

    ❌ Cannot connect to the API server at http://localhost:8000
    

    Solution: Make sure the backend server is running:

    cd backend
    uvicorn main:app --reload
    
  2. Module Not Found Error

    ModuleNotFoundError: No module named 'requests'
    

    Solution: Install dependencies:

    pip install -r requirements.txt
    
  3. Database Connection Error

    Solution: Ensure PostgreSQL is running and the database exists:

    # Check if PostgreSQL is running (Homebrew installations)
    brew services list | grep postgresql
    
    # Start PostgreSQL if needed
    brew services start postgresql
    
  4. Partial Data Creation

    If some items fail to create, the script will continue and report the final count. Use --verbose to see detailed error messages.

Performance Tips

  • Large datasets: For creating many events (100+), consider running in smaller batches
  • Network issues: Use --verbose to identify specific failures
  • Database performance: Ensure your database has adequate resources for bulk operations

Data Cleanup

To remove all test data:

python cleanup_test_data.py

This script will:

  1. Show what data exists
  2. Ask for confirmation before deletion
  3. Handle foreign key constraints properly
  4. Provide progress feedback
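
The four steps above could look roughly like the sketch below. It is illustrative only: the function name cleanup, the resource names, and the prompt text are hypothetical, and the deletion order rests on the assumption that shopping events reference shops and groceries and therefore have to be removed first.

```python
def cleanup(counts, delete_all, confirm=input, force=False):
    """Delete test data children-first so foreign keys are never violated.

    counts: mapping of resource name -> number of existing rows
    delete_all: callable(resource) that removes every row of that resource
    """
    # Assumption: events depend on groceries and shops, so they go first.
    order = ["shopping_events", "groceries", "shops"]
    # Step 1: show what data exists.
    for resource in order:
        print(f"{resource}: {counts.get(resource, 0)} rows")
    # Step 2: ask for confirmation unless explicitly forced.
    if not force and confirm("Delete all of the above? [y/N] ").strip().lower() != "y":
        print("Aborted; nothing was deleted.")
        return []
    # Steps 3 and 4: delete in dependency order and report progress.
    deleted = []
    for step, resource in enumerate(order, start=1):
        delete_all(resource)
        deleted.append(resource)
        print(f"[{step}/{len(order)}] deleted {resource}")
    return deleted


log = []
cleanup({"shops": 10, "groceries": 59, "shopping_events": 30},
        delete_all=log.append, confirm=lambda prompt: "y")
print(log)  # ['shopping_events', 'groceries', 'shops']
```

Injecting confirm and delete_all keeps the ordering logic testable without a database or an interactive terminal.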

Integration with Application

After running the test data scripts:

  1. Frontend: Refresh the application to see new data
  2. API: All endpoints will return the test data
  3. Database: Data is persisted and will survive server restarts

The test data is designed to showcase all application features:

  • Multiple shops and locations
  • Diverse grocery categories
  • Realistic shopping patterns
  • Price variations and organic options
  • Historical data for analytics

Advanced Usage

Custom Data Sets

You can modify the data arrays in create_test_data.py to create custom test scenarios:

  • Add more shops for specific regions
  • Include specialty grocery categories
  • Adjust price ranges for different markets
  • Create seasonal shopping patterns

Automated Testing

The scripts can be integrated into automated testing workflows:

# Setup test environment
python create_test_data.py --events 10 --days 30

# Run tests
pytest

# Cleanup
python cleanup_test_data.py --force  # requires adding a --force option to skip the confirmation prompt
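
The same setup, test, and cleanup sequence can be wrapped in Python so that cleanup still runs when a test fails. This is a sketch under assumptions: the name seeded_data is hypothetical, the scripts are assumed to be in the current working directory, and --force does not exist unless you add it to cleanup_test_data.py.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def seeded_data(events=10, days=30, run=subprocess.run):
    """Create test data on entry and remove it on exit, even if tests fail."""
    run([sys.executable, "create_test_data.py",
         "--events", str(events), "--days", str(days)], check=True)
    try:
        yield
    finally:
        # --force is hypothetical: this README only suggests adding it.
        run([sys.executable, "cleanup_test_data.py", "--force"], check=True)


# Demonstration with a fake runner that records commands instead of executing them.
calls = []
with seeded_data(run=lambda cmd, check: calls.append(cmd[1:])):
    pass  # run the tests that need the data here
print(calls)
```

With the default run=subprocess.run, check=True raises CalledProcessError if either script exits with a non-zero status, so a failed setup stops the test run early.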