
# Test Data Scripts Documentation

This directory contains scripts for creating and managing test data for the Product Tracker application.

## Scripts Overview

### 1. `create_test_data.py` - Comprehensive Test Data Generator

Creates realistic test data including shops, products, and shopping events.

#### Basic Usage

```bash
# Create all test data (default: 30 events over 90 days)
python create_test_data.py

# Create with custom parameters
python create_test_data.py --events 50 --days 120

# Verbose output
python create_test_data.py --verbose

# Dry run (see what would be created without creating it)
python create_test_data.py --dry-run
```

#### Command Line Options

| Option | Description | Default |
|--------|-------------|---------|
| `--events N` | Number of shopping events to create | `30` |
| `--days N` | Number of days back to generate events | `90` |
| `--url URL` | API base URL | `http://localhost:8000` |
| `--shops-only` | Create only shops | `False` |
| `--products-only` | Create only products | `False` |
| `--events-only` | Create only shopping events (requires existing data) | `False` |
| `--verbose`, `-v` | Verbose output with detailed progress | `False` |
| `--dry-run` | Show what would be created without creating it | `False` |
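
For readers who want to see how the options above map to code, here is a minimal `argparse` sketch that mirrors the table. It is not taken from `create_test_data.py`; the function name `build_parser` is hypothetical, and treating the three `*-only` flags as mutually exclusive is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options table (a sketch, not the script's code)."""
    parser = argparse.ArgumentParser(description="Create test data")
    parser.add_argument("--events", type=int, default=30,
                        help="Number of shopping events to create")
    parser.add_argument("--days", type=int, default=90,
                        help="Number of days back to generate events")
    parser.add_argument("--url", default="http://localhost:8000",
                        help="API base URL")
    # Assumption: only one of the *-only flags may be given at a time.
    only = parser.add_mutually_exclusive_group()
    only.add_argument("--shops-only", action="store_true")
    only.add_argument("--products-only", action="store_true")
    only.add_argument("--events-only", action="store_true")
    parser.add_argument("--verbose", "-v", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


args = build_parser().parse_args(["--events-only", "--events", "100"])
print(args.events, args.days, args.events_only, args.dry_run)
# prints: 100 90 True False
```

Unset flags fall back to the defaults listed in the table, so `--days` stays at 90 in the call above.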

#### Examples

```bash
# Create only shops
python create_test_data.py --shops-only

# Create only products
python create_test_data.py --products-only

# Create 100 shopping events using existing shops and products
python create_test_data.py --events-only --events 100

# Create test data for the past 6 months with verbose output
python create_test_data.py --events 60 --days 180 --verbose

# Preview what would be created without actually creating it
python create_test_data.py --dry-run

# Use a different API URL
python create_test_data.py --url http://localhost:3000
```

### 2. `cleanup_test_data.py` - Data Cleanup Script

Safely removes all test data with confirmation prompts.

```bash
python cleanup_test_data.py
```

## Data Structure

### Shops (10 total)
- **Whole Foods Market** (San Francisco)
- **Safeway** (San Francisco)
- **Trader Joe's** (Berkeley)
- **Berkeley Bowl** (Berkeley)
- **Rainbow Product** (San Francisco)
- **Mollie Stone's Market** (Palo Alto)
- **Costco Wholesale** (San Mateo)
- **Target** (Mountain View)
- **Sprouts Farmers Market** (Sunnyvale)
- **Lucky Supermarket** (San Jose)

### Products (50+ items across 8 categories)

| Category | Items | Organic Options |
|----------|-------|-----------------|
| **Fruits** | 10 items | 5 organic |
| **Vegetables** | 10 items | 5 organic |
| **Dairy** | 7 items | 4 organic |
| **Meat & Seafood** | 6 items | 3 organic |
| **Pantry** | 10 items | 5 organic |
| **Beverages** | 6 items | 3 organic |
| **Frozen** | 5 items | 2 organic |
| **Snacks** | 5 items | 3 organic |

### Shopping Events
- **Realistic dates**: Distributed over specified time period (default: 90 days)
- **Smart quantities**: Appropriate amounts based on item type
- **Category-based pricing**: Realistic price ranges per category
- **Organic premiums**: 20-50% higher prices for organic items
- **Random notes**: 30% of events include descriptive notes
- **Varied trip sizes**: 2-8 items per shopping trip
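
The rules above could be combined roughly as follows. This is a sketch under stated assumptions: the price ranges, the note texts, and the names `PRICE_RANGES`, `NOTES`, `item_price`, and `make_event` are all hypothetical, and the real script may structure its data differently. Only the 20-50% premium, the 30% note rate, and the 2-8 item trip size come from the list above.

```python
import random

# Hypothetical per-category price ranges in dollars; the real ranges may differ.
PRICE_RANGES = {
    "Fruits": (0.50, 6.00),
    "Dairy": (1.50, 8.00),
    "Meat & Seafood": (5.00, 25.00),
}

# Placeholder note texts, invented for this example.
NOTES = ["Weekly shop", "Stocking up", "Quick trip"]


def item_price(category: str, organic: bool, rng: random.Random) -> float:
    """Pick a price in the category range, with a 20-50% premium if organic."""
    low, high = PRICE_RANGES[category]
    price = rng.uniform(low, high)
    if organic:
        price *= rng.uniform(1.20, 1.50)
    return round(price, 2)


def make_event(products: list, rng: random.Random) -> dict:
    """Build one shopping event with 2-8 distinct items and an optional note."""
    count = min(rng.randint(2, 8), len(products))
    items = [
        {"name": p["name"], "price": item_price(p["category"], p["organic"], rng)}
        for p in rng.sample(products, count)
    ]
    # Roughly 30% of events carry a note.
    note = rng.choice(NOTES) if rng.random() < 0.30 else None
    return {"items": items, "notes": note}
```

Passing a seeded `random.Random` instance keeps the generated data reproducible between runs, which can make test failures easier to track down.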

## Features

### Smart Data Generation
- **Realistic pricing**: Category-based price ranges with organic premiums
- **Appropriate quantities**: Items sold by piece, weight, or volume as appropriate
- **Temporal distribution**: Events spread realistically over time
- **Shopping patterns**: Varied trip sizes and frequencies
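
As an illustration of temporal distribution, the sketch below draws event timestamps from the last `days` days. It assumes a uniform spread, which may be simpler than what the script actually does; `random_event_dates` is a hypothetical name.

```python
import random
from datetime import datetime, timedelta


def random_event_dates(count, days, now, rng):
    """Return `count` datetimes drawn uniformly from the last `days` days, oldest first."""
    window_seconds = days * 24 * 60 * 60
    dates = [
        now - timedelta(seconds=rng.randrange(window_seconds))
        for _ in range(count)
    ]
    return sorted(dates)


# Example: 30 events over the last 90 days, matching the script's defaults.
dates = random_event_dates(30, 90, datetime.now(), random.Random())
print(dates[0], dates[-1])
```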

### Error Handling
- **Graceful failures**: Script continues even if some items fail
- **Network timeouts**: Reasonable timeout values for API calls
- **Progress tracking**: Clear feedback on creation progress
- **Connection testing**: Verifies API availability before starting
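
A connection check with a timeout might look like the sketch below. It uses the standard library's `urllib` to stay dependency-free, whereas the script itself appears to rely on `requests`. The function name `api_is_reachable`, the 5-second timeout, and the injectable `opener` parameter are assumptions made for this example.

```python
import urllib.error
import urllib.request


def api_is_reachable(base_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a GET to base_url gets any HTTP response within `timeout` seconds."""
    try:
        with opener(base_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status, so it is still reachable.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False
```

Calling `api_is_reachable("http://localhost:8000")` before creating any data lets the script stop early with a clear message instead of failing on the first request.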

### Flexible Options
- **Partial creation**: Create only specific data types
- **Custom parameters**: Adjust event count and date range
- **Dry run mode**: Preview without creating data
- **Verbose output**: Detailed progress information
- **Custom API URLs**: Support for different backend configurations

## Troubleshooting

### Common Issues

1. **Connection Error**
   ```
   ❌ Cannot connect to the API server at http://localhost:8000
   ```
   **Solution**: Make sure the backend server is running:
   ```bash
   cd backend
   uvicorn main:app --reload
   ```

2. **Module Not Found Error**
   ```
   ModuleNotFoundError: No module named 'requests'
   ```
   **Solution**: Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

3. **Database Connection Error**
   **Solution**: Ensure PostgreSQL is running and the database exists. On macOS with Homebrew:
   ```bash
   # Check if PostgreSQL is running
   brew services list | grep postgresql

   # Start PostgreSQL if needed
   brew services start postgresql
   ```

4. **Partial Data Creation**
   If some items fail to create, the script will continue and report the final count.
   Use `--verbose` to see detailed error messages.
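
The continue-on-failure behaviour described in item 4 can be sketched as a loop that records failures instead of raising. The names `create_all` and `create_one` are hypothetical; `create_one` stands in for whatever function sends a single item to the API.

```python
def create_all(items, create_one, verbose=False):
    """Try to create every item; collect failures instead of stopping at the first one."""
    created = 0
    failures = []
    for item in items:
        try:
            create_one(item)
            created += 1
        except Exception as exc:  # broad on purpose: one bad item must not abort the run
            failures.append((item, exc))
            if verbose:
                print(f"Failed to create {item!r}: {exc}")
    print(f"Created {created}/{len(items)} items")
    return created, failures
```

Returning the failures alongside the count lets a caller decide whether a partial result is acceptable or whether to retry just the failed items.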

### Performance Tips

- **Large datasets**: For creating many events (100+), consider running in smaller batches
- **Network issues**: Use `--verbose` to identify specific failures
- **Database performance**: Ensure your database has adequate resources for bulk operations

## Data Cleanup

To remove all test data:

```bash
python cleanup_test_data.py
```

This script will:
1. Show what data exists
2. Ask for confirmation before deletion
3. Handle foreign key constraints properly
4. Provide progress feedback
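
The four steps above can be sketched as follows. This is not the code of `cleanup_test_data.py`: the deletion order assumes that shopping events reference shops and products through foreign keys, and the prompt wording and the names `DELETION_ORDER`, `confirm`, `cleanup`, and `delete_all` are hypothetical.

```python
# Assumption: shopping events reference shops and products, so events are
# deleted first and the records they point to afterwards.
DELETION_ORDER = ["shopping events", "products", "shops"]


def confirm(counts, ask=input):
    """Show what exists and return True only on an explicit 'yes'."""
    for kind in DELETION_ORDER:
        print(f"{kind}: {counts.get(kind, 0)}")
    answer = ask("Delete all of the above? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


def cleanup(counts, delete_all, ask=input):
    """Delete each kind of record in dependency order after confirmation."""
    if not confirm(counts, ask):
        print("Aborted, nothing deleted")
        return False
    for kind in DELETION_ORDER:
        delete_all(kind)
        print(f"Deleted {kind}")
    return True
```

Deleting child records before their parents avoids foreign key violations without having to rely on cascading deletes in the database.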

## Integration with Application

After running the test data scripts:

1. **Frontend**: Refresh the application to see new data
2. **API**: Endpoints for shops, products, and shopping events return the test data
3. **Database**: Data is persisted and will survive server restarts

The test data is designed to showcase all application features:
- Multiple shops and locations
- Diverse product categories
- Realistic shopping patterns
- Price variations and organic options
- Historical data for analytics

## Advanced Usage

### Custom Data Sets

You can modify the data arrays in `create_test_data.py` to create custom test scenarios:

- Add more shops for specific regions
- Include specialty product categories
- Adjust price ranges for different markets
- Create seasonal shopping patterns
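
As a rough illustration of such changes, the sketch below assumes the data arrays are lists of dicts and a dict of price ranges. The variable names `SHOPS` and `PRICE_RANGES`, the field names, the price values, and the helper functions are all hypothetical; check `create_test_data.py` for the actual names and shapes before adapting it.

```python
# Hypothetical shapes; the real variable names and fields may differ.
SHOPS = [
    {"name": "Whole Foods Market", "city": "San Francisco"},
    {"name": "Berkeley Bowl", "city": "Berkeley"},
]
PRICE_RANGES = {"Fruits": (0.50, 6.00), "Dairy": (1.50, 8.00)}


def add_shops(shops, new_shops):
    """Return a new list with new_shops appended, skipping names already present."""
    known = {shop["name"] for shop in shops}
    return shops + [s for s in new_shops if s["name"] not in known]


def scale_price_ranges(ranges, factor):
    """Return price ranges multiplied by factor, e.g. 1.15 for a pricier market."""
    return {
        category: (round(low * factor, 2), round(high * factor, 2))
        for category, (low, high) in ranges.items()
    }


# "Example Market" is a made-up shop used only for illustration.
shops = add_shops(SHOPS, [{"name": "Example Market", "city": "Sacramento"}])
pricier = scale_price_ranges(PRICE_RANGES, 1.15)
```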

### Automated Testing

The scripts can be integrated into automated testing workflows:

```bash
# Setup test environment
python create_test_data.py --events 10 --days 30

# Run tests
pytest

# Cleanup (the script prompts for confirmation; --force is a hypothetical
# flag you would need to add for unattended runs)
python cleanup_test_data.py --force
```
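
The same workflow can be driven from Python so that cleanup is attempted even when tests fail. This is a sketch: the function name `run_with_test_data` and the injectable `run` parameter are hypothetical, and it assumes the scripts are in the current working directory.

```python
import subprocess
import sys


def run_with_test_data(run=subprocess.run):
    """Seed data, run pytest, and always attempt cleanup; return pytest's exit code."""
    py = sys.executable
    # Stop immediately if seeding fails, since the tests would be meaningless.
    run([py, "create_test_data.py", "--events", "10", "--days", "30"], check=True)
    try:
        result = run([py, "-m", "pytest"], check=False)
        return result.returncode
    finally:
        # cleanup_test_data.py asks for confirmation, so this step waits for
        # input unless a non-interactive option is added to the script.
        run([py, "cleanup_test_data.py"], check=False)
```

The `try`/`finally` block is the point of the sketch: the cleanup command runs whether pytest passes, fails, or raises.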