Mock Location Data

This guide explains how to properly generate mock location data for testing and development purposes, following the system's data flow architecture.

Data Flow Architecture

The location tracking system uses a TimescaleDB-based architecture with the following components:

Components

  1. Raw Data Collection: location_reports table stores the raw location data with the following key fields:

    • hashed_adv_key: Links to a tracker's hashed_advertisement_key
    • timestamp: When the location was recorded
    • location: PostGIS geometry point (longitude, latitude)
    • confidence: Accuracy confidence level (0-100)
    • nearest_city: Optional geocoded city name
  2. TimescaleDB Continuous Aggregates:

    • location_history_hourly: Aggregates location data in 1-hour buckets
    • location_history_daily: Aggregates location data in 1-day buckets
  3. Materialized View:

    • location_history: A materialized view that combines data from the hourly and daily aggregates
    • Automatically refreshed through triggers when location_reports is updated
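
To confirm that these components exist in a given database, a quick inspection query can help. This is a minimal sketch that assumes the object names listed above and a TimescaleDB 2.x installation, where the timescaledb_information.continuous_aggregates view is available:

```sql
-- List the continuous aggregates that TimescaleDB knows about.
SELECT view_name, materialization_hypertable_name
FROM timescaledb_information.continuous_aggregates
WHERE view_name IN ('location_history_hourly', 'location_history_daily');

-- Confirm that location_history is a regular PostgreSQL materialized view
-- and whether it has been populated yet.
SELECT schemaname, matviewname, ispopulated
FROM pg_matviews
WHERE matviewname = 'location_history';
```

If either query returns no rows, the schema probably differs from what this guide describes, and the mock data scripts may need adjusting.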

Mock Data Scripts

The proper way to generate mock location data is to insert it into the location_reports table and let the TimescaleDB continuous aggregates and triggers handle the population of the location_history materialized view.

For Trackers Without History

We've created a script sql/mock_location_reports_europe.sql that:

  1. Creates a set of European city coordinates
  2. Finds trackers that don't have any location history
  3. Assigns random European locations to these trackers
  4. Inserts the data into location_reports with the proper fields
  5. Includes verification queries to check both the location_reports and location_history tables
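
The steps above can be sketched roughly as follows. This is not the contents of the actual script, only an illustration under stated assumptions: the three cities, the confidence value of 90, and SRID 4326 are made up for the example, and the column names trackers.id, trackers.hashed_advertisement_key, and location_history.tracker_id are inferred from the queries in this guide.

```sql
WITH cities (idx, name, lon, lat) AS (
    -- Step 1: a small, illustrative set of European city coordinates.
    VALUES (1, 'Berlin', 13.4050, 52.5200),
           (2, 'Paris',   2.3522, 48.8566),
           (3, 'Madrid', -3.7038, 40.4168)
),
picked AS (
    -- Steps 2 and 3: trackers without history, each given a random city index.
    -- random() is evaluated once per tracker row; 3 is the number of cities.
    SELECT t.hashed_advertisement_key,
           1 + floor(random() * 3)::int AS city_idx
    FROM trackers t
    WHERE NOT EXISTS (
        SELECT 1 FROM location_history lh WHERE lh.tracker_id = t.id
    )
)
-- Step 4: insert into the source table, not into location_history.
INSERT INTO location_reports (hashed_adv_key, timestamp, location, confidence, nearest_city)
SELECT p.hashed_advertisement_key,
       NOW(),
       ST_SetSRID(ST_MakePoint(c.lon, c.lat), 4326),  -- longitude first, then latitude
       90,                                            -- assumed confidence value
       c.name
FROM picked p
JOIN cities c ON c.idx = p.city_idx;
```

Computing the random index in its own CTE keeps the choice per tracker; drawing the random city inside an uncorrelated subquery can end up assigning the same city to every tracker.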

Example usage:

-- Run the script to insert mock data for trackers without history
\i sql/mock_location_reports_europe.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name AS tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) AS longitude,
    ST_Y(lh.location::geometry) AS latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY lh.timestamp DESC
LIMIT 5;

For All Trackers

If you want to generate mock data for all trackers, regardless of whether they already have location history, you can use the sql/mock_location_reports_europe_all_trackers.sql script:

-- Run the script to insert mock data for all trackers
\i sql/mock_location_reports_europe_all_trackers.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name AS tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) AS longitude,
    ST_Y(lh.location::geometry) AS latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY lh.timestamp DESC
LIMIT 10;

This script works similarly to the one for trackers without history, but it doesn't filter trackers based on existing location history.
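
Because this variant touches every tracker, it can be useful to check coverage in the source table as well. The query below is a hedged sketch that assumes location_reports.hashed_adv_key matches trackers.hashed_advertisement_key, as described in the Components section:

```sql
-- Count reports per tracker; trackers that still have none sort first.
SELECT
    t.name AS tracker_name,
    COUNT(lr.hashed_adv_key) AS report_count,
    MAX(lr.timestamp) AS latest_report
FROM trackers t
LEFT JOIN location_reports lr
    ON lr.hashed_adv_key = t.hashed_advertisement_key
GROUP BY t.id, t.name
ORDER BY report_count ASC, t.name;
```

After the script has run, no tracker should show a report_count of 0.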

For All Trackers with Movement History

For more realistic testing, especially for features like status changes and geofence detection, you can use the sql/mock_location_reports_europe_with_history.sql script:

-- Run the script to insert mock data with movement history
\i sql/mock_location_reports_europe_with_history.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name AS tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) AS longitude,
    ST_Y(lh.location::geometry) AS latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY t.name, lh.timestamp DESC
LIMIT 20;

This enhanced script:

  1. Creates a final destination for each tracker (a European city)
  2. Generates 12 hours of movement history leading up to that destination
  3. Spaces consecutive positions approximately 2 km apart
  4. Populates nearest_city only for the final position, so the earlier positions simulate reports that still need geocoding
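
To make the shape of that history concrete, here is a minimal sketch for a single destination. It is not taken from the script: the destination (Berlin), the one-point-per-hour spacing, and the straight approach from the south are assumptions chosen for illustration. One degree of latitude is roughly 111 km, so a step of 0.018 degrees is about 2 km.

```sql
WITH destination AS (
    -- Hypothetical final destination for one tracker.
    SELECT 13.4050::float8 AS lon, 52.5200::float8 AS lat, 'Berlin'::text AS city
)
SELECT
    NOW() - make_interval(hours => step) AS "timestamp",
    -- Each earlier step sits about 2 km further south of the destination.
    ST_SetSRID(ST_MakePoint(d.lon, d.lat - step * 0.018), 4326) AS location,
    -- Only the final position (step 0) carries a city name.
    CASE WHEN step = 0 THEN d.city END AS nearest_city
FROM destination d
CROSS JOIN generate_series(0, 12) AS step
ORDER BY "timestamp";
```

The query only selects the path. To use it as mock data, the same SELECT could feed an INSERT INTO location_reports together with a tracker's hashed_adv_key and a confidence value.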

After running this script, you can refresh the aggregates manually instead of waiting for the automatic refresh, and then trigger the geofence processing to update tracker statuses:

-- Manually refresh the continuous aggregates and materialized view
CALL refresh_continuous_aggregate('location_history_hourly',
    (NOW() - INTERVAL '1 day')::timestamp without time zone,
    (NOW() + INTERVAL '1 hour')::timestamp without time zone);
CALL refresh_continuous_aggregate('location_history_daily',
    (NOW() - INTERVAL '2 days')::timestamp without time zone,
    (NOW() + INTERVAL '1 day')::timestamp without time zone);
REFRESH MATERIALIZED VIEW CONCURRENTLY location_history;

-- Trigger status updates based on the new location data
SELECT match_geofences();
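
Note that PostgreSQL only allows REFRESH MATERIALIZED VIEW CONCURRENTLY when the view has at least one unique index. If the refresh is rejected for that reason, the following check against the standard pg_indexes view shows which indexes exist; it assumes nothing beyond the view name used in this guide:

```sql
-- List indexes on location_history; at least one should be a UNIQUE index
-- for the CONCURRENTLY option to work.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'location_history'
ORDER BY indexname;
```

If no unique index exists, dropping the CONCURRENTLY keyword still refreshes the view, at the cost of blocking reads while the refresh runs.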

Incorrect Approach: Direct Insert into location_history

The script sql/mock_location_history_europe.sql directly inserts into the location_history materialized view, bypassing the proper data flow. This approach is not recommended because:

  1. It bypasses the TimescaleDB continuous aggregates
  2. It may cause inconsistencies when the materialized view is refreshed
  3. It doesn't populate the source location_reports table
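
One way to spot data that was created this way is to look for trackers that appear in location_history but have no rows in location_reports. The query below is a sketch that assumes the join columns used elsewhere in this guide (location_history.tracker_id, trackers.id, and trackers.hashed_advertisement_key):

```sql
-- Trackers with history rows that have no source reports behind them.
SELECT
    t.name AS tracker_name,
    COUNT(*) AS history_rows
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
WHERE NOT EXISTS (
    SELECT 1
    FROM location_reports lr
    WHERE lr.hashed_adv_key = t.hashed_advertisement_key
)
GROUP BY t.name
ORDER BY t.name;
```

An empty result means every tracker in location_history is backed by at least one source report.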

Best Practices

Always insert location data into the location_reports table and let the continuous aggregates and triggers populate location_history. Following the intended data flow ensures:

  1. Data consistency across the system
  2. Proper time-series aggregation
  3. Efficient data retention policies
  4. Automatic view refreshing

Relationship with Geocoding

When inserting mock data with the nearest_city field already populated, you're simulating data that has already gone through the geocoding process. In a real-world scenario:

  1. Raw location reports are inserted without nearest_city
  2. The geocoding process updates the nearest_city field in location_reports
  3. This update triggers a refresh of the continuous aggregates
  4. The materialized view is refreshed with the updated data

For testing the geocoding process itself, you should insert data without the nearest_city field and then trigger the geocoding process separately.
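
A minimal sketch of such a test insert follows. The coordinates, the confidence value of 80, and SRID 4326 are illustrative assumptions, and the statement simply picks the tracker with the lowest id. How the geocoding process is started is not covered in this guide, so that step is left out.

```sql
-- Insert one report without nearest_city so that geocoding has work to do.
INSERT INTO location_reports (hashed_adv_key, timestamp, location, confidence)
SELECT
    t.hashed_advertisement_key,
    NOW(),
    ST_SetSRID(ST_MakePoint(2.3522, 48.8566), 4326),  -- longitude, latitude
    80                                                -- assumed confidence value
FROM trackers t
ORDER BY t.id
LIMIT 1;

-- Before and after running geocoding, list the reports that still lack a city.
SELECT hashed_adv_key, timestamp
FROM location_reports
WHERE nearest_city IS NULL
ORDER BY timestamp DESC
LIMIT 10;
```

Once geocoding has run, the inserted report should no longer appear in the second query.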