Mock Location Data

This guide explains how to generate mock location data for testing and development in a way that follows the system's data flow architecture.

Data Flow Architecture

The location tracking system uses a TimescaleDB-based architecture with the following components:

Components

Raw Data Collection: location_reports table stores the raw location data with the following key fields:
- hashed_adv_key: Links to a tracker's hashed_advertisement_key
- timestamp: When the location was recorded
- location: PostGIS geometry point (longitude, latitude)
- confidence: Accuracy confidence level (0-100)
- nearest_city: Optional geocoded city name

TimescaleDB Continuous Aggregates:
- location_history_hourly: Aggregates location data in 1-hour buckets
- location_history_daily: Aggregates location data in 1-day buckets

Materialized View:
- location_history: A materialized view that combines data from the hourly and daily aggregates
- Automatically refreshed through triggers when location_reports is updated

Mock Data Scripts

Recommended Approach: Insert into `location_reports`

The proper way to generate mock location data is to insert it into the location_reports table and let the TimescaleDB continuous aggregates and triggers handle the population of the location_history materialized view.
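
As a minimal sketch of this approach, the statement below inserts one report for a single tracker. It assumes that location accepts a point with SRID 4326 and that the columns listed above are the only ones required; the tracker name 'Demo Tracker', the confidence value, and the coordinates are placeholders, not values from the real scripts.

```sql
-- Insert one mock report for a single tracker (illustrative values only).
INSERT INTO location_reports (hashed_adv_key, "timestamp", location, confidence, nearest_city)
SELECT
    t.hashed_advertisement_key,
    NOW(),
    -- Assumed SRID 4326; ST_MakePoint takes longitude first, then latitude.
    ST_SetSRID(ST_MakePoint(2.3522, 48.8566), 4326),
    90,
    'Paris'
FROM trackers t
WHERE t.name = 'Demo Tracker';  -- placeholder tracker name
```

Because the row goes into location_reports, it reaches location_history through the same aggregates and triggers as real data.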

For Trackers Without History

We've created a script sql/mock_location_reports_europe.sql that:

- Creates a set of European city coordinates
- Finds trackers that don't have any location history
- Assigns random European locations to these trackers
- Inserts the data into location_reports with the proper fields
- Includes verification queries to check both the location_reports and location_history tables
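
The steps above could be sketched roughly as follows. This is not the content of the actual script: the three-city list, the confidence value of 90, and SRID 4326 are assumptions made for illustration.

```sql
-- Sketch only: the real script may use a different city list and values.
WITH cities AS (
    -- Number the cities so that one can be picked by a random index.
    SELECT row_number() OVER () AS idx, name, lon, lat
    FROM (VALUES
        ('Paris',   2.3522, 48.8566),
        ('Berlin', 13.4050, 52.5200),
        ('Madrid', -3.7038, 40.4168)
    ) AS c(name, lon, lat)
),
picks AS (
    -- One random city index per tracker that has no rows in location_history.
    SELECT
        t.hashed_advertisement_key,
        1 + floor(random() * (SELECT count(*) FROM cities))::int AS idx
    FROM trackers t
    WHERE NOT EXISTS (
        SELECT 1 FROM location_history lh WHERE lh.tracker_id = t.id
    )
)
INSERT INTO location_reports (hashed_adv_key, "timestamp", location, confidence, nearest_city)
SELECT
    p.hashed_advertisement_key,
    NOW(),
    ST_SetSRID(ST_MakePoint(c.lon, c.lat), 4326),  -- assumed SRID
    90,
    c.name
FROM picks p
JOIN cities c ON c.idx = p.idx;
```

Picking the index in its own CTE means random() is evaluated once per tracker row, so different trackers can land in different cities.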

Example usage:

-- Run the script to insert mock data for trackers without history
\i sql/mock_location_reports_europe.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name as tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) as longitude,
    ST_Y(lh.location::geometry) as latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY lh.timestamp DESC
LIMIT 5;

For All Trackers

To generate mock data for all trackers, regardless of whether they already have location history, use the sql/mock_location_reports_europe_all_trackers.sql script:

-- Run the script to insert mock data for all trackers
\i sql/mock_location_reports_europe_all_trackers.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name as tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) as longitude,
    ST_Y(lh.location::geometry) as latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY lh.timestamp DESC
LIMIT 10;

This script works like the one for trackers without history, except that it does not filter out trackers that already have location history.

For All Trackers with Movement History

For more realistic testing, especially for features like status changes and geofence detection, you can use the sql/mock_location_reports_europe_with_history.sql script:

-- Run the script to insert mock data with movement history
\i sql/mock_location_reports_europe_with_history.sql

-- After a short delay, verify that the data appears in location_history
SELECT
    t.name as tracker_name,
    lh.timestamp,
    ST_X(lh.location::geometry) as longitude,
    ST_Y(lh.location::geometry) as latitude,
    lh.nearest_city
FROM location_history lh
JOIN trackers t ON t.id = lh.tracker_id
ORDER BY t.name, lh.timestamp DESC
LIMIT 20;

This enhanced script:

- Creates a final destination for each tracker (a European city)
- Generates 12 hours of movement history leading up to that destination
- Places each position approximately 2 km from the next one
- Populates nearest_city only for the final position, simulating the need for geocoding
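
One way such a track could be generated is sketched below. It is an illustration under assumptions, not the script itself: it uses one point per hour, a fixed bearing so that the track is a straight line, SRID 4326, and the placeholder tracker name 'Demo Tracker'.

```sql
-- Sketch only: hourly points along a straight line that ends in Paris.
WITH dest AS (
    SELECT
        ST_SetSRID(ST_MakePoint(2.3522, 48.8566), 4326)::geography AS geog,
        'Paris' AS city
)
INSERT INTO location_reports (hashed_adv_key, "timestamp", location, confidence, nearest_city)
SELECT
    t.hashed_advertisement_key,
    -- step 0 is "now" at the destination; higher steps lie further in the past.
    NOW() - step * INTERVAL '1 hour',
    -- Each earlier step is 2 km further from the destination, to the southwest.
    ST_Project(d.geog, step * 2000.0, radians(225))::geometry,
    90,
    -- Only the final position carries a city name.
    CASE WHEN step = 0 THEN d.city END
FROM trackers t
CROSS JOIN dest d
CROSS JOIN generate_series(0, 12) AS step
WHERE t.name = 'Demo Tracker';  -- placeholder tracker name
```

ST_Project works on geography values and takes the distance in meters, which keeps the 2 km spacing independent of latitude.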

After running this script, you can refresh the aggregates and the materialized view manually, then trigger the geofence processing to update tracker statuses:

-- Manually refresh the continuous aggregates and materialized view
CALL refresh_continuous_aggregate('location_history_hourly',
    (NOW() - INTERVAL '1 day')::timestamp without time zone,
    (NOW() + INTERVAL '1 hour')::timestamp without time zone);
CALL refresh_continuous_aggregate('location_history_daily',
    (NOW() - INTERVAL '2 days')::timestamp without time zone,
    (NOW() + INTERVAL '1 day')::timestamp without time zone);
REFRESH MATERIALIZED VIEW CONCURRENTLY location_history;

-- Trigger status updates based on the new location data
SELECT match_geofences();
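
To check that the refresh has picked up the new rows, a quick comparison like the one below may help. It is a sketch that relies only on the timestamp columns already used in this guide. If location_history stores bucketed timestamps, the two values need not match exactly.

```sql
-- Compare the newest timestamp in the source table and in the view.
SELECT
    (SELECT max("timestamp") FROM location_reports) AS latest_report,
    (SELECT max("timestamp") FROM location_history) AS latest_history;
```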

Incorrect Approach: Direct Insert into `location_history`

The script sql/mock_location_history_europe.sql inserts directly into the location_history materialized view, bypassing the proper data flow. This approach is not recommended because:

- It bypasses the TimescaleDB continuous aggregates
- It may cause inconsistencies when the materialized view is refreshed, since a refresh rebuilds the view from its source data
- It doesn't populate the source location_reports table

Best Practices

Always insert location data into the location_reports table and let the TimescaleDB continuous aggregates and triggers handle the population of the location_history materialized view. This ensures:

- Data consistency across the system
- Proper time-series aggregation
- Efficient data retention policies
- Automatic view refreshing

Relationship with Geocoding

When inserting mock data with the nearest_city field already populated, you're simulating data that has already gone through the geocoding process. In a real-world scenario:

- Raw location reports are inserted without nearest_city
- The geocoding process updates the nearest_city field in location_reports
- This update triggers a refresh of the continuous aggregates
- The materialized view is refreshed with the updated data

For testing the geocoding process itself, you should insert data without the nearest_city field and then trigger the geocoding process separately.
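
A minimal sketch of that setup follows. It assumes SRID 4326 and uses the placeholder tracker name 'Demo Tracker'. How the geocoding process is started is not covered here, so the second query only lists the rows it would need to handle.

```sql
-- Insert a report with no city so the geocoder has something to process.
INSERT INTO location_reports (hashed_adv_key, "timestamp", location, confidence, nearest_city)
SELECT
    t.hashed_advertisement_key,
    NOW(),
    ST_SetSRID(ST_MakePoint(13.4050, 52.5200), 4326),  -- assumed SRID
    90,
    NULL
FROM trackers t
WHERE t.name = 'Demo Tracker';  -- placeholder tracker name

-- List the reports that still await geocoding.
SELECT
    hashed_adv_key,
    "timestamp",
    ST_X(location::geometry) AS longitude,
    ST_Y(location::geometry) AS latitude
FROM location_reports
WHERE nearest_city IS NULL
ORDER BY "timestamp" DESC;
```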
