Geofence Processing Update
Issue with Confidence Column
During testing, we discovered an issue with the match_geofences() function. The function references a confidence column in the location_history materialized view, but this column doesn't exist in the view.
The location_history materialized view is populated from location_history_hourly and location_history_daily tables, which are TimescaleDB continuous aggregates. The location_history_hourly continuous aggregate joins the location_reports table with the trackers table on the hashed_advertisement_key, and it filters out location reports with confidence > 100.
However, the confidence column itself is not included in the materialized view, which causes an error when the match_geofences() function tries to reference it.
Solution
We've updated the match_geofences() function to remove the references to the confidence column. Since the continuous aggregate already filters out location reports with confidence > 100, we don't need to check this condition again in the match_geofences() function.
Here's the updated function:
CREATE OR REPLACE FUNCTION match_geofences() RETURNS void AS $$
DECLARE
v_tracker_id INTEGER;
v_location_id INTEGER;
v_timestamp TIMESTAMP;
v_production_run_id INTEGER;
v_time_bucket TIMESTAMP;
v_storage_location_id INTEGER;
v_delivery_location_id INTEGER;
v_status_id INTEGER;
v_status TEXT;
v_time_in_geofence INTERVAL;
v_threshold INTERVAL := INTERVAL '12 hours';
v_current_status TEXT;
v_current_state_start TIMESTAMP;
v_user_id INTEGER;
BEGIN
-- Create a temporary table to track which location history rows have been processed
CREATE TEMP TABLE IF NOT EXISTS location_processing_status (
location_id INTEGER PRIMARY KEY,
processed BOOLEAN DEFAULT FALSE
);
-- Get all trackers with location history in the last 24 hours
FOR v_tracker_id IN
SELECT DISTINCT tracker_id
FROM location_history
WHERE timestamp > NOW() - INTERVAL '24 hours'
LOOP
-- Get the production run ID for this tracker
SELECT production_run_id INTO v_production_run_id
FROM trackers
WHERE id = v_tracker_id;
-- Get the current status of the tracker
SELECT current_status, current_state_start
INTO v_current_status, v_current_state_start
FROM trackers
WHERE id = v_tracker_id;
-- Process each hour of data
FOR v_time_bucket IN
SELECT date_trunc('hour', timestamp) as hour
FROM location_history
WHERE tracker_id = v_tracker_id
AND timestamp > NOW() - INTERVAL '24 hours'
GROUP BY hour
ORDER BY hour
LOOP
-- Find the location with the longest time in a storage location geofence
SELECT
lh.id,
sl.id,
lh.timestamp,
(
SELECT MAX(timestamp) - MIN(timestamp)
FROM location_history lh2
WHERE lh2.tracker_id = lh.tracker_id
AND ST_DWithin(
lh2.location,
sl.location,
200 -- 200 meters radius
)
AND lh2.timestamp <= lh.timestamp
AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
) AS time_in_geofence
INTO v_location_id, v_storage_location_id, v_timestamp, v_time_in_geofence
FROM location_history lh
JOIN storage_locations sl ON ST_DWithin(lh.location, sl.location, 200)
WHERE lh.tracker_id = v_tracker_id
AND lh.timestamp >= v_time_bucket
AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
AND (sl.production_run_id = v_production_run_id OR sl.production_run_id IS NULL)
ORDER BY time_in_geofence DESC
LIMIT 1;
-- If we found a storage location and the time in geofence is >= threshold
IF v_storage_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
-- Mark this location as processed
INSERT INTO location_processing_status (location_id, processed)
VALUES (v_location_id, TRUE)
ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;
-- If the current status is not 'in_storage' or the storage location has changed
IF v_current_status IS DISTINCT FROM 'in_storage' OR v_current_state_start IS NULL THEN
-- Update the tracker status
UPDATE trackers
SET current_status = 'in_storage',
current_state_start = v_timestamp
WHERE id = v_tracker_id;
-- Add a status history entry
INSERT INTO status_history (
tracker_id,
timestamp,
status,
storage_location_id
) VALUES (
v_tracker_id,
v_timestamp,
'in_storage',
v_storage_location_id
) RETURNING id INTO v_status_id;
-- Notify users about the status change
FOR v_user_id IN
SELECT u.id
FROM users u
JOIN user_clients uc ON u.id = uc.user_id
JOIN trackers t ON uc.client_id = t.client_id
WHERE t.id = v_tracker_id
LOOP
INSERT INTO notifications (
user_id,
tracker_id,
status_history_id,
message,
status,
read,
created_at
) VALUES (
v_user_id,
v_tracker_id,
v_status_id,
'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' is now in storage at ' || (SELECT name FROM storage_locations WHERE id = v_storage_location_id),
'in_storage',
FALSE,
NOW()
);
END LOOP;
END IF;
-- Skip checking delivery locations if we already found a storage location
CONTINUE;
END IF;
-- Find the location with the longest time in a delivery location geofence
SELECT
lh.id,
dl.id,
lh.timestamp,
(
SELECT MAX(timestamp) - MIN(timestamp)
FROM location_history lh2
WHERE lh2.tracker_id = lh.tracker_id
AND ST_DWithin(
lh2.location,
dl.location,
200 -- 200 meters radius
)
AND lh2.timestamp <= lh.timestamp
AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
) AS time_in_geofence
INTO v_location_id, v_delivery_location_id, v_timestamp, v_time_in_geofence
FROM location_history lh
JOIN delivery_locations dl ON ST_DWithin(lh.location, dl.location, 200)
WHERE lh.tracker_id = v_tracker_id
AND lh.timestamp >= v_time_bucket
AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
AND (dl.production_run_id = v_production_run_id OR dl.production_run_id IS NULL)
ORDER BY time_in_geofence DESC
LIMIT 1;
-- If we found a delivery location and the time in geofence is >= threshold
IF v_delivery_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
-- Mark this location as processed
INSERT INTO location_processing_status (location_id, processed)
VALUES (v_location_id, TRUE)
ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;
-- If the current status is not 'delivered' or the delivery location has changed
IF v_current_status IS DISTINCT FROM 'delivered' OR v_current_state_start IS NULL THEN
-- Update the tracker status
UPDATE trackers
SET current_status = 'delivered',
current_state_start = v_timestamp
WHERE id = v_tracker_id;
-- Add a status history entry
INSERT INTO status_history (
tracker_id,
timestamp,
status,
delivery_location_id
) VALUES (
v_tracker_id,
v_timestamp,
'delivered',
v_delivery_location_id
) RETURNING id INTO v_status_id;
-- Notify users about the status change
FOR v_user_id IN
SELECT u.id
FROM users u
JOIN user_clients uc ON u.id = uc.user_id
JOIN trackers t ON uc.client_id = t.client_id
WHERE t.id = v_tracker_id
LOOP
INSERT INTO notifications (
user_id,
tracker_id,
status_history_id,
message,
status,
read,
created_at
) VALUES (
v_user_id,
v_tracker_id,
v_status_id,
'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' has been delivered to ' || (SELECT name FROM delivery_locations WHERE id = v_delivery_location_id),
'delivered',
FALSE,
NOW()
);
END LOOP;
END IF;
END IF;
-- Reset variables for the next iteration
v_storage_location_id := NULL;
v_delivery_location_id := NULL;
v_time_in_geofence := NULL;
END LOOP;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Testing
We've created several test scripts to verify the geofence processing system:
sql/fix_match_geofences.sql: Contains the updatedmatch_geofences()function.sql/simple_test.sql: A simple test script that creates a few location reports.sql/comprehensive_test.sql: A more comprehensive test script that creates test storage and delivery locations, and location reports for a tracker.sql/trigger_status_change.sql: A test script that creates location reports that should trigger a status change.
To run the tests, use the following commands:
# Update the match_geofences() function
psql -U postgres -d tracker -f sql/fix_match_geofences.sql
# Run the simple test
psql -U postgres -d tracker -f sql/simple_test.sql
# Run the comprehensive test
psql -U postgres -d tracker -f sql/comprehensive_test.sql
# Run the trigger status change test
psql -U postgres -d tracker -f sql/trigger_status_change.sql
Materialized View Structure
The location_history materialized view is populated from location_history_hourly and location_history_daily tables, which are TimescaleDB continuous aggregates. The location_history_hourly continuous aggregate joins the location_reports table with the trackers table on the hashed_advertisement_key, and it filters out location reports with confidence > 100.
Here's the structure of the location_history materialized view:
SELECT * FROM location_history LIMIT 1;
id | timestamp | location | nearest_city | tracker_id
---------+----------------------------+----------------------------------------------------+-------------+-----------
1 | 2025-02-26 14:00:00 | 0101000020E61000002A1F82AAD1CBF7BFF03A76AB42394A40 | | 1
The location_history materialized view has the following columns:
id: A unique identifier for the location history record.timestamp: The timestamp of the location report.location: The PostGIS geometry representing the location.nearest_city: The nearest city to the location (may be NULL).tracker_id: The ID of the tracker.
Note that the confidence column from the location_reports table is not included in the materialized view.
Continuous Aggregate Definition
The location_history_hourly continuous aggregate is defined as follows:
SELECT time_bucket('01:00:00'::interval, lr."timestamp") AS bucket,
tr.id AS tracker_id,
st_centroid(st_collect(lr.location)) AS location,
(array_agg(lr."timestamp" ORDER BY lr."timestamp" DESC))[1] AS latest_timestamp,
(array_agg(lr.nearest_city ORDER BY lr."timestamp" DESC))[1] AS nearest_city
FROM location_reports lr
JOIN trackers tr ON lr.hashed_adv_key = tr.hashed_advertisement_key
WHERE lr.confidence <= 100
GROUP BY time_bucket('01:00:00'::interval, lr."timestamp"), tr.id;
This continuous aggregate:
- Groups location reports by hour and tracker ID.
- Calculates the centroid of all locations in each hour.
- Gets the latest timestamp and nearest city in each hour.
- Filters out location reports with confidence > 100.
The location_history_daily continuous aggregate is defined as follows:
SELECT time_bucket('1 day'::interval, bucket) AS day,
tracker_id,
st_centroid(st_collect(location)) AS location,
max(latest_timestamp) AS latest_timestamp,
(array_agg(nearest_city ORDER BY bucket DESC))[1] AS nearest_city
FROM location_history_hourly
GROUP BY time_bucket('1 day'::interval, bucket), tracker_id;
This continuous aggregate:
- Groups hourly data by day and tracker ID.
- Calculates the centroid of all locations in each day.
- Gets the latest timestamp and nearest city in each day.
Conclusion
By removing the references to the confidence column in the match_geofences() function, we've fixed the issue with the geofence processing system. The continuous aggregate already filters out location reports with confidence > 100, so we don't need to check this condition again in the match_geofences() function.