Skip to main content

Geofence Processing Update

Issue with Confidence Column

During testing, we discovered an issue with the match_geofences() function. The function references a confidence column in the location_history materialized view, but this column doesn't exist in the view.

The location_history materialized view is populated from location_history_hourly and location_history_daily tables, which are TimescaleDB continuous aggregates. The location_history_hourly continuous aggregate joins the location_reports table with the trackers table on the hashed_advertisement_key, and it filters out location reports with confidence > 100.

However, the confidence column itself is not included in the materialized view, which causes an error when the match_geofences() function tries to reference it.

Solution

We've updated the match_geofences() function to remove the references to the confidence column. Since the continuous aggregate already filters out location reports with confidence > 100, we don't need to check this condition again in the match_geofences() function.

Here's the updated function:

CREATE OR REPLACE FUNCTION match_geofences() RETURNS void AS $$
DECLARE
v_tracker_id INTEGER;
v_location_id INTEGER;
v_timestamp TIMESTAMP;
v_production_run_id INTEGER;
v_time_bucket TIMESTAMP;
v_storage_location_id INTEGER;
v_delivery_location_id INTEGER;
v_status_id INTEGER;
v_status TEXT;
v_time_in_geofence INTERVAL;
v_threshold INTERVAL := INTERVAL '12 hours';
v_current_status TEXT;
v_current_state_start TIMESTAMP;
v_user_id INTEGER;
BEGIN
-- Create a temporary table to track which location history rows have been processed
CREATE TEMP TABLE IF NOT EXISTS location_processing_status (
location_id INTEGER PRIMARY KEY,
processed BOOLEAN DEFAULT FALSE
);

-- Get all trackers with location history in the last 24 hours
FOR v_tracker_id IN
SELECT DISTINCT tracker_id
FROM location_history
WHERE timestamp > NOW() - INTERVAL '24 hours'
LOOP
-- Get the production run ID for this tracker
SELECT production_run_id INTO v_production_run_id
FROM trackers
WHERE id = v_tracker_id;

-- Get the current status of the tracker
SELECT current_status, current_state_start
INTO v_current_status, v_current_state_start
FROM trackers
WHERE id = v_tracker_id;

-- Process each hour of data
FOR v_time_bucket IN
SELECT date_trunc('hour', timestamp) as hour
FROM location_history
WHERE tracker_id = v_tracker_id
AND timestamp > NOW() - INTERVAL '24 hours'
GROUP BY hour
ORDER BY hour
LOOP
-- Find the location with the longest time in a storage location geofence
SELECT
lh.id,
sl.id,
lh.timestamp,
(
SELECT MAX(timestamp) - MIN(timestamp)
FROM location_history lh2
WHERE lh2.tracker_id = lh.tracker_id
AND ST_DWithin(
lh2.location,
sl.location,
200 -- 200 meters radius
)
AND lh2.timestamp <= lh.timestamp
AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
) AS time_in_geofence
INTO v_location_id, v_storage_location_id, v_timestamp, v_time_in_geofence
FROM location_history lh
JOIN storage_locations sl ON ST_DWithin(lh.location, sl.location, 200)
WHERE lh.tracker_id = v_tracker_id
AND lh.timestamp >= v_time_bucket
AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
AND (sl.production_run_id = v_production_run_id OR sl.production_run_id IS NULL)
ORDER BY time_in_geofence DESC
LIMIT 1;

-- If we found a storage location and the time in geofence is >= threshold
IF v_storage_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
-- Mark this location as processed
INSERT INTO location_processing_status (location_id, processed)
VALUES (v_location_id, TRUE)
ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;

-- If the current status is not 'in_storage' or the storage location has changed
IF v_current_status IS DISTINCT FROM 'in_storage' OR v_current_state_start IS NULL THEN
-- Update the tracker status
UPDATE trackers
SET current_status = 'in_storage',
current_state_start = v_timestamp
WHERE id = v_tracker_id;

-- Add a status history entry
INSERT INTO status_history (
tracker_id,
timestamp,
status,
storage_location_id
) VALUES (
v_tracker_id,
v_timestamp,
'in_storage',
v_storage_location_id
) RETURNING id INTO v_status_id;

-- Notify users about the status change
FOR v_user_id IN
SELECT u.id
FROM users u
JOIN user_clients uc ON u.id = uc.user_id
JOIN trackers t ON uc.client_id = t.client_id
WHERE t.id = v_tracker_id
LOOP
INSERT INTO notifications (
user_id,
tracker_id,
status_history_id,
message,
status,
read,
created_at
) VALUES (
v_user_id,
v_tracker_id,
v_status_id,
'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' is now in storage at ' || (SELECT name FROM storage_locations WHERE id = v_storage_location_id),
'in_storage',
FALSE,
NOW()
);
END LOOP;
END IF;

-- Skip checking delivery locations if we already found a storage location
CONTINUE;
END IF;

-- Find the location with the longest time in a delivery location geofence
SELECT
lh.id,
dl.id,
lh.timestamp,
(
SELECT MAX(timestamp) - MIN(timestamp)
FROM location_history lh2
WHERE lh2.tracker_id = lh.tracker_id
AND ST_DWithin(
lh2.location,
dl.location,
200 -- 200 meters radius
)
AND lh2.timestamp <= lh.timestamp
AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
) AS time_in_geofence
INTO v_location_id, v_delivery_location_id, v_timestamp, v_time_in_geofence
FROM location_history lh
JOIN delivery_locations dl ON ST_DWithin(lh.location, dl.location, 200)
WHERE lh.tracker_id = v_tracker_id
AND lh.timestamp >= v_time_bucket
AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
AND (dl.production_run_id = v_production_run_id OR dl.production_run_id IS NULL)
ORDER BY time_in_geofence DESC
LIMIT 1;

-- If we found a delivery location and the time in geofence is >= threshold
IF v_delivery_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
-- Mark this location as processed
INSERT INTO location_processing_status (location_id, processed)
VALUES (v_location_id, TRUE)
ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;

-- If the current status is not 'delivered' or the delivery location has changed
IF v_current_status IS DISTINCT FROM 'delivered' OR v_current_state_start IS NULL THEN
-- Update the tracker status
UPDATE trackers
SET current_status = 'delivered',
current_state_start = v_timestamp
WHERE id = v_tracker_id;

-- Add a status history entry
INSERT INTO status_history (
tracker_id,
timestamp,
status,
delivery_location_id
) VALUES (
v_tracker_id,
v_timestamp,
'delivered',
v_delivery_location_id
) RETURNING id INTO v_status_id;

-- Notify users about the status change
FOR v_user_id IN
SELECT u.id
FROM users u
JOIN user_clients uc ON u.id = uc.user_id
JOIN trackers t ON uc.client_id = t.client_id
WHERE t.id = v_tracker_id
LOOP
INSERT INTO notifications (
user_id,
tracker_id,
status_history_id,
message,
status,
read,
created_at
) VALUES (
v_user_id,
v_tracker_id,
v_status_id,
'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' has been delivered to ' || (SELECT name FROM delivery_locations WHERE id = v_delivery_location_id),
'delivered',
FALSE,
NOW()
);
END LOOP;
END IF;
END IF;

-- Reset variables for the next iteration
v_storage_location_id := NULL;
v_delivery_location_id := NULL;
v_time_in_geofence := NULL;
END LOOP;
END LOOP;
END;
$$ LANGUAGE plpgsql;

Testing

We've created several test scripts to verify the geofence processing system:

  1. sql/fix_match_geofences.sql: Contains the updated match_geofences() function.
  2. sql/simple_test.sql: A simple test script that creates a few location reports.
  3. sql/comprehensive_test.sql: A more comprehensive test script that creates test storage and delivery locations, and location reports for a tracker.
  4. sql/trigger_status_change.sql: A test script that creates location reports that should trigger a status change.

To run the tests, use the following commands:

# Update the match_geofences() function
psql -U postgres -d tracker -f sql/fix_match_geofences.sql

# Run the simple test
psql -U postgres -d tracker -f sql/simple_test.sql

# Run the comprehensive test
psql -U postgres -d tracker -f sql/comprehensive_test.sql

# Run the trigger status change test
psql -U postgres -d tracker -f sql/trigger_status_change.sql

Materialized View Structure

The location_history materialized view is populated from location_history_hourly and location_history_daily tables, which are TimescaleDB continuous aggregates. The location_history_hourly continuous aggregate joins the location_reports table with the trackers table on the hashed_advertisement_key, and it filters out location reports with confidence > 100.

Here's the structure of the location_history materialized view:

SELECT * FROM location_history LIMIT 1;
   id    |         timestamp          |                      location                      | nearest_city | tracker_id
---------+----------------------------+----------------------------------------------------+-------------+-----------
1 | 2025-02-26 14:00:00 | 0101000020E61000002A1F82AAD1CBF7BFF03A76AB42394A40 | | 1

The location_history materialized view has the following columns:

  • id: A unique identifier for the location history record.
  • timestamp: The timestamp of the location report.
  • location: The PostGIS geometry representing the location.
  • nearest_city: The nearest city to the location (may be NULL).
  • tracker_id: The ID of the tracker.

Note that the confidence column from the location_reports table is not included in the materialized view.

Continuous Aggregate Definition

The location_history_hourly continuous aggregate is defined as follows:

SELECT time_bucket('01:00:00'::interval, lr."timestamp") AS bucket,
tr.id AS tracker_id,
st_centroid(st_collect(lr.location)) AS location,
(array_agg(lr."timestamp" ORDER BY lr."timestamp" DESC))[1] AS latest_timestamp,
(array_agg(lr.nearest_city ORDER BY lr."timestamp" DESC))[1] AS nearest_city
FROM location_reports lr
JOIN trackers tr ON lr.hashed_adv_key = tr.hashed_advertisement_key
WHERE lr.confidence <= 100
GROUP BY time_bucket('01:00:00'::interval, lr."timestamp"), tr.id;

This continuous aggregate:

  1. Groups location reports by hour and tracker ID.
  2. Calculates the centroid of all locations in each hour.
  3. Gets the latest timestamp and nearest city in each hour.
  4. Filters out location reports with confidence > 100.

The location_history_daily continuous aggregate is defined as follows:

SELECT time_bucket('1 day'::interval, bucket) AS day,
tracker_id,
st_centroid(st_collect(location)) AS location,
max(latest_timestamp) AS latest_timestamp,
(array_agg(nearest_city ORDER BY bucket DESC))[1] AS nearest_city
FROM location_history_hourly
GROUP BY time_bucket('1 day'::interval, bucket), tracker_id;

This continuous aggregate:

  1. Groups hourly data by day and tracker ID.
  2. Calculates the centroid of all locations in each day.
  3. Gets the latest timestamp and nearest city in each day.

Conclusion

By removing the references to the confidence column in the match_geofences() function, we've fixed the issue with the geofence processing system. The continuous aggregate already filters out location reports with confidence > 100, so we don't need to check this condition again in the match_geofences() function.