Geofence Processing Update

Issue with Confidence Column

During testing, we discovered an issue with the match_geofences() function. The function references a confidence column in the location_history materialized view, but this column doesn't exist in the view.

The location_history materialized view is populated from location_history_hourly and location_history_daily tables, which are TimescaleDB continuous aggregates. The location_history_hourly continuous aggregate joins the location_reports table with the trackers table on the hashed_advertisement_key, and it filters out location reports with confidence > 100.

However, the confidence column itself is not included in the materialized view, which causes an error when the match_geofences() function tries to reference it.

Solution

We've updated the match_geofences() function to remove the references to the confidence column. Since the continuous aggregate already filters out location reports with confidence > 100, we don't need to check this condition again in the match_geofences() function.

Here's the updated function:

CREATE OR REPLACE FUNCTION match_geofences() RETURNS void AS $$
DECLARE
    v_tracker_id INTEGER;
    v_location_id INTEGER;
    v_timestamp TIMESTAMP;
    v_production_run_id INTEGER;
    v_time_bucket TIMESTAMP;
    v_storage_location_id INTEGER;
    v_delivery_location_id INTEGER;
    v_status_id INTEGER;
    v_status TEXT;
    v_time_in_geofence INTERVAL;
    v_threshold INTERVAL := INTERVAL '12 hours';
    v_current_status TEXT;
    v_current_state_start TIMESTAMP;
    v_user_id INTEGER;
BEGIN
    -- Create a temporary table to track which location history rows have been processed
    CREATE TEMP TABLE IF NOT EXISTS location_processing_status (
        location_id INTEGER PRIMARY KEY,
        processed BOOLEAN DEFAULT FALSE
    );

    -- Get all trackers with location history in the last 24 hours
    FOR v_tracker_id IN
        SELECT DISTINCT tracker_id
        FROM location_history
        WHERE timestamp > NOW() - INTERVAL '24 hours'
    LOOP
        -- Get the production run ID for this tracker
        SELECT production_run_id INTO v_production_run_id
        FROM trackers
        WHERE id = v_tracker_id;

        -- Get the current status of the tracker
        SELECT current_status, current_state_start
        INTO v_current_status, v_current_state_start
        FROM trackers
        WHERE id = v_tracker_id;

        -- Process each hour of data
        FOR v_time_bucket IN
            SELECT date_trunc('hour', timestamp) as hour
            FROM location_history
            WHERE tracker_id = v_tracker_id
            AND timestamp > NOW() - INTERVAL '24 hours'
            GROUP BY hour
            ORDER BY hour
        LOOP
            -- Find the location with the longest time in a storage location geofence
            SELECT
                lh.id,
                sl.id,
                lh.timestamp,
                (
                    SELECT MAX(timestamp) - MIN(timestamp)
                    FROM location_history lh2
                    WHERE lh2.tracker_id = lh.tracker_id
                    AND ST_DWithin(
                        lh2.location,
                        sl.location,
                        200 -- 200 meters radius
                    )
                    AND lh2.timestamp <= lh.timestamp
                    AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
                ) AS time_in_geofence
            INTO v_location_id, v_storage_location_id, v_timestamp, v_time_in_geofence
            FROM location_history lh
            JOIN storage_locations sl ON ST_DWithin(lh.location, sl.location, 200)
            WHERE lh.tracker_id = v_tracker_id
            AND lh.timestamp >= v_time_bucket
            AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
            AND (sl.production_run_id = v_production_run_id OR sl.production_run_id IS NULL)
            ORDER BY time_in_geofence DESC
            LIMIT 1;

            -- If we found a storage location and the time in geofence is >= threshold
            IF v_storage_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
                -- Mark this location as processed
                INSERT INTO location_processing_status (location_id, processed)
                VALUES (v_location_id, TRUE)
                ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;

                -- If the current status is not 'in_storage' or the storage location has changed
                IF v_current_status IS DISTINCT FROM 'in_storage' OR v_current_state_start IS NULL THEN
                    -- Update the tracker status
                    UPDATE trackers
                    SET current_status = 'in_storage',
                        current_state_start = v_timestamp
                    WHERE id = v_tracker_id;

                    -- Add a status history entry
                    INSERT INTO status_history (
                        tracker_id,
                        timestamp,
                        status,
                        storage_location_id
                    ) VALUES (
                        v_tracker_id,
                        v_timestamp,
                        'in_storage',
                        v_storage_location_id
                    ) RETURNING id INTO v_status_id;

                    -- Notify users about the status change
                    FOR v_user_id IN
                        SELECT u.id
                        FROM users u
                        JOIN user_clients uc ON u.id = uc.user_id
                        JOIN trackers t ON uc.client_id = t.client_id
                        WHERE t.id = v_tracker_id
                    LOOP
                        INSERT INTO notifications (
                            user_id,
                            tracker_id,
                            status_history_id,
                            message,
                            status,
                            read,
                            created_at
                        ) VALUES (
                            v_user_id,
                            v_tracker_id,
                            v_status_id,
                            'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' is now in storage at ' || (SELECT name FROM storage_locations WHERE id = v_storage_location_id),
                            'in_storage',
                            FALSE,
                            NOW()
                        );
                    END LOOP;
                END IF;

                -- Skip checking delivery locations if we already found a storage location
                CONTINUE;
            END IF;

            -- Find the location with the longest time in a delivery location geofence
            SELECT
                lh.id,
                dl.id,
                lh.timestamp,
                (
                    SELECT MAX(timestamp) - MIN(timestamp)
                    FROM location_history lh2
                    WHERE lh2.tracker_id = lh.tracker_id
                    AND ST_DWithin(
                        lh2.location,
                        dl.location,
                        200 -- 200 meters radius
                    )
                    AND lh2.timestamp <= lh.timestamp
                    AND lh2.timestamp >= lh.timestamp - INTERVAL '24 hours'
                ) AS time_in_geofence
            INTO v_location_id, v_delivery_location_id, v_timestamp, v_time_in_geofence
            FROM location_history lh
            JOIN delivery_locations dl ON ST_DWithin(lh.location, dl.location, 200)
            WHERE lh.tracker_id = v_tracker_id
            AND lh.timestamp >= v_time_bucket
            AND lh.timestamp < v_time_bucket + INTERVAL '1 hour'
            AND (dl.production_run_id = v_production_run_id OR dl.production_run_id IS NULL)
            ORDER BY time_in_geofence DESC
            LIMIT 1;

            -- If we found a delivery location and the time in geofence is >= threshold
            IF v_delivery_location_id IS NOT NULL AND v_time_in_geofence >= v_threshold THEN
                -- Mark this location as processed
                INSERT INTO location_processing_status (location_id, processed)
                VALUES (v_location_id, TRUE)
                ON CONFLICT (location_id) DO UPDATE SET processed = TRUE;

                -- If the current status is not 'delivered' or the delivery location has changed
                IF v_current_status IS DISTINCT FROM 'delivered' OR v_current_state_start IS NULL THEN
                    -- Update the tracker status
                    UPDATE trackers
                    SET current_status = 'delivered',
                        current_state_start = v_timestamp
                    WHERE id = v_tracker_id;

                    -- Add a status history entry
                    INSERT INTO status_history (
                        tracker_id,
                        timestamp,
                        status,
                        delivery_location_id
                    ) VALUES (
                        v_tracker_id,
                        v_timestamp,
                        'delivered',
                        v_delivery_location_id
                    ) RETURNING id INTO v_status_id;

                    -- Notify users about the status change
                    FOR v_user_id IN
                        SELECT u.id
                        FROM users u
                        JOIN user_clients uc ON u.id = uc.user_id
                        JOIN trackers t ON uc.client_id = t.client_id
                        WHERE t.id = v_tracker_id
                    LOOP
                        INSERT INTO notifications (
                            user_id,
                            tracker_id,
                            status_history_id,
                            message,
                            status,
                            read,
                            created_at
                        ) VALUES (
                            v_user_id,
                            v_tracker_id,
                            v_status_id,
                            'Tracker ' || (SELECT name FROM trackers WHERE id = v_tracker_id) || ' has been delivered to ' || (SELECT name FROM delivery_locations WHERE id = v_delivery_location_id),
                            'delivered',
                            FALSE,
                            NOW()
                        );
                    END LOOP;
                END IF;
            END IF;

            -- Reset variables for the next iteration
            v_storage_location_id := NULL;
            v_delivery_location_id := NULL;
            v_time_in_geofence := NULL;
        END LOOP;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Testing

We've created several test scripts to verify the geofence processing system:

sql/fix_match_geofences.sql: Contains the updated match_geofences() function.
sql/simple_test.sql: A simple test script that creates a few location reports.
sql/comprehensive_test.sql: A more comprehensive test script that creates test storage and delivery locations, and location reports for a tracker.
sql/trigger_status_change.sql: A test script that creates location reports that should trigger a status change.

To run the tests, use the following commands:

# Update the match_geofences() function
psql -U postgres -d tracker -f sql/fix_match_geofences.sql

# Run the simple test
psql -U postgres -d tracker -f sql/simple_test.sql

# Run the comprehensive test
psql -U postgres -d tracker -f sql/comprehensive_test.sql

# Run the trigger status change test
psql -U postgres -d tracker -f sql/trigger_status_change.sql

Materialized View Structure

Here's the structure of the location_history materialized view:

SELECT * FROM location_history LIMIT 1;

   id    |         timestamp          |                      location                      | nearest_city | tracker_id
---------+----------------------------+----------------------------------------------------+-------------+-----------
       1 | 2025-02-26 14:00:00        | 0101000020E61000002A1F82AAD1CBF7BFF03A76AB42394A40 |             |          1

The location_history materialized view has the following columns:

id: A unique identifier for the location history record.
timestamp: The timestamp of the location report.
location: The PostGIS geometry representing the location.
nearest_city: The nearest city to the location (may be NULL).
tracker_id: The ID of the tracker.

Note that the confidence column from the location_reports table is not included in the materialized view.

Continuous Aggregate Definition

The location_history_hourly continuous aggregate is defined as follows:

SELECT time_bucket('01:00:00'::interval, lr."timestamp") AS bucket,
       tr.id AS tracker_id,
       st_centroid(st_collect(lr.location)) AS location,
       (array_agg(lr."timestamp" ORDER BY lr."timestamp" DESC))[1] AS latest_timestamp,
       (array_agg(lr.nearest_city ORDER BY lr."timestamp" DESC))[1] AS nearest_city
FROM location_reports lr
JOIN trackers tr ON lr.hashed_adv_key = tr.hashed_advertisement_key
WHERE lr.confidence <= 100
GROUP BY time_bucket('01:00:00'::interval, lr."timestamp"), tr.id;

This continuous aggregate:

Groups location reports by hour and tracker ID.
Calculates the centroid of all locations in each hour.
Gets the latest timestamp and nearest city in each hour.
Filters out location reports with confidence > 100.

The location_history_daily continuous aggregate is defined as follows:

SELECT time_bucket('1 day'::interval, bucket) AS day,
       tracker_id,
       st_centroid(st_collect(location)) AS location,
       max(latest_timestamp) AS latest_timestamp,
       (array_agg(nearest_city ORDER BY bucket DESC))[1] AS nearest_city
FROM location_history_hourly
GROUP BY time_bucket('1 day'::interval, bucket), tracker_id;

This continuous aggregate:

Groups hourly data by day and tracker ID.
Calculates the centroid of all locations in each day.
Gets the latest timestamp and nearest city in each day.

Conclusion

By removing the references to the confidence column in the match_geofences() function, we've fixed the issue with the geofence processing system. The continuous aggregate already filters out location reports with confidence > 100, so we don't need to check this condition again in the match_geofences() function.

Issue with Confidence Column​

Solution​

Testing​

Materialized View Structure​

Continuous Aggregate Definition​

Conclusion​