Data Flow

The Tracker GraphQL system follows a specific data flow from collection to display. Understanding this flow is essential for troubleshooting and maintaining the system.

Overview

Process Details

  1. Data Collection:

    • tracker_report_fetcher.py connects to Apple's FindMy API and pulls location reports
    • It stores these reports in the location_reports table with timestamp, location, and hashed_adv_key
    • Initially, the nearest_city field is NULL
  2. Data Aggregation:

    • TimescaleDB continuous aggregates create hourly and daily views from raw location data
    • These aggregates feed into the location_history_view
    • The location_history materialized view is created from this view
    • Two mechanisms ensure the data stays up-to-date:
      • Event-driven: When new data is added to location_reports, a PostgreSQL trigger sends a notification
      • Time-based: The continuous aggregates are refreshed hourly, with a Redis-based distributed lock guarding the refresh so that concurrent refreshers do not run it at the same time
    • The location_history_refresher service handles both mechanisms
  3. Frontend Display:

    • The frontend queries the GraphQL API to get location history data
    • It displays this data on a map and in tables
    • For locations without a nearest city, it can request geocoding via Socket.IO
  4. Geocoding Process:

    • When the frontend requests geocoding for locations without a nearest city, the request is processed through Socket.IO and added to a Redis queue
    • A background process picks up tasks from the queue and uses the geocoding service
    • The results are sent back to the client via Socket.IO and stored in the database
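
The time-based refresh in step 2 depends on a Redis lock. The following is a minimal sketch of how such a lock could guard a refresh, not the actual location_history_refresher code. The names refresh_with_lock, LOCK_KEY, and LOCK_TTL_SECONDS are hypothetical, and redis_client is assumed to be a redis-py style client exposing set(name, value, nx=..., ex=...), get, and delete.

```python
import uuid
from typing import Callable

LOCK_KEY = "location_history:refresh_lock"  # hypothetical key name
LOCK_TTL_SECONDS = 300  # hypothetical; should exceed the longest expected refresh


def refresh_with_lock(
    redis_client,
    refresh: Callable[[], None],
    lock_key: str = LOCK_KEY,
    ttl_seconds: int = LOCK_TTL_SECONDS,
) -> bool:
    """Run refresh() only if this process wins the lock.

    Returns True if the refresh ran, False if another holder had the lock.
    """
    token = uuid.uuid4().hex  # unique value identifying this lock holder

    # SET with NX and EX succeeds only when the key does not exist yet,
    # and the TTL frees the lock if this process dies mid-refresh.
    acquired = redis_client.set(lock_key, token, nx=True, ex=ttl_seconds)
    if not acquired:
        return False

    try:
        refresh()
        return True
    finally:
        # Release only a lock we still own. GET followed by DELETE is not
        # atomic; a production version would typically use a Lua script.
        current = redis_client.get(lock_key)
        if isinstance(current, bytes):
            current = current.decode()
        if current == token:
            redis_client.delete(lock_key)
```

In this sketch a refresher that loses the race simply skips the cycle and returns False, which is usually acceptable for an hourly job because the winner performs the same refresh.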

Troubleshooting

If data is not appearing in the frontend:

  1. Verify data exists in the location_reports table

  2. Check if the location_history_refresher service is running:

    # For systemd
    sudo systemctl status location-history-refresher

    # For Docker
    docker-compose -f fetcher/compose.yml ps
  3. Check if Redis is running (required for continuous aggregate refresh); a healthy server replies PONG:

    redis-cli ping
  4. Manually refresh the continuous aggregates and materialized view:

    -- Refresh continuous aggregates
    CALL refresh_continuous_aggregate('location_history_hourly', NOW() - INTERVAL '48 hours', NOW());
    CALL refresh_continuous_aggregate('location_history_daily', NOW() - INTERVAL '7 days', NOW());

    -- Refresh materialized view
    REFRESH MATERIALIZED VIEW CONCURRENTLY location_history;
  5. Verify the continuous aggregates have data:

    SELECT COUNT(*) FROM location_history_hourly;
    SELECT COUNT(*) FROM location_history_daily;
  6. Check for authentication issues in the frontend
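
Steps 1 and 5 can be combined into one check that walks the pipeline in order and reports the first stage with no rows. The sketch below assumes the table and view names used in this document, with the stage order inferred from the data flow description. The helper names count_rows and first_empty_stage are hypothetical, and conn is assumed to be any DB-API 2.0 connection, such as one from psycopg2.

```python
# Pipeline stages in the order data flows through them (assumed from the
# data flow description: raw reports, aggregates, then the materialized view).
PIPELINE_STAGES = (
    "location_reports",
    "location_history_hourly",
    "location_history_daily",
    "location_history",
)


def count_rows(conn):
    """Return {stage_name: row_count} for every stage, in pipeline order."""
    counts = {}
    cur = conn.cursor()
    try:
        for table in PIPELINE_STAGES:
            # Names come from the fixed tuple above, never from user input,
            # so interpolating them into the statement is safe here.
            cur.execute(f"SELECT COUNT(*) FROM {table}")
            counts[table] = cur.fetchone()[0]
    finally:
        cur.close()
    return counts


def first_empty_stage(counts):
    """Return the first stage with zero rows, or None if all have data."""
    for table, row_count in counts.items():
        if row_count == 0:
            return table
    return None
```

If first_empty_stage returns location_reports, the problem is likely in data collection. If it returns one of the aggregates or location_history, the refresh steps above are the place to look. A result of None suggests the data pipeline is populated and the issue is more likely on the API or frontend side.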