Data Flow
The Tracker GraphQL system follows a specific data flow from collection to display. Understanding this flow is essential for troubleshooting and maintaining the system.
Overview
Process Details
- Data Collection:
  - tracker_report_fetcher.py connects to Apple's FindMy API and pulls location reports
  - It stores these reports in the location_reports table with timestamp, location, and hashed_adv_key
  - Initially, the nearest_city field is NULL
- Data Aggregation:
  - TimescaleDB continuous aggregates create hourly and daily views from raw location data
  - These aggregates feed into the location_history_view
  - The location_history materialized view is created from this view
  - Two mechanisms ensure the data stays up-to-date:
    - Event-driven: When new data is added to location_reports, a PostgreSQL trigger sends a notification
    - Time-based: The continuous aggregates are refreshed hourly using Redis-based distributed locking
  - The location_history_refresher service handles both mechanisms
- Frontend Display:
  - The frontend queries the GraphQL API to get location history data
  - It displays this data on a map and in tables
  - For locations without a nearest city, it can request geocoding via Socket.IO
- Geocoding Process:
  - The frontend requests geocoding for locations without a nearest city
  - The request is processed through Socket.IO and added to a Redis queue
  - A background process picks up tasks from the queue and uses the geocoding service
  - The results are sent back to the client via Socket.IO and stored in the database
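
The geocoding step can be illustrated with a minimal sketch of the background worker loop. This is not the project's actual code: an in-memory deque stands in for the Redis queue, a dict stands in for the database, and the task shape, the function name process_geocoding_queue, and the event name geocoding_result are all hypothetical. The geocoder and the Socket.IO emit call are passed in as plain callables so the flow can be followed without any running services.

```python
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple

# Hypothetical task shape: (report_id, latitude, longitude).
GeocodeTask = Tuple[int, float, float]


def process_geocoding_queue(
    queue: Deque[GeocodeTask],
    geocode: Callable[[float, float], Optional[str]],
    store: Dict[int, str],
    emit: Callable[[str, dict], None],
) -> int:
    """Drain the queue, geocode each task, then store and emit each result.

    Returns the number of tasks that produced a city name.
    """
    processed = 0
    while queue:
        # Oldest task first, as with a list-backed FIFO queue.
        report_id, lat, lon = queue.popleft()
        city = geocode(lat, lon)
        if city is None:
            # No match: nearest_city stays unset and nothing is sent.
            continue
        # Stand-in for writing nearest_city back to the database.
        store[report_id] = city
        # Stand-in for the Socket.IO emit; the event name is an assumption.
        emit("geocoding_result", {"id": report_id, "nearest_city": city})
        processed += 1
    return processed
```

In a real deployment the deque would be replaced by reads from the Redis queue and emit by the Socket.IO server's emit method; passing both in as parameters keeps the worker logic testable on its own.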
Troubleshooting
If data is not appearing in the frontend:
- Verify data exists in the location_reports table
- Check if the location_history_refresher service is running:

      # For systemd
      sudo systemctl status location-history-refresher
      # For Docker
      docker-compose -f fetcher/compose.yml ps

- Check if Redis is running (required for continuous aggregate refresh):

      redis-cli ping

- Manually refresh the continuous aggregates and materialized view:

      -- Refresh continuous aggregates
      CALL refresh_continuous_aggregate('location_history_hourly', NOW() - INTERVAL '48 hours', NOW());
      CALL refresh_continuous_aggregate('location_history_daily', NOW() - INTERVAL '7 days', NOW());
      -- Refresh materialized view
      REFRESH MATERIALIZED VIEW CONCURRENTLY location_history;

- Verify the continuous aggregates have data:

      SELECT COUNT(*) FROM location_history_hourly;
      SELECT COUNT(*) FROM location_history_daily;

- Check for authentication issues in the frontend
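
If the manual refresh has to be run often, it could be scripted. The following is a minimal sketch, assuming a DB-API style cursor (for example from psycopg2) on a connection in autocommit mode, since the refresh procedure may refuse to run inside an open transaction. The helper names build_refresh_statements and run_refresh are hypothetical and not part of the project; the aggregate names, windows, and view name are copied from the manual refresh statements in the troubleshooting steps.

```python
from typing import Dict, List

# Aggregate names and look-back windows are copied from the manual
# refresh statements in the troubleshooting steps.
AGGREGATE_WINDOWS: Dict[str, str] = {
    "location_history_hourly": "48 hours",
    "location_history_daily": "7 days",
}
MATERIALIZED_VIEW = "location_history"


def build_refresh_statements() -> List[str]:
    """Return the refresh statements in the order they should run."""
    statements = [
        f"CALL refresh_continuous_aggregate('{name}', "
        f"NOW() - INTERVAL '{window}', NOW());"
        for name, window in AGGREGATE_WINDOWS.items()
    ]
    # The materialized view goes last so it is rebuilt from fresh aggregates.
    statements.append(
        f"REFRESH MATERIALIZED VIEW CONCURRENTLY {MATERIALIZED_VIEW};"
    )
    return statements


def run_refresh(cursor) -> int:
    """Execute every statement on a DB-API cursor; return how many ran."""
    statements = build_refresh_statements()
    for statement in statements:
        cursor.execute(statement)
    return len(statements)
```

With psycopg2 this might be used by setting conn.autocommit = True and then calling run_refresh(conn.cursor()). Note that REFRESH MATERIALIZED VIEW CONCURRENTLY requires a unique index on the materialized view, so the last statement fails if location_history has none.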