16 MAR 2026
David asks: Does Canvas dual-endpoint sync create duplicates?
David noticed that Canvas syncs from both the files API and the modules API and questioned whether this would duplicate file records, prompting an investigation into the deduplication strategy.
David asked an agent to study how Canvas syncing and change detection work, specifically raising whether the system introduces duplication:
do we sync from both the files endpoint and modules endpoint for canvas specifically? but isnt that an issue cause isnt modules a subset of the files endpoint so youd be duplicating?
The investigation confirmed that both endpoints are fetched in parallel, but a four-phase linking process prevents duplication: files are initially stored with is_in_module: false, then cross-linked to module topics via lms_content_id, and the is_in_module flag is flipped to true once a file is linked. Files that belong to no module are surfaced in a synthetic "Course Files" module. David's question correctly anticipated the potential failure mode; the code handled it deliberately.
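A minimal Python sketch of the linking idea described above, not the actual implementation. Only is_in_module, lms_content_id, and the "Course Files" name come from the investigation; FileRecord, ModuleItem, link_files, the topic_ids field, and the payload shapes are hypothetical names chosen for illustration, and the split into numbered steps is one plausible reading of the four phases.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FileRecord:
    lms_content_id: str
    name: str
    is_in_module: bool = False  # every file starts out unlinked
    topic_ids: List[str] = field(default_factory=list)  # hypothetical field


@dataclass
class ModuleItem:
    topic_id: str          # hypothetical: the module topic that references a file
    lms_content_id: str    # same id the files endpoint reports


def link_files(
    files_payload: List[dict], module_items: List[ModuleItem]
) -> Tuple[Dict[str, FileRecord], List[FileRecord]]:
    # Step 1: store one record per file, keyed by lms_content_id,
    # so a file seen by both endpoints never becomes two rows.
    records: Dict[str, FileRecord] = {}
    for f in files_payload:
        records.setdefault(
            f["id"], FileRecord(lms_content_id=f["id"], name=f["name"])
        )

    # Steps 2 and 3: cross-link module items to the stored records
    # and flip is_in_module once a link exists.
    for item in module_items:
        rec = records.get(item.lms_content_id)
        if rec is None:
            continue  # module item is not a known file; skipped in this sketch
        rec.topic_ids.append(item.topic_id)
        rec.is_in_module = True

    # Step 4: anything still unlinked goes to the synthetic "Course Files" module.
    course_files = [r for r in records.values() if not r.is_in_module]
    return records, course_files
```

Because the modules data only adds links to records that already exist, the files endpoint stays the single source of file rows, which is why fetching both endpoints does not duplicate anything in this sketch.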