chore: cleanup archive, update ralph loop files, add analytics spec

- Remove old iteration logs and deprecated files from archive/can_delete/
- Update RALPH_PROMPT.md and guardrails.md for Phase 10+ work
- Update ralph.ps1 banner text
- Add AdditionalAnalytics.md chart specification
- Add run.bat convenience script
This commit is contained in:
Andrew Charlwood
2026-02-07 02:10:03 +00:00
parent 9171563651
commit a943bee8f2
22 changed files with 375 additions and 4462 deletions
+87
View File
@@ -169,6 +169,93 @@ If you discover a new failure pattern during your work, add it to this file.
- **Rule**: Run `python run_dash.py` (or `python -c "from dash_app.app import app"` for import checks). The app must start without errors after EVERY task.
- **Why**: Broken imports or circular dependencies compound across tasks. Catch them immediately.
---
## Icicle Chart Hierarchy Integrity
### Drug/trust/directorate filters must preserve ancestor nodes
- **When**: Building WHERE clauses in `pathway_queries.py:load_pathway_nodes()` that filter by drug, trust, or directorate
- **Rule**: Ancestor nodes (levels 0, 1, 2) must ALWAYS be included in query results regardless of drug/trust/directorate filters. Only apply those filters to level 3+ nodes. The icicle chart requires an unbroken parent→child chain from a single root node. If you filter out the root, trust, or directory-level nodes, Plotly will report "Multiple implied roots" and the chart will be blank.
- **Why**: The drug filter `drug_sequence LIKE %DRUG%` was applied to all levels, dropping level 0-2 nodes that have NULL drug_sequence. This broke the icicle hierarchy.
### Pattern-matching IDs must be globally unique
- **When**: Creating components with pattern-matching IDs like `{"type": "drug-fragment", "index": "..."}`
- **Rule**: The `index` value must be globally unique across the ENTIRE layout, not just within a section. If the same entity (e.g., a drug fragment name) can appear in multiple contexts (e.g., multiple indications within a directorate), the ID must include ALL distinguishing context. Use `f"{directorate}|{search_term}|{fragment}"` not just `f"{directorate}|{fragment}"`.
- **Why**: RIVAROXABAN appeared under multiple indications within CARDIOLOGY, causing `DuplicateIdError` on first page load.
---
## UI Structure
### Sidebar is for views, filter bar is for filters
- **When**: Deciding where to place UI controls
- **Rule**: The sidebar should contain chart VIEW selectors (Icicle, Sankey, Timeline) and navigation. Filter controls (drug selection, trust selection, date range, chart type toggle) belong in the top filter bar. Do NOT put filter triggers in the sidebar.
- **Why**: Having 3-4 sidebar buttons that all open the same drawer was confusing. Filters should be grouped together in the persistent filter bar.
### Use dmc.Modal for filter selection, not dmc.Drawer
- **When**: Building filter selection UI (drugs, trusts, directorates)
- **Rule**: Use `dmc.Modal` (centered popup) for filter selection, not `dmc.Drawer` (slide-in panel). Each filter type gets its own modal. Trigger buttons go in the filter bar with selection count badges (e.g., "Drugs (3)").
- **Why**: A single drawer with all filters required excessive scrolling and was confusing. Modals provide focused, per-filter selection with clear open/close behavior.
### Use frontend-developer agent for UX decisions
- **When**: Designing modal layouts, filter bar organization, or other UX-sensitive components
- **Rule**: Spawn the `frontend-developer` agent to review the data shapes and recommend optimal layout, sizing, and interaction patterns before building.
- **Why**: Good UX requires considering the data (42 drugs, 7 trusts, 19 directorates × 163 indications) and choosing appropriate patterns for each.
## Tab Architecture (Phase 9)
### Only render the active tab's chart
- **When**: Building the tab switching callback in `chart.py`
- **Rule**: Check `app-state.active_tab` and ONLY compute the figure for the active tab. Return `no_update` or a cached placeholder for inactive tabs. Do NOT render all 8 charts on every filter change.
- **Why**: Computing 8 Plotly figures on every filter change would be extremely slow. Lazy rendering is essential — only the visible chart needs computation.
### Chart figure functions go in src/visualization/, not dash_app/
- **When**: Creating new chart figures (market share, lollipop, waterfall, Sankey, etc.)
- **Rule**: Create figure builder functions in `src/visualization/` (e.g., `plotly_generator.py` or new files). Dash callbacks call these shared functions. Do NOT put Plotly figure construction logic directly in `dash_app/callbacks/`.
- **Why**: Shared figure functions can be tested independently and reused outside Dash. Same pattern as `create_icicle_from_nodes()`.
### New query functions use same pattern as existing ones
- **When**: Adding query functions to `src/data_processing/pathway_queries.py`
- **Rule**: Follow the same pattern as `load_pathway_nodes()`: accept `db_path` parameter, use `sqlite3.connect()` with `row_factory = sqlite3.Row`, parameterized queries, return JSON-serializable dicts/lists. Add thin wrappers in `dash_app/data/queries.py`.
- **Why**: Consistency with existing code. The thin wrapper pattern ensures DB path resolution is centralized.
### Parsing utilities must handle missing/null data gracefully
- **When**: Parsing `average_spacing` HTML strings or `ids` column values
- **Rule**: Always handle `None`, empty string `""`, and malformed data. Return sensible defaults (empty dict, empty list) rather than raising exceptions. The pathway_nodes data has NULLs at ancestor levels and some nodes may have empty statistics.
- **Why**: Not all nodes have `average_spacing` or `average_administered` populated. Level 0-2 nodes have no drug-level statistics.
### Tab bar replaces sidebar chart views
- **When**: Implementing tab infrastructure in Task 9.1
- **Rule**: Add the tab bar INSIDE `chart_card.py`. Remove the "Chart Views" section from `sidebar.py` (Icicle/Sankey/Timeline items). The sidebar should only have the "Overview" section after this change.
- **Why**: Tabs are the standard UI pattern for switching between chart views in the same content area. The sidebar was a temporary placeholder.
## Two-View Architecture (Phase 10)
### Patient Pathways and Trust Comparison are separate views with separate state
- **When**: Building the two-view navigation (sidebar switching between Patient Pathways and Trust Comparison)
- **Rule**: `active_view` in app-state controls which view is shown. Patient Pathways has `active_tab` (icicle/sankey) + drug/trust/directorate filters. Trust Comparison has `selected_comparison_directorate` (null = landing page, string = dashboard). Drug/trust/directorate filters do NOT affect Trust Comparison. Date filter + chart type toggle are GLOBAL and affect both views.
- **Why**: The whole point of the restructure is that these views have different analytical perspectives. Leaking Patient Pathways filters into Trust Comparison would break the directorate-focused comparison.
### Trust Comparison queries must break down BY trust, not aggregate across trusts
- **When**: Writing query functions for Trust Comparison charts (Task 10.6)
- **Rule**: Existing Phase 9 query functions (e.g., `get_drug_market_share`) aggregate across trusts for same (directory, drug) pair. Trust Comparison needs the OPPOSITE — show per-trust breakdown within a single directorate. Create NEW query functions (`get_trust_market_share`, `get_trust_cost_waterfall`, etc.) rather than modifying existing ones. Existing functions are still used by Patient Pathways.
- **Why**: Modifying existing queries would break Patient Pathways charts. The trust-comparison perspective is fundamentally different from the existing aggregate perspective.
### Cost Effectiveness in Trust Comparison is directorate-scoped but NOT trust-split
- **When**: Building the Cost Effectiveness chart in the Trust Comparison 6-chart dashboard
- **Rule**: Cost Effectiveness shows pathway costs for the selected directorate only. It does NOT split by trust — it's a single chart showing pathways within the directorate. Use existing `get_pathway_costs(directory=selected)` query, not a new trust-split variant.
- **Why**: Splitting pathway costs by trust muddies the water — pathway analysis is more meaningful at the directorate level.
### Use /frontend-design skill for UX design, not frontend-developer agent
- **When**: Designing layouts for header, sub-header, landing page, dashboard, filter placement (Task 10.1)
- **Rule**: Use the `/frontend-design` skill (invoked via the Skill tool) for design consultation. Do NOT spawn a `frontend-developer` Task agent.
- **Why**: User explicitly requested the frontend-design skill for design work.
### Trust Comparison landing page respects chart type toggle
- **When**: Building the directorate/indication selector buttons (Task 10.7)
- **Rule**: When chart type is "By Directory", show ~14 directorate buttons. When "By Indication", show ~32 indication buttons. The button list comes from reference-data store (available_directorates or available_indications). Changing the chart type toggle while on the landing page should refresh the button list. If a directorate was selected and user switches to indication mode, clear `selected_comparison_directorate` and return to landing page.
- **Why**: The chart type toggle is global and the Trust Comparison view must respect it.
<!--
ADD NEW GUARDRAILS BELOW as failures are observed during the loop.