refactor: reorganize repository to src/ layout
Move 6 packages (core, config, data_processing, analysis, visualization, cli) into src/ to reduce root clutter. Merge tools/data.py into data_processing/transforms.py. Move docs to docs/. Path resolution via .pth file (setup_dev.py), pytest pythonpath config, and sys.path bootstrap in rxconfig.py and CLI entry points. Clean up pyproject.toml deps (remove stale pins, add snowflake-connector-python). Fix tomllib import for Python 3.10 compatibility. All 113 tests pass.
@@ -0,0 +1,194 @@
# Design System - HCD Analysis v2.1 (SaaS Redesign)

This document defines the visual design language for the UI redesign. The goal is a **modern SaaS aesthetic** - think Stripe, Linear, Vercel - while staying thematically aligned with the blue color palette.

**Design Philosophy**:
- The chart is the hero; everything else supports it
- Minimal chrome, maximum data visibility
- Clean, confident, spacious - not clinical or governmental
- Every pixel of vertical space matters

## Color Palette

### Primary Blues (kept from original, used sparingly)
| Name | Hex | Usage |
|------|-----|-------|
| Heritage Blue | `#003087` | Top bar background, strong accents |
| Primary Blue | `#0066CC` | Interactive elements, links, focus |
| Vibrant Blue | `#1E88E5` | Hover states, active elements |
| Sky Blue | `#4FC3F7` | Subtle accents, progress indicators |
| Pale Blue | `#E3F2FD` | Selected states, subtle backgrounds |

### Neutrals (refined for modern feel)
| Name | Hex | Usage |
|------|-----|-------|
| Slate 900 | `#0F172A` | Primary text (slightly darker) |
| Slate 700 | `#334155` | Secondary text |
| Slate 500 | `#64748B` | Muted text, placeholders |
| Slate 300 | `#CBD5E1` | Borders, dividers |
| Slate 100 | `#F8FAFC` | Backgrounds (slightly lighter) |
| White | `#FFFFFF` | Card/modal backgrounds |

### Semantic Colors
| Name | Hex | Usage |
|------|-----|-------|
| Success | `#10B981` | Positive (modern green) |
| Warning | `#F59E0B` | Caution |
| Error | `#EF4444` | Errors |
| Info | `#3B82F6` | Informational |

## Typography

**Font Family:** Inter (primary), system-ui (fallback)

| Style | Size | Weight | Usage |
|-------|------|--------|-------|
| Display | 28px | 600 | Page titles (reduced from 32px) |
| Heading 1 | 18px | 600 | Section headers (reduced from 24px) |
| Heading 2 | 16px | 600 | Card titles (reduced from 20px) |
| Heading 3 | 14px | 600 | Subsections |
| Body | 14px | 400 | Default text |
| Body Small | 13px | 400 | Secondary info |
| Caption | 11px | 500 | Labels, metadata (reduced from 12px) |
| Mono | 13px | 500 | Data values (JetBrains Mono) |

## Spacing Scale (Tighter)

| Token | Value | Usage |
|-------|-------|-------|
| xs | 4px | Tight gaps |
| sm | 6px | Between related elements (was 8px) |
| md | 8px | Standard gaps (was 12px) |
| lg | 12px | Section padding (was 16px) |
| xl | 16px | Card padding (was 24px) |
| 2xl | 24px | Major gaps (was 32px) |
| 3xl | 32px | Page margins (was 48px) |

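
The "was" values in the spacing table make the tightening easy to quantify. The sketch below is illustrative only: the dictionary names `SPACING_V20` and `SPACING_V21` are invented here (the source does not say how tokens are stored), and `xs` is assumed unchanged because the table lists no previous value for it.

```python
# Spacing tokens in px, copied from the table above.
# Dictionary names are illustrative, not taken from the codebase.
# Assumption: xs was also 4px in v2.0 (the table gives no "was" value).
SPACING_V20 = {"xs": 4, "sm": 8, "md": 12, "lg": 16, "xl": 24, "2xl": 32, "3xl": 48}
SPACING_V21 = {"xs": 4, "sm": 6, "md": 8, "lg": 12, "xl": 16, "2xl": 24, "3xl": 32}


def reduction_pct(token: str) -> float:
    """Percentage by which a token shrank between v2.0 and v2.1."""
    old, new = SPACING_V20[token], SPACING_V21[token]
    return round(100 * (old - new) / old, 1)


reductions = {token: reduction_pct(token) for token in SPACING_V21}

for token, pct in reductions.items():
    print(f"{token:>3}: {SPACING_V20[token]}px -> {SPACING_V21[token]}px ({pct}% smaller)")
```

Under these values `sm`, `lg`, and `2xl` shrink by a quarter, while `md`, `xl`, and `3xl` shrink by a third.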
## Layout Specifications

### Page Structure (Target)
```
┌────────────────────────────────────────────────────────────────┐
│ Logo │ Tabs                                       │ Freshness  │ 48px
├────────────────────────────────────────────────────────────────┤
│ [Initiated▾] [LastSeen▾] │ [Drugs▾] [Ind▾] [Dir▾] │ KPI badges │ 48px
├────────────────────────────────────────────────────────────────┤
│                                                                │
│                    I C I C L E   C H A R T                     │ flex
│                     (full viewport width)                      │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```

### Top Bar
- **Height**: 48px (reduced from 64px)
- **Background**: Heritage Blue
- **Logo**: 28px height (reduced from 36px)
- **Tabs**: Small pills, 28px height

### Filter Strip
- **Height**: 48px (single row)
- **Layout**: Horizontal flex, all filters inline
- **Dropdown triggers**: 32px height, 8px padding
- **No section header** - labels are in dropdown triggers
- **Background**: Slate 100 or transparent

### KPI Section (Options)

**Option A: Inline badges** (preferred - zero extra height)
```
Filters row: [Initiated▾] [LastSeen▾] | [Drugs▾] ... | 12,345 patients • £45.2M • 89 drugs
```

**Option B: Compact strip** (48px max)
```
┌─────┬─────┬─────┬─────┐
│12.3K│£45M │ 89  │  7  │ 28px value
│pts  │cost │drugs│trust│ 14px label
└─────┴─────┴─────┴─────┘
```

### Chart Container
- **Width**: Full viewport minus 32px (16px padding each side)
- **Height**: Fill remaining space (min 500px)
- **No max-width constraint**
- **Margins**: Minimal (t:40, l:8, r:8, b:24)

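
To get a feel for how much drawing area these numbers leave, the arithmetic can be sketched as below. This is a rough illustration under stated assumptions: the chart margins are taken to apply inside the container, KPI Option A is assumed (no extra row), and `plot_area` plus the constant names are hypothetical, not part of the codebase.

```python
# Values from the layout spec; function and constant names are hypothetical.
CONTAINER_SIDE_PADDING_PX = 16
FIXED_ROWS_PX = 48 + 48  # top bar + filter strip
MIN_CONTAINER_HEIGHT_PX = 500
MARGIN = {"t": 40, "l": 8, "r": 8, "b": 24}


def plot_area(viewport_w: int, viewport_h: int) -> tuple[int, int]:
    """Return (width, height) in px left for the chart after padding and margins."""
    container_w = viewport_w - 2 * CONTAINER_SIDE_PADDING_PX
    # The container fills the remaining height but never drops below the minimum.
    container_h = max(viewport_h - FIXED_ROWS_PX, MIN_CONTAINER_HEIGHT_PX)
    width = container_w - MARGIN["l"] - MARGIN["r"]
    height = container_h - MARGIN["t"] - MARGIN["b"]
    return (width, height)


print(plot_area(1920, 1080))  # (1872, 920)
print(plot_area(1280, 540))   # (1232, 436), min-height kicks in
```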
## Component Specifications

### Compact Dropdown Trigger
- Height: 32px
- Padding: 8px 12px
- Border: 1px Slate 300
- Border radius: 6px
- Font: 13px
- Chevron: 14px icon

### Compact KPI Badge
- Padding: 4px 12px
- Border radius: 16px (pill)
- Background: Slate 100
- Value: 14px mono, weight 600
- Label: 11px, Slate 500

### Searchable Dropdown Panel
- Max height: 200px (items area)
- Item padding: 6px 8px
- Search input height: 28px
- Width: 240px min

## Shadows

| Token | Value | Usage |
|-------|-------|-------|
| sm | `0 1px 2px rgba(0,0,0,0.04)` | Subtle (lighter) |
| md | `0 1px 3px rgba(0,0,0,0.06)` | Cards at rest |
| lg | `0 4px 8px rgba(0,0,0,0.08)` | Dropdowns, hover |

## Border Radius

| Token | Value | Usage |
|-------|-------|-------|
| sm | 4px | Small elements |
| md | 6px | Inputs, buttons |
| lg | 8px | Cards |
| full | 9999px | Pills, badges |

## Transitions

All transitions: 150ms ease-out (faster than before)

## Implementation Notes

### Key Changes from v2.0
1. **Vertical space reduction**: ~210px saved (364px → ~156px overhead)
2. **Full-width chart**: Remove PAGE_MAX_WIDTH for chart
3. **Inline KPIs**: Either badges in filter row or minimal strip
4. **Smaller fonts**: Headlines and captions reduced
5. **Tighter spacing**: All spacing tokens except `xs` reduced by 25-33%

### CSS Patterns
```css
/* Full-height chart container */
.chart-container {
  height: calc(100vh - 96px); /* viewport minus top bar + filter strip */
  min-height: 500px;
  width: calc(100vw - 32px);
  margin: 0 16px;
}

/* Filter strip */
.filter-strip {
  display: flex;
  align-items: center;
  height: 48px;
  gap: 12px;
  padding: 0 16px;
}
```

### Reflex Implementation
- Use `height="calc(100vh - 96px)"` for chart container
- Use `width="100%"` with `padding_x="16px"` for full-width
- Use `flex="1"` to let chart grow
- Keep `min_height="500px"` as fallback

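
The `96px` in the height expression is the sum of the two fixed rows above the chart (48px top bar plus 48px filter strip). A minimal sketch of deriving the string from those values, so the offset does not drift if a row height changes; `TOP_BAR_PX`, `FILTER_STRIP_PX`, and `chart_height_css` are hypothetical names, not part of the codebase or of Reflex:

```python
# Fixed row heights from the layout spec (px). Names are hypothetical.
TOP_BAR_PX = 48
FILTER_STRIP_PX = 48


def chart_height_css(*fixed_rows_px: int) -> str:
    """Build a CSS calc() string for the space left below the fixed rows."""
    overhead = sum(fixed_rows_px)
    return f"calc(100vh - {overhead}px)"


chart_height = chart_height_css(TOP_BAR_PX, FILTER_STRIP_PX)
print(chart_height)  # calc(100vh - 96px)
```

The resulting string could be passed wherever the notes above use `height="calc(100vh - 96px)"`.
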
@@ -0,0 +1,192 @@
# Snowflake Reference

Essential database context for querying NHS data. Read this every iteration when working with Snowflake.

---

## Snowflake MCP Server

Use `mcp__snowflake-mcp__*` functions to explore schema and test queries.

### Schema Discovery (USE THESE FIRST)
- `test_connection()` - Verify connectivity
- `list_databases()` - List accessible databases
- `list_schemas(database_name)` - List schemas in a database
- `list_tables(database, schema)` - List tables with descriptions
- `list_views(schema_name, database)` - List views with descriptions
- `describe_table(table_name, database)` - Get detailed table schema
- `describe_query(query, database)` - Preview query output columns without execution

### Query Execution
- `read_data(query, database, max_rows)` - Execute SELECT queries with row limits
- `read_data_paginated(query, database, page_size, page)` - Paginated results with total count
- `read_data_pandas(query, database, max_rows, output_format)` - Results in pandas-friendly formats

### Async Query Support (long-running queries)
- `execute_async(query, database)` - Submit asynchronously, returns query_id
- `get_query_status(query_id, database)` - Check status
- `get_async_results(query_id, database, max_rows)` - Retrieve results

### Usage Guidelines
- **ALWAYS** verify table structures and column names via MCP before writing queries
- Test with small result sets (`LIMIT 20`) before full execution
- Use `describe_query` to preview complex query outputs before running
- Use async queries for operations expected to take >30 seconds

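
For the async path, the submit / check / retrieve sequence amounts to a polling loop. The sketch below shows only that loop, under loud assumptions: the MCP tools are not Python functions, so `get_status` is a caller-supplied stand-in for `get_query_status(query_id, database)`, and the status strings (`"RUNNING"`, `"SUCCESS"`, `"FAILED"`) are invented for the demo; check what the server actually returns before relying on them.

```python
import time
from typing import Callable, Iterable


def wait_for_query(
    get_status: Callable[[], str],
    done_states: Iterable[str],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a state in done_states or time runs out."""
    done = set(done_states)
    waited = 0.0
    while True:
        status = get_status()
        if status in done:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"query still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for get_query_status(query_id, database); the status strings are made up.
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCESS"])
final = wait_for_query(
    lambda: next(fake_statuses),
    done_states={"SUCCESS", "FAILED"},
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(final)  # SUCCESS
```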
---

## Database Overview

| Database | Purpose |
|----------|---------|
| `DATA_HUB` | **Analyst-curated** data warehouse - primary source for most queries |
| `PRIMARY_CARE` | Raw extracts from EMIS and TPP clinical systems |
| `NATIONAL` | NHS England national datasets (SUS, ECDS, MHSDS, etc.) |
| `FACTS_AND_DIMENSIONS_ALL_DATA` | External reference data (BNF, SNOMED, QOF clusters) |
| `REPORTING_DATASETS_ICB` | Reporting outputs and analyst workspaces (includes SCRATCHPAD) |

**Avoid**: `SYSTEM` database.

---

## Key Tables and Views

### DATA_HUB.DWH (Dimensions)

| View | Purpose | Key Columns |
|------|---------|-------------|
| `DimMedicineAndDevice` | Master medication/device reference | `ProductSnomedCode`, `TherapeuticMoietySnomedCode` (VTM), `BNFParagraphCode`, `StrengthDescription`, `ProductDescription` |
| `DimPerson` | Patient demographics | `PatientPseudonym`, `PersonKey`, `CurrentGeneralPractice`, `IsCurrentNWRegistered`, `YearMonthBirth` |
| `DimSnomedCode` | SNOMED code descriptions | `SnomedCode`, `SnomedDescription` |
| `DimOrganisationAndSite` | GP practices and NHS orgs | `SiteCode`, `OrganisationName`, `OrganisationSubType`, `IsSiteNorfolkAndWaveney`, `IsSiteActive` |
| `DimDate` | Date dimension | |
| `DimCondition` | Clinical conditions | Long-term condition flags |
| `DimDeprivation` | Deprivation rankings by area | |

**CRITICAL**:
- `ProductDescription` is the correct column for product names. `ProductName` does NOT exist.
- `IsLatest` does NOT exist in `DimMedicineAndDevice`.

### DATA_HUB.CDM (Common Data Model)

| View | Purpose | Key Columns |
|------|---------|-------------|
| `Acute__Conmon__PatientLevelDrugs` | HCD activity data | `PseudoNHSNoLinked`, `InterventionDate`, `DrugName`, `Price Actual` |

**Note**: HCD `PseudoNHSNoLinked` = GP `PatientPseudonym` for patient linkage.

### DATA_HUB.PHM (Population Health Management)

| View | Purpose | Key Columns |
|------|---------|-------------|
| `PrimaryCareClinicalCoding` | **Unified** clinical coding (EMIS + TPP, no duplicates) | `PatientPseudonym`, `SNOMEDCode`, `EventDateTime`, `NumericValue` |
| `PrimaryCareMedication` | **Unified** medication data (EMIS + TPP, no duplicates) | `PatientPseudonym`, `SNOMEDCode`, `DateMedicationStart`, `Quantity` |
| `ClinicalCodingClusterSnomedCodes` | SNOMED codes grouped by cluster | `ClusterId`, `SnomedCode` |
| `PersonCohort` | Pre-defined patient cohorts | |

**Prefer DATA_HUB.PHM unified views** over raw PRIMARY_CARE tables.

---

## Patient Identifiers

| Identifier | Source | Usage |
|------------|--------|-------|
| `PatientPseudonym` | DATA_HUB, NATIONAL | Primary - use for most joins |
| `PseudoNHSNoLinked` | DATA_HUB.CDM (HCD data) | Links to PatientPseudonym |
| `PersonKey` | DATA_HUB.DWH.DimPerson | Integer key for person dimension |

### Standard Join Patterns
```sql
-- HCD Activity to GP Diagnosis
FROM DATA_HUB.CDM."Acute__Conmon__PatientLevelDrugs" hcd
LEFT JOIN DATA_HUB.PHM."PrimaryCareClinicalCoding" pcc
    ON hcd."PseudoNHSNoLinked" = pcc."PatientPseudonym"

-- Activity to Person Demographics
FROM DATA_HUB.CDM."Acute__Conmon__PatientLevelDrugs" hcd
INNER JOIN DATA_HUB.DWH."DimPerson" dp
    ON hcd."PseudoNHSNoLinked" = dp."PatientPseudonym"
```

---

## CRITICAL: Registered Population Filter

**ALWAYS** apply when counting patients:

```sql
WHERE dp."IsCurrentNWRegistered" = 'Yes'
  AND dp."CurrentGeneralPractice" <> '*'
```

Without this filter, counts will be ~2x inflated (includes deceased, deregistered, out-of-area patients).

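
A tiny in-memory illustration of what the two conditions exclude. The rows, pseudonyms, and practice codes below are invented for the example and are not real data; `is_registered` is a hypothetical Python mirror of the `WHERE` clause, not code from the repository.

```python
# Made-up rows standing in for DimPerson; only the two filter columns matter here.
people = [
    {"PatientPseudonym": "A", "IsCurrentNWRegistered": "Yes", "CurrentGeneralPractice": "P001"},
    {"PatientPseudonym": "B", "IsCurrentNWRegistered": "No", "CurrentGeneralPractice": "P002"},
    {"PatientPseudonym": "C", "IsCurrentNWRegistered": "Yes", "CurrentGeneralPractice": "*"},
    {"PatientPseudonym": "D", "IsCurrentNWRegistered": "Yes", "CurrentGeneralPractice": "P003"},
]


def is_registered(person: dict) -> bool:
    """Python mirror of the registered-population WHERE clause."""
    return (
        person["IsCurrentNWRegistered"] == "Yes"
        and person["CurrentGeneralPractice"] != "*"
    )


unfiltered = len(people)
registered = sum(1 for person in people if is_registered(person))
print(unfiltered, registered)  # 4 2
```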

---

## Query Development Patterns

### Clinical Condition Detection (GP SNOMED Clusters)
```sql
-- Get all SNOMED codes for a clinical cluster
WITH cluster_codes AS (
    SELECT "SnomedCode"
    FROM DATA_HUB.PHM."ClinicalCodingClusterSnomedCodes"
    WHERE "ClusterId" = 'RARTH_COD'  -- Rheumatoid arthritis
)
-- Check if patient has condition
SELECT DISTINCT pcc."PatientPseudonym"
FROM DATA_HUB.PHM."PrimaryCareClinicalCoding" pcc
WHERE pcc."SNOMEDCode" IN (SELECT "SnomedCode" FROM cluster_codes)
  AND pcc."PatientPseudonym" IS NOT NULL
```

### Available SNOMED Clusters for HCD Indications
- `RARTH_COD` (155 codes) - Rheumatoid arthritis
- `PSORIASIS_COD` (116 codes) - Psoriasis
- `CROHNS_COD` (93 codes) - Crohn's disease
- `ULCCOLITIS_COD` (62 codes) - Ulcerative colitis
- `MS_COD` (44 codes) - Multiple sclerosis
- `DM_COD` / `DMTYPE1_COD` / `DMTYPE2AUDIT_COD` - Diabetes

### Sample HCD Activity Query
```sql
SELECT
    hcd."PseudoNHSNoLinked" AS "PatientPseudonym",  -- quoted so the alias keeps its case
    hcd."DrugName",
    hcd."InterventionDate",
    hcd."Provider Code",
    hcd."OrganisationName"
FROM DATA_HUB.CDM."Acute__Conmon__PatientLevelDrugs" hcd
WHERE hcd."InterventionDate" >= '2024-01-01'
LIMIT 20
```

---

## Snowflake SQL Syntax

- Double-quote identifiers: `"PatientPseudonym"`
- Date literals: `'2025-04-01'::DATE`
- Date functions: `DATEADD('MONTH', -3, date)`, `DATEDIFF('YEAR', d1, d2)`, `LAST_DAY(date)`
- Boolean: `TRUE`/`FALSE`
- No `TOP N` - use `LIMIT N`
- `COALESCE()`, `NULLIF()`, `GREATEST()` work as expected

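
When query text is assembled in Python, the double-quoting rule can be wrapped in a small helper. This is a minimal sketch: `quote_ident` and `qualified` are hypothetical helpers, not functions from the repository or from the Snowflake connector. Doubling an embedded `"` follows Snowflake's rule for quoted identifiers.

```python
def quote_ident(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def qualified(database: str, schema: str, obj: str) -> str:
    """Database and schema left unquoted, as in the queries above; object name quoted."""
    return f"{database}.{schema}.{quote_ident(obj)}"


print(quote_ident("Price Actual"))                # "Price Actual"
print(qualified("DATA_HUB", "DWH", "DimPerson"))  # DATA_HUB.DWH."DimPerson"
```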
---

## Troubleshooting

### Column not found errors
1. Use `describe_table(table_name, database)` to get actual column names
2. Remember: Snowflake identifiers are case-sensitive when quoted
3. Common mistakes: `ProductName` (wrong) vs `ProductDescription` (correct)

### Empty results
1. Check patient identifier filtering (`IS NOT NULL`)
2. Check date ranges
3. Test with `LIMIT 20` first to see sample data

### Slow queries
1. Add `LIMIT` during development
2. Use `describe_query` to validate structure before execution
3. Consider async execution for large result sets