Agent skill

mimic-table-relationships

Understand MIMIC-IV table relationships, join patterns, and identifier hierarchy. Use for correct data linkage, avoiding duplicates, and proper temporal joins.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/mimic-table-relationships

Metadata

Additional technical details for this skill

author
m4-clinical-extraction
version
1.0
category
data-quality
database
mimic-iv
validated
YES

SKILL.md

MIMIC-IV Table Relationships

Understanding the identifier hierarchy and table relationships is essential for correct query construction. Incorrect joins can cause data duplication or missing records.

When to Use This Skill

  • Writing complex queries joining multiple tables
  • Understanding why queries return unexpected row counts
  • Debugging duplicate or missing data issues
  • Learning MIMIC-IV data structure

Identifier Hierarchy

subject_id (patient)
    └── hadm_id (hospital admission)
            └── stay_id (ICU stay)
                    └── Events (chartevents, labevents, etc.)

subject_id

  • Unique per patient
  • Persists across all hospitalizations and ICU stays
  • Links to: patients table

hadm_id

  • Unique per hospital admission
  • One patient can have multiple hadm_ids (readmissions)
  • Links to: admissions, diagnoses_icd, prescriptions, most lab/hospital tables

stay_id

  • Unique per ICU stay
  • One hospital admission can have multiple stay_ids (ICU readmission)
  • Links to: icustays, chartevents, ICU-specific tables

Core Table Relationships

Hospital Module (mimiciv_hosp)

sql
patients             -- 1 row per subject_id
    |
    +-- admissions   -- 1 row per hadm_id
    |       |
    |       +-- diagnoses_icd
    |       +-- procedures_icd
    |       +-- prescriptions
    |       +-- labevents
    |       +-- microbiologyevents
    |
    +-- transfers    -- Multiple per hadm_id (ward movements)

ICU Module (mimiciv_icu)

sql
icustays            -- 1 row per stay_id
    |
    +-- chartevents  -- Vitals, assessments
    +-- inputevents  -- Medications, fluids
    +-- outputevents -- Urine, drains
    +-- procedureevents
    +-- datetimeevents

Common Join Patterns

Patient -> Hospital -> ICU

sql
SELECT p.subject_id, a.hadm_id, ie.stay_id
FROM mimiciv_hosp.patients p
INNER JOIN mimiciv_hosp.admissions a
    ON p.subject_id = a.subject_id
INNER JOIN mimiciv_icu.icustays ie
    ON a.hadm_id = ie.hadm_id;

Labs to ICU Stay (Time-Bounded)

sql
-- Labs drawn during ICU stay only
SELECT ie.stay_id, le.charttime, le.valuenum
FROM mimiciv_icu.icustays ie
INNER JOIN mimiciv_hosp.labevents le
    ON ie.hadm_id = le.hadm_id
    AND le.charttime >= ie.intime
    AND le.charttime <= ie.outtime;

Labs Within N Hours of ICU Admission

sql
-- First 24 hours
SELECT ie.stay_id, le.charttime, le.valuenum
FROM mimiciv_icu.icustays ie
INNER JOIN mimiciv_hosp.labevents le
    ON ie.hadm_id = le.hadm_id
    AND le.charttime >= ie.intime
    AND le.charttime <= DATETIME_ADD(ie.intime, INTERVAL 24 HOUR);

Critical Join Warnings

1. Hospital Labs Duplicate Across ICU Stays

If a patient has multiple ICU stays in one hospitalization, joining labs by hadm_id only will duplicate lab values:

sql
-- WRONG: Duplicates labs for patients with multiple ICU stays
SELECT ie.stay_id, le.*
FROM mimiciv_icu.icustays ie
INNER JOIN mimiciv_hosp.labevents le
    ON ie.hadm_id = le.hadm_id;  -- No time filter!

-- CORRECT: Add time bounds
SELECT ie.stay_id, le.*
FROM mimiciv_icu.icustays ie
INNER JOIN mimiciv_hosp.labevents le
    ON ie.hadm_id = le.hadm_id
    AND le.charttime BETWEEN ie.intime AND ie.outtime;

2. Derived Tables Already Filtered

Many mimiciv_derived tables are pre-joined to ICU stays:

sql
-- These already have stay_id and time-bounded data
SELECT * FROM mimiciv_derived.vitalsign;  -- Already per stay_id
SELECT * FROM mimiciv_derived.chemistry;  -- Has subject_id and hadm_id

3. Multiple Measurements Per Time Point

Aggregate or select appropriately:

sql
-- Get worst GCS per hour
SELECT stay_id,
       DATETIME_TRUNC(charttime, HOUR) AS hour,
       MIN(gcs) AS worst_gcs
FROM mimiciv_derived.gcs
GROUP BY stay_id, DATETIME_TRUNC(charttime, HOUR);

Cardinality Reference

Relationship Cardinality
subject_id : hadm_id 1 : many
hadm_id : stay_id 1 : many
stay_id : chartevents 1 : many
hadm_id : labevents 1 : many
hadm_id : diagnoses_icd 1 : many
stay_id : derived tables 1 : many (usually)

Example: Verify Join Correctness

sql
-- Check for unexpected duplicates
WITH joined AS (
    SELECT ie.stay_id, COUNT(*) AS n_labs
    FROM mimiciv_icu.icustays ie
    INNER JOIN mimiciv_hosp.labevents le
        ON ie.hadm_id = le.hadm_id
    GROUP BY ie.stay_id
)
SELECT
    COUNT(*) AS n_stays,
    AVG(n_labs) AS avg_labs_per_stay,
    MAX(n_labs) AS max_labs  -- Very high = possible duplication
FROM joined;

BigQuery vs PostgreSQL Syntax

MIMIC concepts are written for BigQuery. Key differences:

BigQuery PostgreSQL
DATETIME_ADD(x, INTERVAL '1' HOUR) x + INTERVAL '1 hour'
DATETIME_DIFF(a, b, HOUR) EXTRACT(EPOCH FROM (a - b))/3600
DATETIME_TRUNC(x, HOUR) DATE_TRUNC('hour', x)
physionet-data.mimiciv_* mimiciv_* (schema)

References

  • MIMIC-IV Documentation: https://mimic.mit.edu/docs/iv/
  • Johnson AEW et al. "MIMIC-IV, a freely accessible electronic health record dataset." Scientific Data. 2023.

Didn't find tool you were looking for?

Be as detailed as possible for better results