Agent skill
spark-principal-engineer
Principal/Senior-level Spark playbook for batch and streaming architecture, data layout, shuffle control, cost-efficient compute, reliability, and operating large-scale data processing platforms. Use when: designing Spark workloads, tuning jobs, optimizing cluster economics, or operating Spark platforms in production.
Install this agent skill to your Project
npx add-skill https://github.com/mOdrA40/claude-codex-skills-directory/tree/main/data-skills/spark-mastery-skill
SKILL.md
Spark Mastery (Senior → Principal)
Operate
- Start from data volume, compute economics, shuffle behavior, and correctness requirements.
- Treat Spark as a distributed execution system with real storage, network, and scheduling tradeoffs.
- Prefer explicit workload design over vague “big data” assumptions.
- Optimize for predictable cost, reliability, and debuggable pipelines.
Default Standards
- Data layout and partitioning must match workload reality.
- Shuffle-heavy patterns require scrutiny.
- Memory and executor tuning should follow evidence.
- Streaming and batch semantics must be separated clearly.
- Platform cost and job performance should be evaluated together.
References
- Job architecture and dataflow design: references/job-architecture-and-dataflow-design.md
- Partitioning, shuffle, and data layout: references/partitioning-shuffle-and-data-layout.md
- Memory, executors, and runtime tuning: references/memory-executors-and-runtime-tuning.md
- Structured streaming and stateful semantics: references/structured-streaming-and-stateful-semantics.md
- Lakehouse integration, storage, and table formats: references/lakehouse-integration-storage-and-table-formats.md
- Multi-tenant governance and cost control: references/multi-tenant-governance-and-cost-control.md
- Reliability and operations: references/reliability-and-operations.md
- Incident runbooks: references/incident-runbooks.md
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
nuxt-tanstack-mastery
Panduan senior/lead developer 20 tahun pengalaman untuk Vue.js 3 + Nuxt 3 + TanStack Query development. Gunakan skill ini ketika: (1) Membuat project Nuxt 3 baru dengan arsitektur production-ready, (2) Integrasi TanStack Query untuk data fetching, (3) Debugging Vue/Nuxt yang kompleks, (4) Review code untuk clean code compliance, (5) Optimisasi performa aplikasi Vue/Nuxt, (6) Setup folder structure yang scalable, (7) Mencari library terpercaya untuk Vue ecosystem, (8) Menghindari common pitfalls dan bugs, (9) Implementasi state management patterns, (10) Security hardening aplikasi Nuxt. Trigger keywords: vue, vuejs, nuxt, nuxtjs, tanstack, vue-query, composition api, pinia, vueuse, vue router, clean code vue, debugging vue, folder structure nuxt.
solidjs-solidstart-expert
Expert-level SolidJS and SolidStart development skill with 20+ years senior/lead engineer mindset. Comprehensive guidance for building production-ready, scalable web applications with fine-grained reactivity. Use when Claude needs to: (1) Create new SolidJS/SolidStart projects, (2) Implement TanStack Query/Router/Table/Form integration, (3) Build reactive components with signals/stores/resources, (4) Handle SSR/SSG/streaming with SolidStart, (5) Implement authentication and API routes, (6) Optimize bundle size and performance, (7) Debug reactivity issues and memory leaks, (8) Structure large-scale applications, (9) Implement type-safe patterns with TypeScript, (10) Handle error boundaries and suspense, (11) Build accessible UI components, (12) Deploy to Vercel/Netlify/Cloudflare. Triggers: "solid", "solidjs", "solidstart", "createSignal", "createStore", "createResource", "tanstack solid", "vinxi", "fine-grained reactivity".
react-tanstack-senior
Expertise senior/lead React developer 20 tahun dengan TanStack ecosystem (Query, Router, Table, Form, Start). Gunakan skill ini ketika: (1) Membuat aplikasi React dengan TanStack libraries, (2) Review/refactor kode React untuk clean code, (3) Debugging React/TanStack issues, (4) Setup project structure yang maintainable, (5) Optimasi performa React apps, (6) Memilih library yang tepat untuk use case tertentu, (7) Mencegah common bugs dan memory leaks, (8) Implementasi best practices KISS dan less is more. Trigger keywords: React, TanStack, React Query, TanStack Router, TanStack Table, TanStack Form, TanStack Start, Vinxi, clean code, refactor, performance, debugging.
clickhouse-principal-engineer
Principal/Senior-level ClickHouse playbook for analytical schema design, partitioning, ingestion, query performance, replication, storage strategy, and operating large-scale columnar systems. Use when: designing OLAP workloads, reviewing MergeTree layout, tuning analytical queries, building event analytics platforms, or operating ClickHouse in production.
mysql-principal-engineer
Principal/Senior-level MySQL playbook for schema design, indexing, transactions, replication, operational reliability, online migrations, and production workload tuning. Use when: designing relational systems, reviewing query/index strategy, operating MySQL fleets, debugging contention or replication lag, or hardening MySQL-backed applications.
mongodb-principal-engineer
Principal/Senior-level MongoDB playbook for document modeling, indexing, replication, sharding, query design, observability, and production reliability. Use when: designing document schemas, reviewing aggregation/query performance, operating replicas/shards, or hardening MongoDB-backed systems.
Didn't find tool you were looking for?