Agent skill

databricks-2025

This skill should be used when the user asks about "Databricks Job activity", "DatabricksJob in ADF", "ServiceNow V2 connector", "ADF managed identity", or needs guidance on Databricks serverless linked services, Snowflake connectors, or MFA enforcement.

Stars 22
Forks 3

Install this agent skill to your Project

npx add-skill https://github.com/JosiahSiegel/claude-plugin-marketplace/tree/main/plugins/adf-master/skills/databricks-2025

SKILL.md

Azure Data Factory Databricks Integration 2025

Databricks Job Activity (Recommended 2025)

CRITICAL UPDATE (2025): The Databricks Job activity is now the ONLY recommended method for orchestrating Databricks in ADF. Microsoft strongly recommends migrating from legacy Notebook, Python, and JAR activities.

Quick Reference

  • Activity type: DatabricksJob (NOT DatabricksSparkJob or DatabricksNotebook)
  • Parameter property: jobParameters (NOT parameters)
  • Linked service auth: Managed Identity ("authentication": "MSI") recommended
  • Cluster config: Do NOT specify cluster properties in linked service; the Databricks Job controls compute

Why Databricks Job Activity?

Feature Notebook Activity (Legacy) Job Activity (2025)
Compute Must configure cluster in linked service Serverless by default
Workflow tasks Single notebook Multi-task DAGs (notebook, Python, SQL, DLT)
Retry ADF-level only Job-level + task-level
Repair runs Not supported Rerun failed tasks only
Git integration Limited Full Databricks Git support + DABs
Lineage None Built-in data lineage
If/Else logic Must use ADF control flow Native If/Else task types

Benefits Summary

  1. Serverless Execution -- No cluster specification needed; automatic serverless compute with faster startup and lower costs
  2. Advanced Workflow Features -- Run As, Task Values, Conditional Execution, AI/BI Tasks, Repair Runs, Notifications, Queuing
  3. Centralized Job Management -- Jobs defined once in Databricks workspace; single source of truth with Git-backed versioning
  4. Cost Optimization -- Serverless compute (pay only for execution), job clusters (auto-terminating), spot instance support

For complete JSON examples of Job activity, linked service, and pipeline configurations, see references/databricks-job-examples.md.

Connectors and Enhancements (2025+)

ServiceNow V2 Connector (RECOMMENDED - V1 End of Support)

ServiceNow V1 connector is at End of Support. Migrate to V2 immediately.

Feature V1 V2
Linked service type ServiceNow ServiceNowV2
Source type ServiceNowSource ServiceNowV2Source
Query builder Custom Aligns with ServiceNow condition builder
Performance Standard Enhanced extraction
OData support No Yes

Migration steps: Update linked service type to ServiceNowV2, update source type to ServiceNowV2Source, test queries in ServiceNow UI condition builder, adjust timeouts.

Enhanced PostgreSQL Connector

Improved performance with 2025 SSL enhancements: enableSsl: true, sslMode: "Require".

Enhanced Snowflake Connector

Improved performance with KeyPair authentication support and Key Vault secret integration.

Managed Identity for Azure Storage

New managed identity support for Azure Table Storage and Azure Files connectors (system-assigned and user-assigned).

Mapping Data Flows - Spark 3.3

Spark 3.3 now powers Mapping Data Flows with 30% faster processing, Adaptive Query Execution (AQE), dynamic partition pruning, improved caching, and better column statistics.

Azure DevOps Server 2022 Support

Git integration now supports on-premises Azure DevOps Server 2022 via the hostName property.

For complete JSON examples of all connectors, see references/connector-examples.md.

Managed Identity 2025 Best Practices

User-Assigned vs System-Assigned

Scenario Recommendation
Single ADF, simple setup System-assigned
Multiple data factories User-assigned (shared identity)
Complex multi-environment User-assigned
Granular permission control User-assigned
Identity lifecycle independence User-assigned

Use ADF's centralized Credentials feature to consolidate Microsoft Entra ID-based credentials across multiple linked services.

MFA Enforcement (Enforced Since October 2025)

Azure MFA is mandatory for all interactive user logins. Impact on ADF:

  • Managed identities are UNAFFECTED -- no MFA required for service accounts
  • Service principals with certificate auth are the recommended alternative to secrets
  • All interactive user logins require MFA

Principle of Least Privilege

Resource Source Role Sink Role
Storage Blob Storage Blob Data Reader Storage Blob Data Contributor
SQL Database db_datareader db_datareader + db_datawriter
Key Vault Get secrets only Get secrets only

For complete managed identity JSON examples, see references/connector-examples.md.

Best Practices (2025)

  1. Use Databricks Job Activity (MANDATORY) -- Stop using Notebook, Python, JAR activities. Define workflows in Databricks workspace with serverless compute.

  2. Managed Identity Authentication (MANDATORY) -- Use managed identities for ALL Azure resources. Leverage Credentials feature for consolidation. MFA-compliant since October 2025.

  3. Monitor Job Execution -- Track Databricks Job run IDs from ADF output, log parameters for auditability, set up alerts for failures, leverage built-in lineage.

  4. Optimize Spark 3.3 (Data Flows) -- Enable AQE, use 4-8 partitions per core, broadcast joins for small dimensions, dynamic partition pruning.

Resources

Progressive Disclosure References

  • Databricks Job Examples: references/databricks-job-examples.md - Complete JSON for Job activity, linked services, pipeline, and Databricks workspace job definition
  • Connector Examples: references/connector-examples.md - Complete JSON for ServiceNow V2, PostgreSQL, Snowflake, Azure Storage MI, Mapping Data Flows, and Azure DevOps Server

Expand your agent's capabilities with these related and highly-rated skills.

JosiahSiegel/claude-plugin-marketplace

opentofu-guide

Comprehensive OpenTofu expertise including migration from Terraform, state encryption, OpenTofu 1.10/1.11 features (OCI registry, native S3 locking, ephemeral resources, enabled meta-argument), and CI/CD integration. Covers when to use OpenTofu vs Terraform with decision matrix.

22 3
Explore
JosiahSiegel/claude-plugin-marketplace

terraform-tasks

Specialized Terraform task execution skill for autonomous infrastructure operations. Handles code generation, debugging, version management (1.10-1.14+), security scanning, and architecture design across all providers (AWS 6.0, AzureRM 4.x, GCP) and platforms. Covers ephemeral values, Terraform Stacks, policy-as-code, and 2025 best practices.

22 3
Explore
JosiahSiegel/claude-plugin-marketplace

shellcheck-cicd-2025

ShellCheck validation as non-negotiable 2025 workflow practice

22 3
Explore
JosiahSiegel/claude-plugin-marketplace

bash-master

Expert bash/shell scripting system across ALL platforms. PROACTIVELY activate for: (1) ANY bash/shell script task, (2) System automation, (3) DevOps/CI/CD scripts, (4) Build/deployment automation, (5) Script review/debugging, (6) Converting commands to scripts. Provides: Google Shell Style Guide compliance, ShellCheck validation, cross-platform compatibility (Linux/macOS/Windows/containers), POSIX compliance, security hardening, error handling, performance optimization, testing with BATS, and production-ready patterns. Ensures professional-grade, secure, portable scripts every time.

22 3
Explore
JosiahSiegel/claude-plugin-marketplace

process-substitution-fifos

Process substitution, named pipes (FIFOs), and advanced IPC patterns for efficient bash data streaming (2025)

22 3
Explore
JosiahSiegel/claude-plugin-marketplace

modern-automation-patterns

Modern DevOps and CI/CD automation patterns with containers and cloud (2025)

22 3
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results