Agent skill
sap-hana-ml
SAP HANA Machine Learning Python Client (hana-ml) development skill. Use when building ML solutions on SAP HANA's in-database machine learning with the Python hana-ml library: PAL/APL algorithms, DataFrame operations, AutoML, model persistence, and visualization.
Keywords: hana-ml, SAP HANA, machine learning, PAL, APL, predictive analytics, HANA DataFrame, ConnectionContext, classification, regression, clustering, time series, ARIMA, gradient boosting, AutoML, SHAP, model storage
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/sap-hana-ml
Metadata
- version: 1.1.0
- last verified: 1764201600 (2025-11-27 UTC)
- package version: 2.22.241011
SKILL.md
SAP HANA ML Python Client (hana-ml)
Package Version: 2.22.241011
Last Verified: 2025-11-27
Installation & Setup
pip install hana-ml
Requirements: Python 3.8+, SAP HANA 2.0 SPS03+ or SAP HANA Cloud
Quick Start
Connection & DataFrame
from hana_ml import ConnectionContext
# Connect
conn = ConnectionContext(
    address='<hostname>',
    port=443,
    user='<username>',
    password='<password>',
    encrypt=True
)
# Create DataFrame
df = conn.table('MY_TABLE', schema='MY_SCHEMA')
print(f"Shape: {df.shape}")
df.head(10).collect()
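Hard-coded credentials like the placeholders above are fine for a demo but awkward in shared code. A minimal sketch, assuming hypothetical environment variable names (HANA_ADDRESS, HANA_PORT, HANA_USER and HANA_PASSWORD are illustrative and not defined by hana-ml), builds the same keyword arguments from the environment:

```python
import os


def connection_kwargs_from_env(env=None):
    """Build ConnectionContext keyword arguments from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    required = ("HANA_ADDRESS", "HANA_USER", "HANA_PASSWORD")
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "address": env["HANA_ADDRESS"],
        "port": int(env.get("HANA_PORT", "443")),  # 443 mirrors the Quick Start example
        "user": env["HANA_USER"],
        "password": env["HANA_PASSWORD"],
        "encrypt": True,
    }
```

The result can then be unpacked into the constructor, for example conn = ConnectionContext(**connection_kwargs_from_env()).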
PAL Classification
from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
# Train model
clf = UnifiedClassification(func='RandomDecisionTree')
clf.fit(train_df, features=['F1', 'F2', 'F3'], label='TARGET')
# Predict & evaluate
predictions = clf.predict(test_df, features=['F1', 'F2', 'F3'])
score = clf.score(test_df, features=['F1', 'F2', 'F3'], label='TARGET')
APL AutoML
from hana_ml.algorithms.apl.classification import AutoClassifier
# Automated classification
auto_clf = AutoClassifier()
auto_clf.fit(train_df, label='TARGET')
predictions = auto_clf.predict(test_df)
Model Persistence
from hana_ml.model_storage import ModelStorage
ms = ModelStorage(conn)
clf.name = 'MY_CLASSIFIER'
ms.save_model(model=clf, if_exists='replace')
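Saving is only half of persistence; a later session usually needs the model back. The sketch below wraps a save followed by a load, assuming the storage object exposes save_model and load_model the way ModelStorage does. The helper save_and_reload and the InMemoryStorage stand-in are hypothetical and exist only so the example runs without a HANA system.

```python
def save_and_reload(storage, model, name):
    """Save `model` under `name`, then load it back from the same storage.

    `storage` is expected to behave like hana_ml.model_storage.ModelStorage
    (save_model / load_model); this helper itself is hypothetical.
    """
    model.name = name  # the model is stored under its name attribute
    storage.save_model(model=model, if_exists="replace")
    return storage.load_model(name=name)


class InMemoryStorage:
    """Stand-in used only to exercise the helper without a database.

    Unlike a real model store, it hands back the very same Python object.
    """

    def __init__(self):
        self._models = {}

    def save_model(self, model, if_exists="upgrade"):
        if model.name in self._models and if_exists == "fail":
            raise ValueError(f"model {model.name!r} already exists")
        self._models[model.name] = model

    def load_model(self, name, version=None):
        return self._models[name]
```

With a real connection the call would look like restored = save_and_reload(ms, clf, 'MY_CLASSIFIER').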
Core Libraries
PAL (Predictive Analysis Library)
- 100+ algorithms executed in-database
- Categories: Classification, Regression, Clustering, Time Series, Preprocessing
- Key classes: UnifiedClassification, UnifiedRegression, KMeans, ARIMA
- See: references/PAL_ALGORITHMS.md for complete list
APL (Automated Predictive Library)
- AutoML capabilities with automatic feature engineering
- Key classes: AutoClassifier, AutoRegressor, GradientBoostingClassifier
- See: references/APL_ALGORITHMS.md for details
DataFrames
- Lazy evaluation: operations only build SQL; nothing executes until collect() is called
- In-database processing for optimal performance
- See: references/DATAFRAME_REFERENCE.md for complete API
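The lazy-evaluation idea can be illustrated without a HANA system. The toy class below is not hana-ml's implementation; it is a sketch over the standard-library sqlite3 module, with a hypothetical LazyTable class whose method names (filter, head, collect) and select_statement attribute loosely mirror the hana-ml DataFrame. Each call wraps the previous SQL in a new query, and only collect() sends anything to the database.

```python
import sqlite3


class LazyTable:
    """Toy lazy frame: each method returns a new object wrapping a SQL string."""

    def __init__(self, conn, select_statement):
        self.conn = conn
        self.select_statement = select_statement

    def filter(self, condition):
        # Nest the current query; nothing is executed yet.
        sql = f"SELECT * FROM ({self.select_statement}) WHERE {condition}"
        return LazyTable(self.conn, sql)

    def head(self, n):
        sql = f"SELECT * FROM ({self.select_statement}) LIMIT {int(n)}"
        return LazyTable(self.conn, sql)

    def collect(self):
        # Only here does any SQL reach the database.
        return self.conn.execute(self.select_statement).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, f1 REAL)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)", [(1, 0.5), (2, 1.5), (3, 2.5)])

lazy = LazyTable(conn, "SELECT * FROM my_table").filter("f1 > 1.0").head(1)
print(lazy.select_statement)  # inspect the generated SQL without running it
rows = lazy.collect()         # executes the nested query once
```

Printing the generated SQL before collecting is a cheap way to check what a chain of operations will actually run.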
Visualizers
- EDA plots, model explanations, metrics
- SHAP integration for model interpretability
- See: references/VISUALIZERS.md for 14 visualization modules
Common Patterns
Train-Test Split
from hana_ml.algorithms.pal.partition import train_test_val_split
train, test, val = train_test_val_split(
    data=df,
    training_percentage=0.7,
    testing_percentage=0.2,
    validation_percentage=0.1
)
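The three percentages in the call above are meant to partition the data, so it can help to sanity-check them before sending work to the database. A minimal local check, assuming the fractions should each lie in [0, 1] and sum to 1 (the helper check_split is hypothetical and not part of hana-ml; whether the library enforces this itself is not stated here):

```python
import math


def check_split(training_percentage, testing_percentage, validation_percentage):
    """Raise ValueError unless the three fractions lie in [0, 1] and sum to 1."""
    parts = (training_percentage, testing_percentage, validation_percentage)
    if any(p < 0 or p > 1 for p in parts):
        raise ValueError(f"fractions must lie in [0, 1], got {parts}")
    total = sum(parts)
    # Compare with a tolerance: 0.7 + 0.2 + 0.1 is not exactly 1.0 in floating point.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"fractions must sum to 1, got {total}")
    return parts
```

The tolerance matters because a plain equality test would reject the 0.7 / 0.2 / 0.1 split used above.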
Feature Importance
# APL models
importance = auto_clf.get_feature_importances()
# PAL models
from hana_ml.algorithms.pal.preprocessing import FeatureSelection
fs = FeatureSelection()
fs.fit(train_df, features=features, label='TARGET')
Pipeline
from hana_ml.algorithms.pal.pipeline import Pipeline
from hana_ml.algorithms.pal.preprocessing import Imputer, FeatureNormalizer
pipeline = Pipeline([
    ('imputer', Imputer(strategy='mean')),
    ('normalizer', FeatureNormalizer()),
    ('classifier', UnifiedClassification(func='RandomDecisionTree'))
])
Best Practices
- Use lazy evaluation: operations build SQL without executing it until collect()
- Leverage in-database processing: keep data in HANA for performance
- Use Unified interfaces: consistent APIs across algorithms
- Save models: use ModelStorage for persistence
- Explain predictions: use SHAP explainers for interpretability
- Monitor AutoML: use PipelineProgressStatusMonitor for long-running jobs
Bundled Resources
Reference Files
- references/DATAFRAME_REFERENCE.md (479 lines)
  - ConnectionContext API, DataFrame operations, SQL generation
- references/PAL_ALGORITHMS.md (869 lines)
  - Complete PAL algorithm reference (100+ algorithms)
  - Classification, Regression, Clustering, Time Series, Preprocessing
- references/APL_ALGORITHMS.md (534 lines)
  - AutoML capabilities, automated feature engineering
  - AutoClassifier, AutoRegressor, GradientBoosting classes
- references/VISUALIZERS.md (704 lines)
  - 14 visualization modules (EDA, SHAP, metrics, time series)
  - Plot types, configuration, export options
- references/SUPPORTING_MODULES.md (626 lines)
  - Model storage, spatial analytics, graph algorithms
  - Text mining, statistics, error handling
Error Handling
from hana_ml.ml_exceptions import Error
try:
    clf.fit(train_df, features=features, label='TARGET')
except Error as e:
    print(f"HANA ML Error: {e}")
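In longer scripts the same try/except tends to repeat around every fit call. One possible way to factor it out, sketched with a hypothetical helper fit_or_log (not a hana-ml function) that takes the exception class as a parameter so it runs without the library installed:

```python
import logging

logger = logging.getLogger("hana_ml_example")


def fit_or_log(model, error_type, *args, **kwargs):
    """Call model.fit(...); log and return False if `error_type` is raised."""
    try:
        model.fit(*args, **kwargs)
    except error_type as exc:
        logger.error("fit failed for %s: %s", type(model).__name__, exc)
        return False
    return True
```

With the names from the snippet above, a call would look like fit_or_log(clf, Error, train_df, features=features, label='TARGET'). Exceptions of any other type still propagate, which keeps unrelated bugs visible.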
Documentation
- Official Docs: https://help.sap.com/doc/1d0ebfe5e8dd44d09606814d83308d4b/2.0.07/en-US/hana_ml.html
- PyPI Package: https://pypi.org/project/hana-ml/