Agent skill
file-processing
Data file processing utilities for CSV, JSON, and text files. Provides helpers for reading, transforming, and validating structured data.
Install this agent skill to your Project
npx add-skill https://github.com/baidu-baige/LoongFlow/tree/main/.claude/skills/file-processing
SKILL.md
File Processing Skill
This skill provides utilities and guidance for building robust file processing applications.
Purpose
Use this skill when your task involves:
- Reading and parsing CSV, JSON, or text files
- Data validation and cleaning
- File format conversions
- Batch processing of multiple files
- Generating reports from data files
Key Capabilities
1. Data Reading
- CSV parsing with header detection
- JSON file handling (single object or array)
- Text file processing line-by-line
- Error handling for malformed files
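The reading capabilities above could be sketched roughly as follows. The function names `load_json_records` and `iter_text_lines` are hypothetical helpers chosen for illustration, not part of the skill itself, and raising `ValueError` for malformed input is one possible convention among several.

```python
import json
from pathlib import Path


def load_json_records(path):
    """Return the file's contents as a list of records.

    A top-level object becomes a one-element list; a top-level array is
    returned unchanged. Malformed JSON or any other top-level type raises
    ValueError with the file name in the message.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"{path}: invalid JSON ({exc.msg} at line {exc.lineno})"
        ) from exc
    if isinstance(data, dict):
        return [data]
    if isinstance(data, list):
        return data
    raise ValueError(
        f"{path}: expected object or array, got {type(data).__name__}"
    )


def iter_text_lines(path):
    """Yield stripped, non-empty lines one at a time (file is not loaded whole)."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield line
```

Normalizing both JSON shapes to a list lets the rest of a pipeline iterate over records without checking the shape again.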
2. Data Validation
- Check for required fields
- Validate data types
- Handle missing values
- Report data quality issues
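A minimal sketch of the validation checks listed above, assuming rows are dictionaries such as those produced by `csv.DictReader`. The name `validate_rows` and the `(row_index, field, problem)` tuple format are illustrative assumptions, not an established interface.

```python
def validate_rows(rows, required, types=None):
    """Return a list of (row_index, field, problem) tuples; empty means clean.

    required: field names that must be present and non-blank.
    types:    optional mapping of field name -> callable (e.g. int, float)
              used to check that a value can be converted.
    """
    types = types or {}
    issues = []
    for i, row in enumerate(rows):
        # Required-field and missing-value check.
        for field in required:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                issues.append((i, field, "missing"))
        # Type check: only applied to values that are actually present.
        for field, caster in types.items():
            value = row.get(field)
            if value is None or str(value).strip() == "":
                continue
            try:
                caster(value)
            except (TypeError, ValueError):
                issues.append((i, field, f"not a valid {caster.__name__}"))
    return issues
```

Collecting every issue instead of stopping at the first one makes it possible to report data quality for the whole file in a single pass.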
3. Data Transformation
- Filter rows based on conditions
- Calculate statistics (sum, avg, count)
- Format conversions
- Data aggregation
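The transformation steps above might look like the sketch below. The helper names (`filter_rows`, `column_stats`, `sum_by`) are hypothetical, and silently skipping values that cannot be parsed as numbers is an assumption; a stricter pipeline may prefer to report them.

```python
from collections import defaultdict


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def column_stats(rows, column):
    """Count, sum and average of the values in `column` that parse as float."""
    values = []
    for row in rows:
        try:
            values.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value: skipped (assumption)
    count = len(values)
    total = sum(values)
    return {"count": count, "sum": total, "avg": total / count if count else None}


def sum_by(rows, key, column):
    """Aggregate: sum of `column` grouped by the value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        try:
            group, value = row[key], float(row[column])
        except (KeyError, TypeError, ValueError):
            continue
        totals[group] += value
    return dict(totals)
```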
4. Output Generation
- Write processed data to new files
- Generate summary reports
- Create multiple output formats
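As one way to cover the output steps above, the sketch below writes the same rows as CSV and JSON and adds a short text summary. The function name `write_outputs`, the file naming scheme, and the summary layout are all assumptions made for illustration; it also assumes every row shares the keys of the first row.

```python
import csv
import json
from pathlib import Path


def write_outputs(rows, out_dir, stem="result"):
    """Write rows as CSV and JSON plus a text summary; return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fieldnames = list(rows[0].keys()) if rows else []

    # CSV: newline="" lets the csv module control line endings.
    csv_path = out_dir / f"{stem}.csv"
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if fieldnames:
            writer.writeheader()
        writer.writerows(rows)

    # JSON: the same records as an array.
    json_path = out_dir / f"{stem}.json"
    json_path.write_text(
        json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8"
    )

    # Summary report: row count and column names.
    summary_path = out_dir / f"{stem}_summary.txt"
    summary_path.write_text(
        f"rows: {len(rows)}\ncolumns: {', '.join(fieldnames)}\n", encoding="utf-8"
    )
    return {"csv": csv_path, "json": json_path, "summary": summary_path}
```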
Best Practices
Project Structure for File Processing
project/
├── main.py # Entry point with CLI
├── file_reader.py # File I/O operations
├── data_processor.py # Core processing logic
├── validator.py # Data validation
├── config.py # Configuration constants
└── utils.py # Helper functions
Error Handling Pattern
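import os

def read_file_safely(filepath):
    """Read a text file; return its contents, or None on failure"""
    try:
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"File not found: {filepath}")
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.read()
    except (OSError, UnicodeDecodeError) as e:
        # OSError covers missing files and permission problems;
        # UnicodeDecodeError covers files that are not valid UTF-8
        print(f"Error reading file: {e}")
        return None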
CSV Processing Template
import csv

def process_csv(input_file, output_file):
    """Process CSV with header detection"""
    # newline='' lets the csv module handle line endings itself
    with open(input_file, 'r', encoding='utf-8', newline='') as f:
        reader = csv.DictReader(f)  # first row is used as the header
        processed = []
        for row in reader:
            # Transform each row (transform_row is a placeholder for your own logic)
            processed_row = transform_row(row)
            processed.append(processed_row)
    # Write results
    with open(output_file, 'w', encoding='utf-8', newline='') as f:
        if processed:
            writer = csv.DictWriter(f, fieldnames=list(processed[0].keys()))
            writer.writeheader()
            writer.writerows(processed)
JSON Processing Template
import json

def process_json(input_file, output_file):
    """Process JSON data"""
    with open(input_file, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # Process data (handle both list and dict);
    # process_data is a placeholder for your own logic
    processed = process_data(data)
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(processed, f, indent=2, ensure_ascii=False)
Common Patterns
1. CLI with Argument Parsing
import argparse

def main():
    parser = argparse.ArgumentParser(description='File Processor')
    parser.add_argument('input', help='Input file path')
    parser.add_argument('output', help='Output file path')
    parser.add_argument('--format', choices=['csv', 'json'], default='csv')
    args = parser.parse_args()
    # process_file is a placeholder for your own dispatch function
    process_file(args.input, args.output, args.format)

if __name__ == '__main__':
    main()
2. Batch Processing
import glob
import os

def process_directory(input_dir, output_dir, pattern='*.csv'):
    """Process all matching files in directory"""
    os.makedirs(output_dir, exist_ok=True)  # create the output directory if needed
    files = sorted(glob.glob(os.path.join(input_dir, pattern)))
    for filepath in files:
        filename = os.path.basename(filepath)
        output_path = os.path.join(output_dir, f"processed_{filename}")
        # process_file is a placeholder for your own per-file function
        process_file(filepath, output_path)
3. Progress Reporting
def process_with_progress(items):
    """Process items with progress feedback"""
    total = len(items)
    for i, item in enumerate(items, 1):
        process_item(item)
        print(f"Progress: {i}/{total} ({i*100//total}%)", end='\r')
    print()  # New line when complete
Tools Available
When implementing file processing tasks, you have access to:
- Read - Read file contents
- Write - Create new files
- Edit - Modify existing files
- Glob - Find files by pattern
- Bash - Run shell commands (e.g., wc -l, head)
Testing Tips
Always test your file processor with:
- Empty files - Should handle gracefully
- Malformed data - CSV with wrong column count, invalid JSON
- Missing files - Should provide clear error messages
- Large files - Consider memory usage
- Special characters - Unicode, newlines in CSV fields
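The cases above can be turned into a small test suite. The sketch below uses the standard `unittest` module and a hypothetical function under test, `count_csv_rows`; substitute your own processor. The large-file case is left out because it depends on the data sizes you expect.

```python
import csv
import json
import tempfile
import unittest
from pathlib import Path


def count_csv_rows(path):
    """Hypothetical function under test: number of data rows in a CSV file."""
    with open(path, "r", encoding="utf-8", newline="") as f:
        return sum(1 for _ in csv.DictReader(f))


class EdgeCaseTests(unittest.TestCase):
    def setUp(self):
        # Each test gets a fresh temporary directory that is removed afterwards.
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        self.dir = Path(tmp.name)

    def write(self, name, text):
        path = self.dir / name
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        return path

    def test_empty_file(self):
        self.assertEqual(count_csv_rows(self.write("empty.csv", "")), 0)

    def test_missing_file(self):
        with self.assertRaises(FileNotFoundError):
            count_csv_rows(self.dir / "does_not_exist.csv")

    def test_newline_inside_quoted_field(self):
        path = self.write("multi.csv", 'name,note\nAda,"line one\nline two"\n')
        self.assertEqual(count_csv_rows(path), 1)

    def test_unicode(self):
        path = self.write("unicode.csv", "name\nZoë\n名前\n")
        self.assertEqual(count_csv_rows(path), 2)

    def test_wrong_column_count(self):
        # DictReader fills short rows with None and collects extra cells
        # in a list stored under the key None.
        path = self.write("ragged.csv", "a,b\n1\n1,2,3\n")
        with open(path, "r", encoding="utf-8", newline="") as f:
            rows = list(csv.DictReader(f))
        self.assertIsNone(rows[0]["b"])
        self.assertEqual(rows[1][None], ["3"])

    def test_invalid_json(self):
        with self.assertRaises(json.JSONDecodeError):
            json.loads('{"a": ')
```

Saved as a module, the suite can be run with `python -m unittest <module name>`.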
Example Task Breakdown
Task: "Create a CSV analyzer that calculates statistics"
Suggested Steps:
- Read CSV file and detect headers
- Parse data into structured format
- Calculate statistics (count, sum, average) per column
- Generate summary report
- Write results to output file
Recommended Structure:
- csv_analyzer.py - Main program
- stats.py - Statistics calculations
- report_generator.py - Format output
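A single-file sketch of the suggested steps is shown below; in a real project the pieces would be split across the recommended files. It assumes the first row of the CSV is the header, treats any cell that parses as a float as numeric, and uses a plain-text report layout chosen only for illustration.

```python
import csv


def analyze_csv(path):
    """Return {column: {"count", "sum", "average"}} for columns with numeric cells."""
    stats = {}
    with open(path, "r", encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)  # assumes the first row is the header
        for row in reader:
            for column, raw in row.items():
                if column is None:
                    continue  # extra cells beyond the header are ignored
                try:
                    value = float(raw)
                except (TypeError, ValueError):
                    continue  # blank or non-numeric cell
                entry = stats.setdefault(column, {"count": 0, "sum": 0.0})
                entry["count"] += 1
                entry["sum"] += value
    for entry in stats.values():
        entry["average"] = entry["sum"] / entry["count"]
    return stats


def format_report(stats):
    """Render the statistics as one line of text per column."""
    if not stats:
        return "No numeric columns found.\n"
    lines = [
        f"{column}: count={s['count']} sum={s['sum']:.2f} average={s['average']:.2f}"
        for column, s in stats.items()
    ]
    return "\n".join(lines) + "\n"


def main(input_path, output_path):
    """Analyze input_path and write the summary report to output_path."""
    report = format_report(analyze_csv(input_path))
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(report)
```

Calling `main("data.csv", "report.txt")` (both file names are placeholders) would write one summary line per numeric column.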
References
For more complex tasks, consider:
- Python's csv module for CSV handling
- json module for JSON operations
- pathlib for cross-platform file paths
- pandas for advanced data processing (if allowed)