Agent skill

chuukese-language-processing

Specialized processing for Chuukese language text including tokenization, accent handling, cultural context preservation, and language-specific patterns. Use when working with Chuukese text, translation tasks, or when building language models for this Micronesian language.

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/data/chuukese-language-processing

SKILL.md

Chuukese Language Processing

Overview

A specialized skill for processing Chuukese language text, focusing on proper handling of accented characters, cultural context preservation, and language-specific linguistic patterns. Essential for building accurate translation systems and language models for this low-resource Micronesian language.

Capabilities

  • Accent Character Normalization: Proper handling of Chuukese diacritical marks (á, é, í, ó, ú, ā, ē, ī, ō, ū)
  • Cultural Context Preservation: Maintain traditional concepts and cultural nuances
  • Phonetic Pattern Recognition: Understanding of Chuukese sound patterns and phonology
  • Morphological Analysis: Basic word formation and grammatical structure recognition
  • Dictionary Integration: Seamless integration with Chuukese-English dictionaries
  • Translation Quality Assessment: Validation of translation accuracy and cultural appropriateness

Core Components

1. Chuukese Text Normalization

python
import re
import unicodedata

class ChuukeseTextProcessor:
    def __init__(self):
        self.accent_patterns = {
            'acute': ['á', 'é', 'í', 'ó', 'ú'],
            'macron': ['ā', 'ē', 'ī', 'ō', 'ū'],
            'base': ['a', 'e', 'i', 'o', 'u']
        }
        
        self.normalize_map = {
            'á': 'á', 'à': 'á', 'â': 'á',  # Standardize to acute
            'ā': 'ā', 'ă': 'ā',           # Standardize to macron
            'é': 'é', 'è': 'é', 'ê': 'é',
            'ē': 'ē', 'ĕ': 'ē',
            'í': 'í', 'ì': 'í', 'î': 'í',
            'ī': 'ī', 'ĭ': 'ī',
            'ó': 'ó', 'ò': 'ó', 'ô': 'ó',
            'ō': 'ō', 'ŏ': 'ō',
            'ú': 'ú', 'ù': 'ú', 'û': 'ú',
            'ū': 'ū', 'ŭ': 'ū'
        }
    
    def normalize_chuukese_text(self, text):
        """Normalize Chuukese text with proper accent handling"""
        # First apply Unicode normalization
        normalized = unicodedata.normalize('NFC', text)
        
        # Then apply Chuukese-specific normalization
        for variant, standard in self.normalize_map.items():
            normalized = normalized.replace(variant, standard)
        
        return normalized

2. Cultural Context Recognition

python
class ChuukeseCulturalProcessor:
    def __init__(self):
        self.cultural_concepts = {
            'family_terms': ['semei', 'jinej', 'seme', 'jina', 'pwis', 'pwisen'],
            'traditional_items': ['emon', 'uruf', 'nous', 'ruk', 'chomw'],
            'respect_terms': ['oupwe', 'kose mochen', 'tipeew', 'sokkun'],
            'time_concepts': ['ranem', 'ekis', 'ngang', 'pwong'],
            'spatial_terms': ['met', 'ese', 'won', 'ifa']
        }
    
    def detect_cultural_context(self, text):
        """Detect cultural context indicators in Chuukese text"""
        context = {
            'cultural_density': 0,
            'respect_level': 'casual',
            'traditional_concepts': [],
            'formality_indicators': []
        }
        
        for category, terms in self.cultural_concepts.items():
            found_terms = [term for term in terms if term in text.lower()]
            if found_terms:
                context['traditional_concepts'].extend(found_terms)
                context['cultural_density'] += len(found_terms)
        
        return context

Usage Examples

Basic Text Processing

python
# Initialize processor
processor = ChuukeseTextProcessor()

# Process Chuukese text
text = "Kopwe pwan chomong ngonuk ekkewe chon Chuuk"
normalized = processor.normalize_chuukese_text(text)
words = processor.extract_chuukese_words(text)

print(f"Normalized: {normalized}")
print(f"Words: {words}")

Cultural Context Analysis

python
# Analyze cultural context
cultural_processor = ChuukeseCulturalProcessor()
context = cultural_processor.detect_cultural_context(text)

print(f"Cultural density: {context['cultural_density']}")
print(f"Traditional concepts: {context['traditional_concepts']}")

Best Practices

Text Processing

  1. Always normalize: Apply Unicode and Chuukese-specific normalization
  2. Preserve accents: Maintain diacritical marks for accurate meaning
  3. Context awareness: Consider cultural and social context
  4. Quality validation: Verify processing with native speaker input

Cultural Sensitivity

  1. Respect traditions: Honor traditional concepts and practices
  2. Appropriate register: Use proper formality levels
  3. Community involvement: Engage with Chuukese language community
  4. Continuous learning: Stay updated with language evolution

Dependencies

  • unicodedata: Unicode normalization
  • re: Regular expression pattern matching
  • difflib: Fuzzy string matching
  • csv: Dictionary file processing

Didn't find tool you were looking for?

Be as detailed as possible for better results