Top AI tools for Data Engineers
-
Atlan Find, Trust, and Govern AI-Ready Data
Atlan is a data catalog and governance platform designed to provide a single source of truth for all data assets. It enables efficient data discovery, active data governance, and collaboration across teams.
- Contact for Pricing
-
Aerospike The massively scalable, real-time database for infinite scale, speed, and savings.
Aerospike is a distributed NoSQL database designed for real-time applications, offering millisecond latency, massive scalability, and multi-model capabilities including vector search.
- Contact for Pricing
-
MOSTLY AI Data Access and Data Insights for Everyone
MOSTLY AI is an enterprise-grade synthetic data generation platform that leverages GenAI to create privacy-safe, high-quality synthetic datasets for data sharing, AI/ML development, and analytics.
- Usage Based
- From 3$
-
Linx Integrate systems. Automate work. Anywhere. Fast.
Linx is a low-code iPaaS platform designed for rapid integration and automation of business systems. It offers a flexible and scalable solution for building custom APIs, data migrations, and process automation.
- Paid
- From 599$
-
Weaviate The AI-native database for a new generation of software
Weaviate is an open-source vector database that enables developers to build AI-native applications with improved search capabilities, reduced hallucination, and enhanced data security. It supports hybrid search, RAG, and generative feedback loops.
- Freemium
- From 25$
-
Osmos Streamline Your Data Ingestion with Gen AI
Osmos is an AI-powered platform designed to streamline data ingestion by automating the cleaning, mapping, and transformation of data from various sources into operational systems.
- Freemium
- From 500$
-
Lilac Better data, better AI - Search, quantify and edit data for LLMs
Lilac is a powerful data platform that enables efficient dataset exploration, quality control, and management for Large Language Models (LLMs). It offers fast dataset computations and advanced clustering capabilities for AI data processing.
- Contact for Pricing
-
Minexa.ai Turn any web page into structured data with AI-powered extraction
Minexa.ai is an all-in-one AI-powered web scraping platform that transforms web pages into structured data without complex coding or maintenance, offering universal data extraction at scale.
- Freemium
- From 75$
-
Deus.ai Empowering organisations to advance value creation with data and artificial intelligence.
Deus.ai partners with organizations to leverage data and AI for transformative value creation across business, people, and society.
- Contact for Pricing
-
Apache Samza A distributed stream processing framework
Apache Samza is a distributed stream processing framework that allows you to build stateful applications for real-time data processing from multiple sources.
- Free
-
Hopsworks The AI Lakehouse for Your Data
Hopsworks is an MLOps platform and feature store that enables organizations to build, deploy, and manage AI systems with reproducibility, consistency, and scalability. It offers a unified solution for GenAI, real-time applications, and traditional machine learning.
- Freemium
-
Neurelo AI-powered Data Access Platform for Modern App Development
Neurelo is an AI-powered platform that simplifies data integration and accelerates app development by automating database tasks and API creation.
- Freemium
- From 25$
-
Codeanywhere The AI Cloud IDE
Codeanywhere is an AI-powered cloud IDE designed to accelerate development by providing instant, preconfigured environments, AI code assistance, and seamless collaboration features. Start coding faster with AI-driven code completion and problem-solving capabilities.
- Freemium
- From 10$
-
Ocient Hyperscale Data Warehouse Real-time analysis of complex, hyperscale datasets with 90% reduced energy consumption
Ocient is a hyperscale data warehouse platform that runs real-time analytics and OLAP workloads with integrated machine learning capabilities, designed for maximum performance while reducing costs and energy consumption.
- Contact for Pricing
-
Aggregations.io Real Time Metrics + Automated Documentation using your existing data pipeline.
Aggregations.io transforms existing event data into real-time metrics and automatically generates searchable event schema documentation. It enhances observability and data-driven product features without requiring SDKs.
- Free Trial
- From 60$
-
Evidently AI Collaborative AI observability platform for evaluating, testing, and monitoring AI-powered products
Evidently AI is a comprehensive AI observability platform that helps teams evaluate, test, and monitor LLM and ML models in production, offering data drift detection, quality assessment, and performance monitoring capabilities.
- Freemium
- From 50$
-
Apache Spark Unified Engine for Large-Scale Data Analytics
Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
- Free
-
Kestra Powerful orchestration. Simplified workflows.
Kestra is an open-source orchestration platform designed to unify workflows for all engineers, simplifying development and management through a declarative, language-agnostic approach with both UI and code-based interfaces.
- Freemium
-
CocoIndex Extract, Transform, Index Data. Easy and Fresh.
CocoIndex is an open-source engine specializing in data indexing with support for custom transformation logic and incremental updates.
- Free
-
Infoworks Simplify, Automate, Accelerate Cloud Data Operations
Infoworks provides a unified platform that automates end-to-end cloud data migration and data operations, accelerating the delivery of AI, ML, and analytics use cases.
- Freemium
-
Modal Serverless Cloud for AI, ML, and Data Applications
Modal provides high-performance, serverless cloud infrastructure optimized for AI, ML, and data applications. It offers rapid container starts, seamless autoscaling, and flexible environments for developers.
- Usage Based
-
InfinyOn Transform your data into real-time intelligence.
InfinyOn is a next-generation streaming platform that unifies data flows, accelerates processing with Rust and WebAssembly, and empowers developers to build responsive, intelligent applications at any scale.
- Usage Based
-
Airbyte Your data. AI actionable anywhere.
Airbyte is an open-source data integration platform that helps organizations consolidate and move data across multi-cloud environments, supporting 550+ data sources and destinations with AI-ready capabilities.
- Freemium
- From 10$
-
Grafbase The New Standard to Scale GraphQL Federation
Grafbase is an enterprise-grade GraphQL Federation platform offering self-hosted deployment, high performance, and AI agent integration for unified API management.
- Free Trial
- From 2999$
-
LakeSail Big Data Processing for the AI Era
LakeSail's Sail is an open-source computation framework that unifies batch processing, stream processing, and compute-intensive AI workloads. LakeSail claims 4x processing speed and 94% lower hardware costs compared to Apache Spark.
- Freemium
-
Chadview Real-time ChatGPT-powered meetings assistant for job interviews
Chadview is a browser extension that provides real-time AI assistance during job interviews on Zoom, Google Meet, and Microsoft Teams by listening to conversations and generating instant answers to technical questions.
- Freemium
- From 20$
-
VerbaGPT Talk to your data without compromising privacy
VerbaGPT is a data analytics platform that enables natural language interactions with CSV and SQL data while maintaining data privacy. It allows users to perform complex queries, create visualizations, and build ML models without exposing sensitive data to external LLMs.
- Contact for Pricing
-
Navera IaC Engine Data Infrastructure as Code (made simple)
Navera IaC Engine simplifies and optimizes Infrastructure as Code (IaC) practices for AI data infrastructure, offering abstracted deployments, centralized visibility, and automated workflows.
- Freemium
-
Flexor Transform Unstructured Data Into Valuable Insights
Flexor is a SQL-first platform that transforms textual data into structured, LLM-ready formats. It simplifies unstructured data preparation, ensuring accuracy, scalability, and governance.
- Contact for Pricing
-
Telmai Accelerate your AI with trusted data.
Telmai offers a comprehensive data observability platform for data lakes and lakehouses, utilizing AI to ensure data quality and reliability without sampling or increasing cloud costs.
- Contact for Pricing
-
Wherobots The Spatial Intelligence Cloud for Planetary-Scale Analytics
Wherobots is a comprehensive spatial data platform that combines ETL, analytics, and AI capabilities for processing geospatial data at scale, created by the original developers of Apache Sedona.
- Freemium
-
Doflow A visual workflow builder for Google Cloud Workflows powered by AI
Doflow is an AI-powered visual workflow builder that enables users to combine Google Cloud services and APIs to create reliable applications, process automation, and data/ML pipelines with ease.
- Freemium
- From 10$
-
Securiti Enabling Safe Use of Data & AI
Securiti provides a unified Data Command Center to enable safe use of data and AI, offering intelligence, controls, and orchestration across hybrid multicloud environments.
- Contact for Pricing
-
Redis The fast memory layer for real-time data and AI applications.
Redis is a high-performance in-memory data structure store used as a database, cache, message broker, and vector database to accelerate application performance and enable real-time AI.
- Freemium
- From 5$
-
Isomeric Transform messy, unstructured text into machine readable JSON
Isomeric is an AI-powered data extraction platform that converts unstructured text into structured JSON format, enabling efficient data gathering from websites, documents, and various text sources.
- Paid
- From 149$
-
Hevo Data Automate Data Replication with No-Code ELT
Hevo Data is a fully managed, no-code ELT platform designed to automate data replication from over 150 sources into data warehouses, preparing data for analytics and AI.
- Freemium
- From 599$
-
Cloud Data Connect AI-Powered Data Quality, Matching, Standardization, and Enrichment
Cloud Data Connect by Interzoid offers AI-driven data matching, standardization, and enrichment for databases and files. Enhance data quality and usability with a few clicks.
- Usage Based
-
Tecton An easier and faster way to productionize data for AI
Tecton is an AI data platform that helps teams turn structured and unstructured data into AI context for better models by automating data pipelines. The company claims an 80% reduction in time to production.
- Contact for Pricing
-
TakeShape Data-First Agent Builder & Runtime
TakeShape is a data-first agent builder and runtime platform that connects, transforms, and activates fragmented enterprise data. It enables businesses to create AI-driven agents for immediate business impact.
- Free Trial
- From 500$
-
FiftyOne A refinery for data and models to build production-ready visual AI applications
FiftyOne is an enterprise-grade platform for managing, visualizing, and refining visual AI datasets and models, enabling efficient development of computer vision applications at scale.
- Freemium
-
Hex Bring everyone together with data
Hex is an AI-powered collaborative workspace designed to streamline data workflows by integrating queries, scripts, and interactive reporting.
- Freemium
- From 36$
- API
-
Zuar The low-code solution for serving fast, compelling analytics to your customers.
Zuar provides a low-code platform featuring AI integration for building data portals and automating data pipelines, enabling businesses to deliver unified and actionable analytics.
- Free Trial
-
DataCebo SDV The world's first and most powerful system of generative models for tabular data
DataCebo SDV is an enterprise-grade synthetic data generation platform, founded at MIT, that enables organizations to create and manage their own generative AI applications for tabular data.
- Freemium
-
Cleanlab Reliable AI Management Platform
Cleanlab is a management platform for reliable AI, enabling businesses to detect, observe, and resolve AI failures in real time, ensuring trust in RAG and agentic AI systems.
- Free Trial
-
pathway.com Powering your RAG and ETL at scale with Live Data
Pathway is a scalable data processing framework for building and powering AI/ML applications with live data and real-time pipelines. It offers data ingestion from 300+ sources and supports real-time features, live vector search, and anomaly alerts.
- Freemium
- From 499$
-
QuantHub Making Learning Data Skills Easy
QuantHub is an AI-powered platform that provides microlearning training for digital literacy and data fluency. It offers personalized learning journeys to help individuals and organizations upskill in data.
- Free Trial
-
Practicus AI The Unified Platform for Generative AI and Data Intelligence
Practicus AI is a comprehensive platform for building and deploying generative AI models and data intelligence solutions. It offers a unified environment for data science, analytics, and observability, with deployment options across cloud, on-premises, and air-gapped networks.
- Freemium
-
Arkle Generative AI for Powering Enterprises
Arkle is a generative AI platform designed to empower enterprises with automated workflows, no-code apps, intelligent AI assistants, and actionable insights.
- Free Trial
- From 49$
-
MovingLake The #1 Real-Time API Integration Company
MovingLake offers real-time API integration solutions for enterprises, connecting various systems like ERPs, CRMs, and databases to ensure up-to-date data flow across the organization.
- Paid
-
Zparse No-Code AI Platform for Effortless Data Transformation
Zparse is an AI-powered platform that enables businesses to extract, clean, transform, and move data seamlessly without coding or technical expertise.
- Freemium