Onehouse is a cloud-native managed data lakehouse service, built on open-source technologies, that simplifies building and managing data lakes. It provides a self-managing data layer that automates data ingestion, management, and optimization for faster processing.
Onehouse combines aspects of data warehouses and data lakes, enabling companies to store and query large volumes of structured and unstructured data cost-effectively. It supports incremental data processing, which allows minute-level data freshness, eliminating the need for slow batch ETL processes. Onehouse integrates with popular data engines like Apache Spark, Trino, and Presto, allowing users to query data using their preferred tools.
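The difference between incremental processing and batch ETL can be sketched in a few lines. The example below is a minimal illustration, not Onehouse's implementation: the `Record` type, the `incremental_pull` and `apply_upserts` helpers, and the integer commit times are all hypothetical names introduced here. Each run reads only the records committed since the last checkpoint and merges them into the table, rather than reprocessing the full dataset.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    value: int
    commit_time: int  # hypothetical, monotonically increasing commit timestamp


def incremental_pull(log, last_commit):
    """Return the records committed after last_commit, plus the new checkpoint."""
    new_records = [r for r in log if r.commit_time > last_commit]
    checkpoint = max((r.commit_time for r in new_records), default=last_commit)
    return new_records, checkpoint


def apply_upserts(table, records):
    """Merge records into the table by key, applying them in commit order."""
    for r in sorted(records, key=lambda rec: rec.commit_time):
        table[r.key] = r.value
    return table


# A change log with three commits; the table already reflects commit 1.
log = [Record("a", 1, 1), Record("b", 2, 2), Record("a", 3, 3)]
table = {"a": 1}

new_records, checkpoint = incremental_pull(log, last_commit=1)
table = apply_upserts(table, new_records)
```

Because only two of the three records are read on this run, the work done scales with the volume of new changes rather than with the total table size, which is what makes short refresh intervals practical.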
The platform provides a central storage solution with the flexibility to access data in different table formats, including Apache Hudi, Apache Iceberg, and Delta Lake, through the open-source Apache XTable project. Onehouse's storage platform features automated data management capabilities such as file sizing, partitioning, clustering, catalog syncing, indexing, and caching. It also works with query engines and platforms such as Snowflake, Databricks, Redshift, BigQuery, EMR, Spark, Presto, and Trino.
In August 2024, the company launched a vector embedding generator as part of its managed ELT cloud service to automate embedding pipelines for GenAI applications using foundation models from OpenAI and Voyage AI. The tool continuously delivers data from various sources to AI models, which generate embeddings that are stored in optimized tables on the user's data lakehouse. It also integrates with vector databases to enable high-scale, low-latency serving for real-time use cases. According to the company, this reduces the time and cost of building vector embeddings while handling update management, late-arriving data, and concurrency control.
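One way to picture the update management and late-arriving data handling mentioned above is an upsert keyed on document ID that only accepts newer versions. The sketch below is an assumption-laden illustration, not the Onehouse product or any vendor's API: `fake_embed` is a hypothetical stand-in for a call to a real embedding model, and `upsert_embedding` and the `event_time` field are names introduced here.

```python
import hashlib


def fake_embed(text, dim=4):
    """Hypothetical stand-in for an embedding model: a deterministic toy vector."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


def upsert_embedding(table, doc_id, text, event_time):
    """Store an embedding unless the table already holds a newer or equal version.

    Returns True if the table was updated, False if the record was stale.
    """
    current = table.get(doc_id)
    if current is not None and current["event_time"] >= event_time:
        return False  # late-arriving data: keep the newer embedding
    table[doc_id] = {"event_time": event_time, "embedding": fake_embed(text)}
    return True


embeddings = {}
first = upsert_embedding(embeddings, "doc-1", "revised text", event_time=2)
late = upsert_embedding(embeddings, "doc-1", "original text", event_time=1)
```

Here the second call arrives out of order and is rejected, so the stored vector still corresponds to the most recent version of the document. A production pipeline would also need concurrency control around the read-then-write step, which this single-threaded sketch leaves out.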
Key customers and partnerships
As of October 2024, Onehouse supported data access on Google Cloud Platform and Amazon Web Services.