Varada, a provider of big data analytics software, has open-sourced its Workload Analyzer for Presto. The tool works with Presto, the query engine originally developed at Facebook, which today exists as two projects: PrestoDB and Trino (formerly PrestoSQL).
Data virtualization with the distributed query engine
Presto, a distributed SQL query engine, can query data from a variety of sources, including Kafka, MongoDB, PostgreSQL and MySQL. The engine is designed for very large data sets, up to the petabyte range, and processes data where it is stored, which matters in large clusters and cloud environments.
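To illustrate what querying several sources at once can look like: Presto addresses tables as catalog.schema.table, so a single SQL statement can join tables that live in different systems. The following Python sketch only assembles such a statement as a string. The catalog names (`mysql`, `mongodb`), the schemas, tables and columns are assumptions for illustration; the actual names depend on how a cluster's connectors are configured.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name (catalog.schema.table)."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs, schemas and tables; real names depend on the cluster.
orders = qualified("mysql", "shop", "orders")
events = qualified("mongodb", "tracking", "events")

# One statement that joins a MySQL table with a MongoDB collection.
federated_sql = (
    "SELECT o.customer_id, count(e.event_id) AS events "
    f"FROM {orders} AS o "
    f"JOIN {events} AS e ON o.customer_id = e.customer_id "
    "GROUP BY o.customer_id"
)

print(federated_sql)
```

Such a statement would be submitted through any Presto client; the engine then reads from both sources without the data first being copied into a common store.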
Monitor workloads in Presto clusters
According to Varada, the query engine and the accompanying tool for examining workloads in Presto clusters are currently used mainly in data-driven projects and enterprises. The distributed query engine can query raw, unmodeled data from various sources, so that an entire data lake can be examined with PrestoDB or Trino, the vendor says.
According to the vendor, DataOps teams can use Workload Analyzer for Presto to monitor their production pipelines, identify bottlenecks, determine resource requirements on an hourly or weekly basis, and define scaling rules. In production, workloads can then reportedly be allocated more precisely to the available cloud resources.
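What determining hourly resource requirements and deriving a scaling rule might look like can be sketched in a few lines of Python. This is not how Workload Analyzer works internally; the log format, the sample entries and the capacity figure of 200 CPU seconds per worker and hour are assumptions chosen purely for illustration.

```python
import math
from collections import defaultdict
from datetime import datetime

# Hypothetical query log entries: (query start time, CPU seconds consumed).
QUERY_LOG = [
    ("2021-03-01T09:05:00", 120.0),
    ("2021-03-01T09:40:00", 300.0),
    ("2021-03-01T10:15:00", 60.0),
]


def cpu_seconds_per_hour(log):
    """Sum the CPU seconds of all queries that started in the same hour."""
    totals = defaultdict(float)
    for started, cpu_seconds in log:
        hour = datetime.fromisoformat(started).replace(
            minute=0, second=0, microsecond=0
        )
        totals[hour] += cpu_seconds
    return dict(totals)


def workers_needed(cpu_seconds, capacity_per_worker=200.0):
    """Assumed scaling rule: enough workers to cover the load, at least one."""
    return max(1, math.ceil(cpu_seconds / capacity_per_worker))


hourly = cpu_seconds_per_hour(QUERY_LOG)
for hour, load in sorted(hourly.items()):
    print(hour.isoformat(), load, workers_needed(load))
```

Grouping by weekday or calendar week instead of by hour would give the weekly view mentioned above; the scaling rule itself stays the same.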
About Presto, now a trademark of the Linux Foundation
Facebook started Presto in 2012 and released it under the Apache license in 2013. Since 2019, the query engine has been hosted under the umbrella of the Linux Foundation, where it has its own foundation: alongside Facebook, Uber, Twitter and Alibaba were among the founding members of the Presto Foundation.
The newly open-sourced Workload Analyzer for Presto is available as a free download on GitHub. More information about the tool and its applications can be found in Varada's press release.