Meet Varada's Workload Analyzer
Zero data-ops approach to interactive queries on your data lake.
Varada’s Workload Analyzer is a free, easy-to-use tool that offers deep, actionable insights and unprecedented visibility into analytics workloads running on Presto:
- Resource utilization – get workload-level and cluster-level analysis of CPU, RAM and I/O utilization patterns.
- Identify heavy spenders – learn which users and tables take up the most resources in terms of CPU and I/O so you can scale accordingly and eliminate bottlenecks.
- Deep insights on workload characteristics – Varada analyzes how your workloads use Presto operators such as aggregations, JOIN and LIKE, and which data sources they access. You’ll also see the impact this usage has on overall performance so you can improve where possible.
For example, you’ll discover where to apply JOIN reordering and where to change the JOIN distribution type. A deep understanding of query selectivity will also enable you to apply acceleration technologies.
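As an illustration of acting on a join finding, here is a minimal Python sketch that builds the `SET SESSION` statements for the two join-related Presto session properties, `join_distribution_type` and `join_reordering_strategy`. The helper name `join_session_statements` is hypothetical and not part of the Workload Analyzer; the accepted values are assumed from the Presto documentation and should be checked against your Presto version.

```python
# Assumed values for the two Presto session properties; verify for your version.
VALID_DISTRIBUTION_TYPES = {"AUTOMATIC", "PARTITIONED", "BROADCAST"}
VALID_REORDERING_STRATEGIES = {"AUTOMATIC", "ELIMINATE_CROSS_JOINS", "NONE"}


def join_session_statements(distribution_type="AUTOMATIC",
                            reordering_strategy="AUTOMATIC"):
    """Return SET SESSION statements for the join-related properties.

    Hypothetical helper: it only builds SQL strings and does not
    connect to a cluster.
    """
    if distribution_type not in VALID_DISTRIBUTION_TYPES:
        raise ValueError(f"unknown distribution type: {distribution_type}")
    if reordering_strategy not in VALID_REORDERING_STRATEGIES:
        raise ValueError(f"unknown reordering strategy: {reordering_strategy}")
    return [
        f"SET SESSION join_distribution_type = '{distribution_type}'",
        f"SET SESSION join_reordering_strategy = '{reordering_strategy}'",
    ]


if __name__ == "__main__":
    for statement in join_session_statements("BROADCAST", "AUTOMATIC"):
        print(statement)
```

The statements can be run in any Presto client session before the query you want to tune, which keeps the change scoped to that session rather than the whole cluster.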
Workload Analyzer Open Resources
Presto Workload Analysis Sample Report
- Cluster Overview: Facts and figures on the period the Analyzer ran and key usage metrics (data scanned, number of queries).
- Key Findings: Deep dive into how queries are actually running on the cluster. Analysis includes query elapsed time, distribution by user, distribution of Presto operators, resource consumption, selectivity analysis, join analysis, and more.
- Identify the top users consuming most of the CPU on your cluster.
- Identify which query operators or types of queries consume most of your cluster resources (filtering, aggregation, joins).
- Identify which users require further education on improving and optimizing queries, as well as on applying performance best practices.
- Identify cohorts of users and apply resource group policies to improve overall workload management.
- Optimize how you model chargeback pricing based on actual user consumption levels and patterns.
- Easily identify and optimize the top X tables that consume most of your resources but may account for only a small fraction of the overall data lake size.
- Apply additional use-case-specific optimizations, such as caching, indexing and third-party aggregation acceleration tools.
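To make the "top users" idea concrete, here is a minimal Python sketch that ranks users by total CPU time and reports each user's share of the total. It assumes query records are available as dictionaries with hypothetical `user` and `cpu_seconds` fields; this is not the Workload Analyzer's actual data model or method.

```python
from collections import defaultdict


def top_cpu_users(queries, n=3):
    """Return up to n (user, cpu_seconds, share_of_total) tuples.

    `queries` is an iterable of dicts with hypothetical keys
    "user" and "cpu_seconds".
    """
    totals = defaultdict(float)
    for query in queries:
        totals[query["user"]] += query["cpu_seconds"]

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    # Sort users by total CPU time, highest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(user, cpu, cpu / grand_total) for user, cpu in ranked[:n]]


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"user": "alice", "cpu_seconds": 120.0},
        {"user": "bob", "cpu_seconds": 30.0},
        {"user": "alice", "cpu_seconds": 40.0},
        {"user": "carol", "cpu_seconds": 10.0},
    ]
    for user, cpu, share in top_cpu_users(sample, n=2):
        print(f"{user}: {cpu:.0f} CPU seconds ({share:.0%} of total)")
```

The same grouping works for tables instead of users, and the resulting shares are one possible input for chargeback models or resource group policies.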
“We haven’t realized just how resource-intensive / expensive some of the common operations are. Selectivity analysis showed 82% of the CPU consumption was spent on ScanFilterAndProject operator (Query Filters) with highly selective filters (less than 5% on avg) ending up with full tables / partitions scans.”
Director of Data Engineering
Computer Software Company
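One way to read the selectivity figure in the quote above is as the fraction of scanned rows that pass a filter. The Python sketch below flags scans whose filters keep less than a given fraction of the rows they read. The field names `table`, `input_rows` and `output_rows`, and the 5% default threshold, are assumptions for illustration, not the Workload Analyzer's definitions.

```python
def filter_selectivity(input_rows, output_rows):
    """Fraction of scanned rows that pass the filter (0.0 to 1.0)."""
    if input_rows <= 0:
        raise ValueError("input_rows must be positive")
    return output_rows / input_rows


def flag_selective_scans(scans, threshold=0.05):
    """Return table names whose filters keep less than `threshold` of rows.

    Such scans read far more data than they return, so they are
    candidates for indexing, partitioning or caching.
    """
    return [
        scan["table"]
        for scan in scans
        if filter_selectivity(scan["input_rows"], scan["output_rows"]) < threshold
    ]


if __name__ == "__main__":
    # Made-up sample statistics, for illustration only.
    scans = [
        {"table": "orders", "input_rows": 1_000_000, "output_rows": 20_000},
        {"table": "users", "input_rows": 1_000, "output_rows": 900},
    ]
    print(flag_selective_scans(scans))
```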
There's a New Standard for Data Virtualization
Meet Varada. Using dynamic analysis and adaptive indexing, data architects can seamlessly accelerate and optimize workloads, resulting in optimal control over performance and cost.