NL2KQL: From Natural Language to Kusto Query

  • 2024-04-15 23:10:17
  • Amir H. Abdi, Xinye Tang, Jeremias Eichelbaum, Mahan Das, Alex Klein, Nihal Irmak Pakis, William Blum, Daniel L Mace, Tanvi Raja, Namrata Padmanabhan, Ye Xing

Abstract

Data is growing rapidly in volume and complexity. Proficiency in database query languages is pivotal for crafting effective queries. As coding assistants become more prevalent, there is significant opportunity to enhance database query languages. The Kusto Query Language (KQL) is a widely used query language for large semi-structured data such as logs, telemetry, and time series for big data analytics platforms. This paper introduces NL2KQL, an innovative framework that uses large language models (LLMs) to convert natural language queries (NLQs) to KQL queries. The proposed NL2KQL framework includes several key components: the Schema Refiner, which narrows down the schema to its most pertinent elements; the Few-shot Selector, which dynamically selects relevant examples from a few-shot dataset; and the Query Refiner, which repairs syntactic and semantic errors in KQL queries. Additionally, this study outlines a method for generating large datasets of synthetic NLQ-KQL pairs that are valid within a specific database context. To validate NL2KQL's performance, we utilize an array of online (based on query execution) and offline (based on query parsing) metrics. Through ablation studies, the significance of each framework component is examined, and the datasets used for benchmarking are made publicly available. This work is the first of its kind and is compared with available baselines to demonstrate its effectiveness.
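To make the roles of the Schema Refiner and the Few-shot Selector concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes plain token overlap as the relevance measure (the abstract does not say how relevance is computed), and every table name, column name, example pair, and function name in it is hypothetical. The LLM call and the Query Refiner are left out; the sketch only shows how a narrowed schema and selected examples might be assembled into a prompt.

```python
import re


def tokens(text):
    """Lowercase word tokens; CamelCase identifiers are split into their parts."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return set(re.findall(r"[a-z0-9]+", spaced.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def refine_schema(nlq, schema, top_k=1):
    """Toy schema refiner: keep the top_k tables sharing the most tokens with the question."""
    q = tokens(nlq)

    def score(table):
        return len(q & tokens(table + " " + " ".join(schema[table])))

    ranked = sorted(schema, key=score, reverse=True)
    return {table: schema[table] for table in ranked[:top_k]}


def select_few_shots(nlq, examples, k=1):
    """Toy few-shot selector: pick the k example questions most similar to the question."""
    q = tokens(nlq)
    ranked = sorted(examples, key=lambda ex: jaccard(q, tokens(ex[0])), reverse=True)
    return ranked[:k]


def build_prompt(nlq, schema, examples):
    """Assemble the refined schema and selected examples into a text prompt."""
    lines = ["Schema:"]
    for table, columns in schema.items():
        lines.append(f"  {table}({', '.join(columns)})")
    lines.append("Examples:")
    for question, kql in examples:
        lines.append(f"  Q: {question}")
        lines.append(f"  KQL: {kql}")
    lines.append(f"Q: {nlq}")
    lines.append("KQL:")
    return "\n".join(lines)


# Hypothetical schema and few-shot pairs, invented for illustration only.
SCHEMA = {
    "LoginEvents": ["Timestamp", "UserName", "Result"],
    "ProcessEvents": ["Timestamp", "DeviceName", "ProcessName"],
}
EXAMPLES = [
    ("Count login events per user name",
     "LoginEvents | summarize count() by UserName"),
    ("List process names on each device",
     "ProcessEvents | distinct DeviceName, ProcessName"),
]
NLQ = "How many failed login events were there per user name?"

refined = refine_schema(NLQ, SCHEMA)
shots = select_few_shots(NLQ, EXAMPLES)
prompt = build_prompt(NLQ, refined, shots)
print(prompt)
```

In this toy setup the prompt contains only the table and the example that overlap with the question, which is the intent behind narrowing the schema and selecting examples dynamically; a real system would likely use a stronger similarity measure than token overlap.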

 
