Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification

Abstract

Light curves serve as a valuable source of information on stellar formation and evolution. With the rapid advancement of machine learning techniques, they can be effectively processed to extract astronomical patterns and information. In this study, we present a comprehensive evaluation of deep-learning and large language model (LLM) based models for the automatic classification of variable star light curves, based on large datasets from the Kepler and K2 missions. Special emphasis is placed on Cepheids, RR Lyrae, and eclipsing binaries, examining the influence of observational cadence and phase distribution on classification precision. Employing AutoDL optimization, we achieve striking performance with the 1D-Convolution+BiLSTM architecture and the Swin Transformer, reaching accuracies of 94% and 99% respectively, with the latter demonstrating a notable 83% accuracy in discerning the elusive Type II Cepheids, which comprise merely 0.02% of the total dataset. We unveil StarWhisper LightCurve (LC), an innovative series comprising three LLM-based models: LLM, multimodal large language model (MLLM), and Large Audio Language Model (LALM). Each model is fine-tuned with strategic prompt engineering and customized training methods to explore the emergent abilities of these models for astronomical data. Remarkably, the StarWhisper LC series exhibits high accuracies around 90%, significantly reducing the need for explicit feature engineering, thereby paving the way for streamlined parallel data processing and the progression of multifaceted multimodal models in astronomical applications. The study furnishes two detailed catalogs illustrating the impacts of phase and sampling intervals on deep learning classification accuracy, showing that a substantial decrease of up to 14% in observation duration and 21% in sampling points can be realized without compromising accuracy by more than 10%.
