We present a new solution to the problem of classifying Type Ia supernovae from their light curves alone given a spectroscopically confirmed but biased training set, circumventing the need to obtain an observationally expensive unbiased training set. We use Gaussian processes (GPs) to model the supernovae's (SN's) light curves, and demonstrate that the choice of covariance function has only a small influence on the GPs ability to accurately classify SNe. We extend and improve the approach of Richards et al. - a diffusion map combined with a random forest classifier - to deal specifically with the case of biased training sets. We propose a novel method called Synthetically Augmented Light Curve Classification (STACCATO) that synthetically augments a biased training set by generating additional training data from the fitted GPs. Key to the success of the method is the partitioning of the observations into subgroups based on their propensity score of being included in the training set. Using simulated light curve data, we show that STACCATO increases performance, as measured by the area under the Receiver Operating Characteristic curve (AUC), from 0.93 to 0.96, close to the AUC of 0.977 obtained using the 'gold standard' of an unbiased training set and significantly improving on the previous best result of 0.88. STACCATO also increases the true positive rate for SNIa classification by up to a factor of 50 for high-redshift/low-brightness SNe.

STACCATO: a novel solution to supernova photometric classification with biased training sets / Revsbech, E. A.; Trotta, R.; van Dyk, D. A.. - In: MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY. - ISSN 0035-8711. - 473:3(2018), pp. 3969-3986. [10.1093/MNRAS/STX2570]

STACCATO: a novel solution to supernova photometric classification with biased training sets

Trotta R.
;
2018-01-01

Abstract

We present a new solution to the problem of classifying Type Ia supernovae from their light curves alone given a spectroscopically confirmed but biased training set, circumventing the need to obtain an observationally expensive unbiased training set. We use Gaussian processes (GPs) to model the supernovae's (SN's) light curves, and demonstrate that the choice of covariance function has only a small influence on the GPs ability to accurately classify SNe. We extend and improve the approach of Richards et al. - a diffusion map combined with a random forest classifier - to deal specifically with the case of biased training sets. We propose a novel method called Synthetically Augmented Light Curve Classification (STACCATO) that synthetically augments a biased training set by generating additional training data from the fitted GPs. Key to the success of the method is the partitioning of the observations into subgroups based on their propensity score of being included in the training set. Using simulated light curve data, we show that STACCATO increases performance, as measured by the area under the Receiver Operating Characteristic curve (AUC), from 0.93 to 0.96, close to the AUC of 0.977 obtained using the 'gold standard' of an unbiased training set and significantly improving on the previous best result of 0.88. STACCATO also increases the true positive rate for SNIa classification by up to a factor of 50 for high-redshift/low-brightness SNe.
2018
473
3
3969
3986
Revsbech, E. A.; Trotta, R.; van Dyk, D. A.
File in questo prodotto:
File Dimensione Formato  
stx2570.pdf

accesso aperto

Tipologia: Versione Editoriale (PDF)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 2.64 MB
Formato Adobe PDF
2.64 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.11767/116897
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 24
  • ???jsp.display-item.citation.isi??? 25
social impact