TY - JOUR
T1 - Automatic Encoding of Unlabeled Two Dimensional Data Enabling Similarity Searches: Electron Diffusion Regions and Auroral Arcs
AU - Smith, A. W.
AU - Rae, Jonathan
AU - Stawarz, Julia
AU - Sun, W. J.
AU - Bentley, Sarah
AU - Koul, A.
N1 - Funding information: We would like to thank SpaceML for the inspiration behind the work (Koul et al., 2020). Further, we thank Clausen and Nickisch (2018) for their diligent processing, labeling and provision of THEMIS all sky image data. We would also like to thank Lightly (Susmelj et al., 2020) for their informative tutorials (https://docs.lightly.ai/self-supervised-learning/index.html). AWS was supported by NERC Independent Research Fellowship NE/W009129/1. JES was supported by the Royal Society University Research Fellowship URF/R1/201286. WJS was supported by NASA Grant 80NSSC21K0517.
PY - 2024/1
Y1 - 2024/1
N2 - Critically important phenomena in Earth’s magnetosphere often occur briefly, or in small spatial regions. These processes are sampled with orbiting spacecraft or by fixed ground observatories and so rarely appear in data. Identifying such intervals can be an incredibly time-consuming task. We apply a novel, powerful method by which two-dimensional data can be automatically processed and embeddings created that contain key features of the data. The distance between embedding vectors serves as a measure of similarity. We apply the state-of-the-art method to two example datasets: MMS electron velocity distributions and auroral all-sky images. We show that the technique creates embeddings that group together visually similar observations. When provided with novel example images the method correctly identifies similar intervals: when provided with an electron distribution sampled during an encounter with an electron diffusion region, the method recovers similar distributions obtained during two other known diffusion region encounters. Similarly, when provided with an interesting auroral structure the method highlights the same structure observed from an adjacent location and at other close time intervals. The method promises to be a useful tool to expand interesting case studies to multiple events, without requiring manual data labeling. Further, the models could be fine-tuned with a relatively small set of labeled example data to perform tasks such as classification. The embeddings can also be used as input to deep learning models, providing a key intermediary step that captures the key features within the data.
AB - Critically important phenomena in Earth’s magnetosphere often occur briefly, or in small spatial regions. These processes are sampled with orbiting spacecraft or by fixed ground observatories and so rarely appear in data. Identifying such intervals can be an incredibly time-consuming task. We apply a novel, powerful method by which two-dimensional data can be automatically processed and embeddings created that contain key features of the data. The distance between embedding vectors serves as a measure of similarity. We apply the state-of-the-art method to two example datasets: MMS electron velocity distributions and auroral all-sky images. We show that the technique creates embeddings that group together visually similar observations. When provided with novel example images the method correctly identifies similar intervals: when provided with an electron distribution sampled during an encounter with an electron diffusion region, the method recovers similar distributions obtained during two other known diffusion region encounters. Similarly, when provided with an interesting auroral structure the method highlights the same structure observed from an adjacent location and at other close time intervals. The method promises to be a useful tool to expand interesting case studies to multiple events, without requiring manual data labeling. Further, the models could be fine-tuned with a relatively small set of labeled example data to perform tasks such as classification. The embeddings can also be used as input to deep learning models, providing a key intermediary step that captures the key features within the data.
UR - https://www.scopus.com/pages/publications/85183931689
U2 - 10.1029/2023ja032096
DO - 10.1029/2023ja032096
M3 - Article
SN - 2169-9402
VL - 129
JO - Journal of Geophysical Research: Space Physics
JF - Journal of Geophysical Research: Space Physics
IS - 1
M1 - e2023JA032096
ER -