Bio-PM — Bio-Inspired Self-Supervised Learning for Wrist-worn Accelerometer Data

Abstract

A foundation model that understands the language of movement

Wearable accelerometers enable large-scale health monitoring, yet learning robust human-activity representations has been constrained by scarce labeled data. While self-supervised learning offers a remedy, existing methods treat sensor streams as unstructured time series, overlooking the underlying biological structure of human movement — a factor we argue is critical for effective Human Activity Recognition (HAR).

We introduce a novel tokenization strategy grounded in the submovement theory of motor control, which posits that continuous wrist motion is composed of elementary basis functions called submovements. We define our token as the movement segment, a unit of motion composed of a finite sequence of submovements. By pretraining a Transformer encoder via masked reconstruction of these tokens, we shift the learning focus from local waveform morphology to high-level structural and temporal organization.

Pretrained on the NHANES corpus (≈28k hours; ≈11k participants), our representations outperform strong wearable SSL baselines across six subject-disjoint HAR benchmarks.

Key contributions

i. Bio-inspired tokenization for wrist IMU

A scalable tokenization strategy that segments continuous accelerometer signals into meaningful movement units using zero-crossings in linear acceleration, the kinematic signature of submovement boundaries.

ii. Contextual representation learning

Bio-PM, a Transformer-based encoder pretrained via masked movement-segment reconstruction. It models temporal relations between segments and captures the compositional structure of human activity.

iii. Data-efficient transfer

Movement-segment-based pretraining improves label efficiency over SSL baselines: Bio-PM outperforms contrastive, augmentation-prediction, and masked-reconstruction baselines on every benchmark we test.

Method

Movement as language.

In natural language, words emerge from compositions of phonemes. In motor control, movement segments emerge from compositions of submovements. We exploit this parallel: we parse the accelerometer stream into segments at acceleration zero-crossings, encode each segment with a small CNN, and let a Transformer reason over the resulting sequence under a masked-reconstruction objective.
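To make the parsing step concrete, here is a minimal sketch of zero-crossing segmentation. It is an illustration under stated assumptions, not the released Bio-PM tokenizer: it assumes a single-axis linear-acceleration signal with gravity already removed, applies no smoothing or minimum segment length, and treats exact zeros as non-negative. The function name `zero_crossing_segments` is hypothetical.

```python
from typing import List, Tuple


def zero_crossing_segments(accel: List[float]) -> List[Tuple[int, int]]:
    """Split a 1-D signal into half-open [start, end) index ranges.

    A new segment starts at every sample whose sign differs from the
    previous sample's sign. Exact zeros count as non-negative
    (an assumption made only for this sketch).
    """
    if not accel:
        return []

    boundaries = [0]
    for i in range(1, len(accel)):
        # Sign flip between consecutive samples marks a segment boundary.
        if (accel[i - 1] < 0) != (accel[i] < 0):
            boundaries.append(i)
    boundaries.append(len(accel))

    # Pair consecutive boundaries into (start, end) ranges.
    return [(boundaries[k], boundaries[k + 1]) for k in range(len(boundaries) - 1)]


# Toy signal with two sign changes, giving three segments.
signal = [0.2, 0.5, 0.1, -0.3, -0.6, -0.1, 0.4, 0.2]
segments = zero_crossing_segments(signal)
tokens = [signal[start:end] for start, end in segments]
print(segments)  # [(0, 3), (3, 6), (6, 8)]
```

Each resulting slice has a variable length, which is why a per-segment encoder (a small CNN in the text above) is needed before the sequence can be passed to a Transformer.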

Bio-PM architecture: pretraining via masked movement-segment reconstruction, transfer via linear probing.

Figure 1. Bio-PM representation learning. We (i) tokenize accelerometry into movement-aligned segments, (ii) pretrain by modeling temporal relations with a Transformer under masked reconstruction, and (iii) transfer the frozen encoder to downstream HAR for linear probing.
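The masking in step (ii) can be sketched as follows. This is a generic masked-modeling illustration, not the authors' training code: the mask ratio, the uniform random choice of positions, and the use of `None` as a mask placeholder are all assumptions, and `mask_segments` is a hypothetical helper. In practice a learned mask embedding would likely replace the placeholder, and a reconstruction loss would be computed at the masked positions.

```python
import random
from typing import List, Optional, Tuple

Segment = List[float]


def mask_segments(
    tokens: List[Segment],
    mask_ratio: float = 0.5,  # assumed value, not taken from the paper
    seed: int = 0,
) -> Tuple[List[Optional[Segment]], List[int]]:
    """Hide a random subset of movement-segment tokens.

    Returns the corrupted sequence (masked positions set to None) and the
    sorted indices that were masked; the originals at those indices serve
    as reconstruction targets.
    """
    if not tokens:
        return [], []

    rng = random.Random(seed)
    n_mask = max(1, int(mask_ratio * len(tokens)))
    masked_idx = sorted(rng.sample(range(len(tokens)), n_mask))
    masked_set = set(masked_idx)

    # Build a new list so the caller's tokens are left untouched.
    corrupted = [None if i in masked_set else tok for i, tok in enumerate(tokens)]
    return corrupted, masked_idx


tokens = [[0.1], [0.2, 0.3], [0.4], [0.5], [0.6], [0.7]]
corrupted, masked_idx = mask_segments(tokens, mask_ratio=0.5, seed=0)
targets = [tokens[i] for i in masked_idx]  # what the model would reconstruct
```

Masking whole segments rather than individual samples is what moves the prediction task from local waveform shape toward the ordering and context of movement units.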

Illustration of velocity-based vs acceleration-based movement segments. We use acceleration zero-crossings (type-2 segments) as our tokens.

Figure 2. Illustration of the proposed tokenization strategy, which uses "type 2" movement segments defined via acceleration zero-crossings.

Citation

BibTeX

@inproceedings{tarale2026biopm,
  title     = {Bio-Inspired Self-Supervised Learning for Wrist-worn Accelerometer Data},
  author    = {Tarale, Prithviraj and Chu, Kiet and Varghese, Abhishek and Liu, Kai-Chun and Xu, Maxwell A. and Iyyer, Mohit and Lee, Sunghoon Ivan},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  series    = {PMLR},
  volume    = {306},
  year      = {2026},
}