While the exact title can vary slightly between preprints and published versions, the paper typically focuses on:
Understanding how actions evolve over time.
Comparing the difficulty of ShoStMiHN with older datasets such as UCF101 or HMDB51.
Identifying "missing" human-object interactions (HOI) in short video sequences where the interaction may be obscured or brief.

Why this file is used
Evaluating deep learning architectures (such as 3D CNNs or Transformers) on their ability to recognize temporal patterns in these specific, challenging scenarios.











