Our aim is to enable a machine to observe and interpret the behaviour of others. Mathematical models are employed to describe certain biological motions, and the main challenge is to design models that are both tractable and meaningful. In the first part we describe how computer vision techniques, in particular visual tracking, can be applied to recognize a small vocabulary of human actions in a constrained scenario. To formalize a general framework, the problems of viewpoint and scale invariance in particular need to be overcome. Hence the second part of the article is devoted to the question of whether a particular human action should be captured in a single complex model, or whether it is more promising to make extensive use of semantic knowledge and a collection of low-level models that encode certain motion primitives. Scene context plays a crucial role if we intend to give a higher-level interpretation of the observed motion rather than a low-level physical description, and a semantic knowledge base is used to establish that context. This approach consists of three main components: visual analysis, the mapping from vision to language, and the search of the semantic database. A small number of robust visual detectors are used to generate a higher-level description of the scene. The approach, together with a number of results, is presented in the third part of this article.

Journal article


Philosophical transactions of the Royal Society of London. Series B, Biological sciences

Pages

475 - 490


GE Global Research, One Research Circle, Niskayuna NY 12309, USA.


Humans; Language; Social Behavior; Intention; Social Perception; Space Perception; Time Perception; Visual Perception; Motion Perception; Models, Theoretical; Semantics; Computers; Recognition, Psychology