Adaptive Vision for Human Robot Collaboration


Unstructured social environments, e.g. building sites, present an overwhelming amount of information, yet behaviourally relevant variables may not be directly accessible. 
Currently proposed solutions for specific tasks, e.g. autonomous cars, usually employ over-redundant, expensive, and computationally demanding sensory systems that attempt to cover the wide set of sensing conditions the system may have to deal with. 
Adaptive control of the sensors and of the input to the perception process is a key solution found by nature to cope with such problems, as shown by the foveal anatomy of the eye and its high mobility and control accuracy. The design principles of systems that adaptively find and select relevant information are important for both Robotics and Cognitive Neuroscience. 
At the same time, collaborative robotics has recently progressed to human-robot interaction in real manufacturing settings. Measuring and modelling task-specific gaze behaviours is essential to support smooth human-robot interaction. Indeed, anticipatory control for human-in-the-loop architectures, which can enable robots to proactively collaborate with humans, relies heavily on the observed gaze and action patterns of their human partners.
The tutorial will describe several systems that employ adaptive vision to support robot behaviour and collaboration with humans. 

The systems described employ different strategies:

  • model-based systems using information-theoretic measures to select perception parameters;
  • neural and bio-inspired perception controllers trained to support task execution;
  • imitation-based attention control.
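As an illustration of the first strategy, a model-based system can choose its next fixation by maximising expected information gain. The sketch below is a hypothetical minimal example, not taken from any of the systems presented in the tutorial: a discrete belief over candidate target locations is updated by Bayes' rule after each fixation, and the controller looks at the cell whose observation is expected to reduce the entropy of the belief the most. The detection probability `p_detect` and the three-cell belief are illustrative assumptions.

```python
import math

def entropy(belief):
    """Shannon entropy (bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0)

def expected_info_gain(belief, cell, p_detect=0.9):
    """Expected entropy reduction from fixating `cell`.

    Two outcomes: the target is detected there (the posterior collapses
    onto `cell`, so entropy drops to 0), or it is not (a Bayes update
    removes most of the probability mass from `cell` and renormalises).
    """
    h0 = entropy(belief)
    p_found = belief[cell] * p_detect
    # Posterior after a "not found" observation at `cell`.
    not_found = [p * (1 - p_detect) if i == cell else p
                 for i, p in enumerate(belief)]
    z = sum(not_found)
    h_not_found = entropy([p / z for p in not_found])
    # The "found" branch contributes zero entropy, so only the
    # "not found" branch remains in the expectation.
    return h0 - (1 - p_found) * h_not_found

def best_fixation(belief, p_detect=0.9):
    """Greedy one-step-ahead choice of where to look next."""
    return max(range(len(belief)),
               key=lambda c: expected_info_gain(belief, c, p_detect))

belief = [0.6, 0.3, 0.1]      # prior over three candidate locations
print(best_fixation(belief))  # index of the most informative fixation
```

One-step greedy maximisation of information gain is only a common baseline; the model-based systems discussed in the tutorial use richer observation models and cost terms, but the selection principle is similar.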

Format: lecture


Table of contents

  1. Introduction to Adaptive Vision for Human Robot Collaboration, Dimitri Ognibene. An introduction to adaptive vision and a brief description of Bayesian and neural adaptive vision systems for human-robot interaction.
  2. Ecological perception: a computational model, Fabio Solari. A computational model of visual perception for action tasks, and how the modelled perception can inform the design of human-robot interaction systems. 
  3. Attention during social interaction, Tom Foulsham. A review of work from experimental psychology examining how humans pay attention to each other during conversation and other interactive situations. Studying this behaviour requires moving to more ecologically valid settings, and the results have implications for real and virtual interaction.
  4. Introduction to the Projective Consciousness Model, David Rudrauf. Emergent psychology-inspired cybernetic frameworks for integrating perception, imagination, emotion, social cognition and action in global optimization solutions for autonomous virtual and robotic agents.
  5. Adaptive vision strategies to cope with complex environments under bounded resources, Guido De Croon. Presentation of several adaptive vision strategies that enable robots with limited computational resources to perform complex tasks.
  6. Introduction to Egovision in Human Robot Interaction, Giovanni Farinella. Egovision provides important information for human-machine interaction and poses specific problems that require insights from both the cognitive science and the machine learning sides.
  7. Ecological interaction in Virtual and Augmented Reality, Manuela Chessa. Virtual and Augmented Reality (VR/AR) environments provide novel interaction modalities. On the one hand, we can devise natural and ecological techniques that mimic real-world situations, on the other hand we aim to understand whether VR/AR could enhance the cognitive performance in several fields of applications (e.g. rehabilitation or training tasks) by exploring novel paradigms of interaction.
  8. Modelling and imitating attentional behaviours in complex tasks, Fiora Pirri. Human visual exploration provides an important source of information for endowing robot vision with the task-specific selection strategies necessary to deal with the complexity of the real world. Techniques for recording such strategies and replicating them in robots will be reviewed.
  9. Attention Measurement Technologies for Situation Awareness and Motivation in Human-Robot Collaboration, Lucas Paletta. The organisation of attention is investigated in the context of situation awareness, task coordination and mental processes. Classification of gaze behaviour with reference to ecological semantics is applied to assistance in assembly as well as in health care.



Dimitri Ognibene, PhD, joined the University of Essex as Lecturer in Computer Science and Artificial Intelligence in October 2017. He was a Marie Curie Fellow at Universitat Pompeu Fabra, focusing on the development of algorithms for intelligent social agents with bounded computational and sensory resources. Before that, he developed algorithms for active vision in industrial robotic tasks as a Research Associate (RA) at the Centre for Robotics Research, King's College London, and devised Bayesian methods and robotic models for attention in social and dynamic environments as an RA at the Personal Robotics Laboratory at Imperial College London. During his PhD he studied the interaction between active vision and autonomous learning in neuro-robotic models at the Institute of Cognitive Sciences and Technologies of the Italian National Research Council (ISTC-CNR). He also collaborated with the Wellcome Trust Centre for Neuroimaging (UCL) to address the exploration issue in Predictive Coding, the currently dominant neurocomputational modelling paradigm. Dr Ognibene has also been a Visiting Researcher at the Bounded Resource Reasoning Laboratory at UMass and at Reykjavik University (Iceland), exploring the symmetries between active sensor control and active computation or metareasoning. Dr Ognibene has presented his work at several international conferences on artificial intelligence (IJCAI), adaptation (SAB), and development (ICDL) and has published in international peer-reviewed journals. He was invited to speak at the International Symposium on Attention in Cognitive Systems (2013 and 2014) as well as at various other neuroscience, robotics and machine-learning international venues. In 2017, he organised a workshop on Active Vision in Human Robot Collaboration at ICIAP 2017. Dr Ognibene is Associate Editor of Paladyn, Journal of Behavioral Robotics, and Review Editor for Frontiers in Bionics and Biomimetics as well as in Computational Intelligence. 
He has served as a chair or programme committee member of several international conferences and symposia.

The slides of the tutorial are available at https://sites.google.com/site/dimitriognibenehomepage/special-avhrc-at-icvs2019

List of Relevant Publications:

  • Friston, K., Rigoli, F., Ognibene, D., Mathys, C., Fitzgerald, T., & Pezzulo, G. (2015). Active inference and epistemic value. Cognitive neuroscience, 6(4), 187-214.
  • Lee, K., Ognibene, D., Chang, H. J., Kim, T. K., & Demiris, Y. (2015). Stare: Spatio-temporal attention relocation for multiple structured activities detection. IEEE Transactions on Image Processing, 24(12), 5916-5927.
  • Ognibene, D., & Baldassare, G. (2015). Ecological active vision: Four bioinspired principles to integrate bottom-up and adaptive top-down attention tested with a simple camera-arm robot. IEEE Transactions on Autonomous Mental Development, 7(1), 3-25.
  • Ognibene, D., & Demiris, Y. (2013, August). Towards Active Event Recognition. In IJCAI (pp. 2495-2501).
  • Ognibene, D., Chinellato, E., Sarabia, M., & Demiris, Y. (2013). Contextual action recognition and target localization with an active allocation of attention on a humanoid robot. Bioinspiration & biomimetics, 8(3), 035002. 

Contact information: Dimitri Ognibene, dimitri.ognibene@essex.ac.uk, School of Computer Science and Electronic Engineering, University of Essex, UK

Manuela Chessa is Assistant Professor in Computer Science at the Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering of the University of Genoa. Her research interests are focused on the development of natural human-machine interfaces based on virtual, augmented and mixed reality, on the perceptual and cognitive aspects of interaction in VR and AR, on the development of bioinspired models, and on the study of biological and artificial vision systems. She studies the use of novel sensing and 3D tracking technologies and of visualization devices (e.g. 3D monitors and projectors, head-mounted displays, video see-through and optical see-through devices) to develop natural and ecological interaction systems, always keeping human perception in mind. In particular, she is active in studying the misperception issues, visual stress and fatigue that arise when using such systems. She is the Program Co-chair of the HUCAPP International Conference on Human Computer Interaction Theory and Applications. She has been the chair of the BMVA Technical Meeting "Vision for human-computer interaction and virtual reality systems" (London, 6th May 2015), the chair of the Special Session "Computer VISION for Natural Human Computer Interaction" (VISION4HCI 2016), Lecturer of the tutorial "Natural Human-Computer-Interaction in Virtual and Augmented Reality" at VISIGRAPP 2017, and Lecturer for the tutorial "Active Vision and Human Robot Collaboration" at ICIAP 2017, with the talk "Human-Agent Interaction in Virtual and Augmented Reality". She is the organizer of a tutorial at ISMAR 2018, "Cognitive Aspects of Interaction in Virtual and Augmented Reality Systems" (CAIVARS). She is author of more than 60 papers in international book chapters, journals and conference proceedings, and co-inventor of 3 patents.

List of Relevant Publications:

  • M. Chessa, G. Maiello, A. Borsari, P.J. Bex (2019). The Perceptual Quality of the Oculus Rift for Immersive Virtual Reality. Human Computer Interaction, vol. 34(1), pp. 51-82. 
  • M. Chessa, G. Maiello, L.K. Klein, V.C. Paulun, F. Solari (2019). Grasping objects in immersive Virtual Reality. PERCAR 2019: The IEEE VR 2019 Workshop on Perceptual and Cognitive Issues in AR, with IEEE VR 2019.
  • G. Ballestin, F. Solari, M. Chessa (2018). Perception and action in peripersonal space: a comparison between video and optical see-through augmented reality devices. In Adjunct Proceedings of the IEEE International Symposium on Mixed and Augmented Reality 2018.
  • M. Chessa, F. Solari (2017). [POSTER] Walking in Augmented Reality: An Experimental Evaluation by Playing with a Virtual Hopscotch. In Mixed and Augmented Reality (ISMAR-Adjunct), 2017 IEEE International Symposium on (pp. 143-148).
  • E. Gusai, C. Bassano, F. Solari, M. Chessa (2017). Interaction in an Immersive Collaborative Virtual Reality Environment: A Comparison Between Leap Motion and HTC Controllers. In International Conference on Image Analysis and Processing (pp. 290-300). Springer, Cham.

Contact information: Manuela Chessa, manuela.chessa@unige.it, Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Italy.

Giovanni Maria Farinella is a Tenure Track Associate Professor at the Department of Mathematics and Computer Science, University of Catania, Italy. His research interests lie in the fields of Computer Vision, Pattern Recognition and Machine Learning. He is author of more than 100 papers in international book chapters, journals and conference proceedings, and co-inventor of 4 patents involving industrial partners. Dr. Farinella serves as a reviewer and programme committee member for major international journals and conferences (CVPR, ICCV, ECCV, BMVC). He has been Video Proceedings Chair for ECCV 2012 and ACM MM 2013. Dr. Farinella founded (in 2006) and currently directs the International Computer Vision Summer School (ICVSS). He also founded (in 2014) and currently directs the Medical Imaging Summer School (MISS). Dr. Farinella was awarded the PAMI Mark Everingham Prize in October 2017.

List of Relevant Publications:

  • A. Furnari, S. Battiato, G. M. Farinella (2018). Personal-Location-Based Temporal Segmentation of Egocentric Video for Lifelogging Applications. Journal of Visual Communication and Image Representation, 52, pp. 1-12.
  • A. Furnari, G. M. Farinella, S. Battiato (2017). Recognizing Personal Locations From Egocentric Videos. IEEE Transactions on Human-Machine Systems, 47(1), pp. 6-18.
  • D. Damen, H. Doughty, G. M. Farinella, S. Fidler, A. Furnari, E. Kazakos, D. Moltisanti, J. Munro, T. Perrett, W. Price, M. Wray (2018). Scaling Egocentric Vision: The EPIC-KITCHENS Dataset. In European Conference on Computer Vision.
  • A. Furnari, S. Battiato, G. M. Farinella (2018). Leveraging Uncertainty to Rethink Loss Functions and Evaluation Measures for Egocentric Action Anticipation. In International Workshop on Egocentric Perception, Interaction and Computing (EPIC) in conjunction with ECCV.
  • A. Furnari, S. Battiato, K. Grauman, G. M. Farinella (2017). Next-active-object prediction from egocentric videos. Journal of Visual Communication and Image Representation, 49(Supplement C), pp. 401-411.
  • A. Ortis, G. M. Farinella, V. D'Amico, L. Addesso, G. Torrisi, S. Battiato (2017). Organizing egocentric videos of daily living activities. Pattern Recognition, 72(Supplement C), pp. 207-218.
  • Other relevant publications on Egocentric (First Person) Vision are available at: http://iplab.dmi.unict.it/fpv

Contact information: Giovanni Maria Farinella, gfarinella@dmi.unict.it, Department of Mathematics and Computer Science, University of Catania, Italy

Fabio Solari is Associate Professor of Computer Science at the Department of Informatics, Bioengineering, Robotics, and Systems Engineering of the University of Genoa. His research activity concerns the study of visual perception with the aim of developing computational models of cortical vision processing, devising novel bio-inspired computer vision algorithms, and designing virtual and mixed reality environments for ecological visual stimulation. In particular, his research interests are related to (i) computational models for motion and depth estimation, space-variant visual processing, and scene interpretation, and (ii) the perceptual assessment of virtual/augmented reality systems and the development of natural human-computer interactions. He has participated in five European projects: FP7-ICT EYESHOTS and SEARISE; FP6-IST-FET DRIVSCO; FP6-NEST MCCOOP; FP5-IST-FET ECOVISION. Currently, he is involved in two European Interreg Alcotra projects, CLIP and PROSOL. He is a reviewer for Italian PRIN and FIRB projects and for Marie Curie fellowships. He has a pending International Patent Application (WO2013088390) on augmented reality, and two Italian Patent Applications on virtual (No. 0001423036) and augmented (No. 0001409382) reality.

List of Relevant Publications:

  • M. Chessa, F. Solari. A Computational Model for the Neural Representation and Estimation of the Binocular Vector Disparity from Convergent Stereo Image Pairs. International journal of neural systems, 2018.
  • M. Chessa and F. Solari. Walking in Augmented Reality: An Experimental Evaluation by Playing with a Virtual Hopscotch. Mixed and Augmented Reality (ISMAR-Adjunct), 2017 IEEE International Symposium on. pp. 143-148, October 9-13, Nantes, France, 2017.
  • M. Chessa, G. Maiello, P.J. Bex, F. Solari. A space-variant model for motion interpretation across the visual field. Journal of Vision, 16(2):12, 1–24, 2016.
  • F. Solari, M. Chessa, S.P. Sabatini. An integrated neuromimetic architecture for direct motion interpretation in the log-polar domain. Computer Vision and Image Understanding, 125, pp. 37-54, 2014.
  • F. Solari, M. Chessa, M. Garibotti, S.P. Sabatini. Natural perception in dynamic stereoscopic augmented reality environments. Displays, 34(2), pp. 142-152, 2013. 

Contact information: Fabio Solari, fabio.solari@unige.it, Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Italy.

David Rudrauf is a Psychologist and Neuroscientist. He obtained a Ph.D. in Cognitive Sciences from Pierre and Marie Curie University in Paris (2005), a Ph.D. in Neurosciences from the University of Iowa in Iowa City, USA (2005), and a Habilitation from the University Joseph Fourier in Grenoble, France (2015). He has worked as an Assistant Professor of Neurology & Radiology in the Department of Neurology of the University of Iowa in Iowa City, USA (2008-2011); a non-permanent research scientist at the Laboratoire d'Imagerie Fonctionnelle (INSERM, UMR S 678), Paris, France (2012-2014); and a non-permanent research scientist at the Grenoble Institute of Neuroscience (GIN, INSERM), Grenoble, France (2014-2016). He is now (Associate) Professor of Psychology at the University of Geneva, FAPSE, Section of Psychology, a member of the Swiss Center for Affective Sciences, Campus Biotech, and of the University Computer Science Center. He was the Director of the Laboratory of Brain Imaging and Cognitive Neuroscience, Division of Behavioral Neurology and Cognitive Neuroscience, Department of Neurology, UIHC, University of Iowa (2008-2011), and has been, since 2016, the Director of the Laboratory of Multimodal Modelling of Emotion & Feeling. After years of research in neuroscience, neuropsychology, electrophysiology and multimodal neuroimaging in the US and in France, his current research has moved to mathematical psychology and the development of a computational model of embodied consciousness (Rudrauf et al., 2017), combined with Virtual Reality and robotics, in order to study the normal and pathological mechanisms of the mind and develop cybernetic frameworks for autonomous systems, with a focus on imagination, social perspective taking and emotion regulation, and their relations to behaviour and the brain.

List of Relevant Publications:

  • Rudrauf, D., Bennequin, D., Granic, I., Landini, G., Friston, K., & Williford, K. (2017). A mathematical model of embodied consciousness. Journal of theoretical biology, 428, 106-131.
  • Rudrauf, D., & Debbané, M. (2018). Building a cybernetic model of psychopathology: beyond the metaphor. Psychological Inquiry, 29(3), 156-164. 
  • Williford, K., Bennequin, D., Friston, K., & Rudrauf, D. (2018). The Projective Consciousness Model and Phenomenal Selfhood. Frontiers in psychology, 9. 
  • Meuleman, B., & Rudrauf, D. (2018). Induction and profiling of strong multi-componential emotions in virtual reality. IEEE Transactions on Affective Computing.
  • Ognibene, D., Giglia, G., Marchegiani, L., & Rudrauf, D. (2019). Implicit Perception Simplicity and Explicit Perception Complexity in Sensorimotor Communication. Physics of Life Reviews.

Contact information: David Rudrauf, David.Rudrauf@unige.ch, Laboratory of multimodal modelling of emotion-feeling, University of Geneva, Switzerland

Tom Foulsham is a Reader in the Department of Psychology at the University of Essex, where he has been teaching and conducting research since 2011. He obtained BSc and PhD degrees from the University of Nottingham (UK) and was a Commonwealth Postdoctoral Fellow at the University of British Columbia (Canada) from 2008 to 2011. Dr Foulsham is a cognitive neuroscientist with expertise in the biology and psychology of human eye movements. He is the author of more than 50 peer-reviewed journal articles investigating human perception and cognition, spanning domains from psychology and sports science to human-computer interaction. He is a fellow of the Psychonomic Society and the Vision Sciences Society, and is currently an Associate Editor of Visual Cognition and a Consulting Editor at the Journal of Experimental Psychology: Human Perception and Performance.

List of Relevant Publications:

  • Solman, G. J. F., Foulsham, T., & Kingstone, A. (2017). Eye and head movements are complementary in visual selection. Royal Society Open Science, 4(1).
  • Foulsham, T., & Kingstone, A. (2017). Are fixations in static natural scenes a useful predictor of attention in the real world? Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 71(2), 172-181.
  • Ho, S., Foulsham, T., & Kingstone, A. (2015). Speaking and Listening with the Eyes: Gaze Signaling during Dyadic Interactions. PloS One, 10(8), e0136905.
  • Foulsham, T., & Lock, M. (2015). How the eyes tell lies: Social gaze during a preference task. Cognitive Science, 39, 1704-1726.
  • Foulsham, T. (2015). Eye movements and their functions in everyday tasks. Eye, 29(2), 196-199.

Contact information: Tom Foulsham, foulsham@essex.ac.uk, Department of Psychology, University of Essex, UK.

Fiora Pirri is Professor of Computer Science at the Dipartimento di Ingegneria Informatica, Automatica e Gestionale, Sapienza Università di Roma. She obtained her PhD from Université Pierre et Marie Curie (Paris VI). She currently leads the ALCOR Vision, Perception, and Learning Laboratory, which she founded in 1998.

List of Relevant Publications:

  • M. Sanzari, V. Ntouskos, F. Pirri. Bayesian image based 3D pose estimation. In Proceedings of the European Conference on Computer Vision (ECCV), 9912, 566-582, 2016.
  • F. Natola, V. Ntouskos, F. Pirri, M. Sanzari. Single image object modeling based on BRDF and r-surfaces learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4414-4423, 2016.
  • M. Gianni, M. Ruiz, F. Ferri, F. Pirri. Terrain contact modeling and classification for ATVs. IEEE International Conference on Robotics and Automation (ICRA 2016).
  • F. Natola, V. Ntouskos, M. Sanzari, F. Pirri. Bayesian non-parametric inference for manifold based MoCap representation. In Proceedings of the International Conference on Computer Vision (ICCV), 4606-4614, 2015.

Contact information: Fiora Pirri, pirri@dis.uniroma1.it, Dipartimento di Ingegneria Informatica, Automatica e Gestionale, Sapienza Università di Roma

Guido de Croon is Assistant Professor at the Micro Air Vehicle Lab of Delft University of Technology, the Netherlands. His research interest lies in computationally efficient algorithms for robot autonomy, with a particular focus on computer vision and evolutionary robotics. 

List of Relevant Publications:

  • de Croon, G.C.H.E. (2016). Monocular distance estimation with optical flow maneuvers and efference copies: a stability-based strategy. Bioinspiration and Biomimetics, vol. 11, number 1.
  • de Croon, G.C.H.E., Postma, E.O., van den Herik, H.J. (2011). Adaptive Gaze Control for Object Detection. Cognitive Computation, Volume 3, Number 1, pages 264-278.
  • de Croon, G.C.H.E., Sprinkhuizen-Kuyper, I.G., Postma, E.O. (2009). Comparison of Active Vision Models. Image and Vision Computing, Volume 27, Issue 4.

Contact information: Guido de Croon, g.c.h.e.decroon@tudelft.nl, Micro Air Vehicle Lab, Delft University of Technology, the Netherlands.

Lucas Paletta is R&D project manager, key researcher and Head of the Human Factors Lab at JOANNEUM RESEARCH Forschungsgesellschaft mbH, DIGITAL – Institute for Information and Communication Technologies, Graz, Austria. He holds a diploma (1996) and PhD (2000) in Computer Science from Graz University of Technology, received a scholarship at Johns Hopkins University, Baltimore (1995), was research assistant at the Institute for Computer Graphics & Vision at Graz University of Technology (1996-1998), and was researcher at the Fraunhofer Institute for Autonomous Intelligent Systems, Germany (1998-2000). Since 2000 he has worked at JOANNEUM RESEARCH as head of a research group, project manager and principal investigator of several national and EU-funded R&D projects. He was founder and chair of several interdisciplinary symposia on attention in cognitive systems (WAPCV / ISACS at ICVS, ECCV, IJCAI, CVPR, IROS). His current research focus is on human-factors-based measurement technologies in the context of intuitive interfaces and computational modelling of executive functions for assistance in human-robot interaction, such as in assembly tasks. A second research focus is on the role of attention in the pervasive assessment of neuropsychological diseases, such as Alzheimer's, with socially assistive robots.

List of Relevant Publications:

  • Dini, A., Murko, C., Paletta, L., Yahyanejad, S., Augsdörfer, U., and Hofbaur, M. (2017). Measurement and Prediction of Situation Awareness in Human-Robot Interaction based on a Framework of Probabilistic Attention. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2017, Vancouver, Canada.
  • Polatsek, P., Benesova, W., Paletta, L., and Perko, R. (2016). Novelty-based Spatiotemporal Saliency Detection for Prediction of Gaze in Egocentric Video, IEEE Signal Processing Letters, 23(3): 394-398, IEEE Press, 2016.
  • Santner, K., Fritz, G., Paletta, L., and Mayer, H. (2013). Visual recovery of saliency maps from human attention in 3D environments. Proc. IEEE International Conference on Robotics and Automation, ICRA 2013, p. 4282-4288, Karlsruhe, Germany, May 6-10, 2013. 
  • Paletta, L., Lerch, A., Kemp, C., Pittino, L., Steiner, J., Panagl, M. Künstner, M., Pszeida, M., and Fellner, M. (2018). Playful Multimodal Training for Persons with Dementia with Executive Function based Diagnostic Tool, Proc. Pervasive Technologies Related to Assistive Environments (PETRA), Corfu, Greece, June 26-29, 2018, ACM Press.
  • Paletta, L., Dini, A., and Pszeida, M. (2019). Emotion Measurement from Attention Analysis on Imagery in Virtual Reality, Proceedings of the AHFE 2019 International Conference on Human Factors and Ergonomics, Washington, DC, 2019, to be published in Springer.
  • Paletta, L., Pszeida, M., Marton, R., and Nauschnegg, B. (2019). Stress Measurement in Multi-Tasking Decision Processes Using Executive Functions Analysis, Proceedings of the AHFE 2019 International Conference on Human Factors and Ergonomics, Washington, DC, 2019, to be published in Springer.