
      POMDP Model Learning for Human Robot Collaboration

      Preprint

          Abstract

Recent years have seen human-robot collaboration (HRC) quickly emerge as a hot research area at the intersection of control, robotics, and psychology. While most existing work in HRC has focused on either low-level human-aware motion planning or HRC interface design, we are particularly interested in a formal design of HRC with respect to high-level complex missions, where it is of critical importance to obtain an accurate and, at the same time, tractable human model. Instead of assuming the human model is given, we ask whether it is reasonable to learn human models from observed perception data, such as the gestures, eye movements, and head motions of the human in question. As our initial step, we adopt a partially observable Markov decision process (POMDP) model in this work, since mounting evidence from psychology studies suggests Markovian properties of human behaviors. In addition, the POMDP provides a general modeling framework for sequential decision making in which states are hidden and actions have stochastic outcomes. Distinct from the majority of the POMDP model learning literature, we do not assume that the states, the transition structure, or a bound on the number of states of the POMDP are given. Instead, we use a Bayesian non-parametric learning approach to identify the potential human states from data. We then adopt an approach inspired by probably approximately correct (PAC) learning to obtain not only an estimate of each transition probability but also a confidence interval associated with that estimate. As a result, the performance of applying the control policy derived from the estimated model is guaranteed to be sufficiently close to that obtained under the true model. Finally, data collected from a driver-assistance test-bed are used to train the model, illustrating the effectiveness of the proposed learning method.
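
The PAC-inspired step in the abstract pairs empirical transition estimates with confidence intervals. The sketch below is one illustrative reading of that idea, not the authors' implementation: it assumes the transition data arrive as (state, action, next-state) index triples and uses a per-entry Hoeffding-style half-width at confidence level `delta`; the function name and data layout are placeholders introduced for this example.

```python
# Illustrative sketch (not the paper's implementation): empirical transition
# estimates plus Hoeffding-style confidence half-widths, in the spirit of the
# PAC-inspired step described in the abstract. Data format, `delta`, and the
# function name are assumptions made for this example.
import numpy as np

def estimate_transitions(triples, n_states, n_actions, delta=0.05):
    """Estimate P(s' | s, a) and a per-entry confidence half-width.

    triples: iterable of (s, a, s_next) integer index tuples observed in data.
    Returns (P_hat, eps): P_hat[s, a] is the empirical next-state distribution,
    and each entry of P_hat[s, a] lies within eps[s, a] of the true probability
    with probability at least 1 - delta (per visited state-action pair).
    """
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in triples:
        counts[s, a, s_next] += 1.0

    visits = counts.sum(axis=2, keepdims=True)      # visit count per (s, a)
    uniform = np.full_like(counts, 1.0 / n_states)  # fallback for unvisited pairs
    P_hat = np.divide(counts, visits, out=uniform, where=visits > 0)

    n = visits.squeeze(-1)
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * np.maximum(n, 1.0)))
    eps[n == 0] = np.inf                            # no guarantee without data
    return P_hat, eps
```

A policy computed from `P_hat` can then be checked against `eps` to decide whether more demonstration data are needed for particular state-action pairs before the guarantee is tight enough to be useful.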


Most cited references (9)


          Pseudorandomness


            Joint modeling of multiple time series via the beta process with application to motion capture segmentation


              Efficient Model Learning for Human-Robot Collaborative Tasks

We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative of each type, by employing an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user who was not included in the training set, and can compute a policy for the robot that is aligned with the preferences of this new user and robust to deviations of the human actions from prior demonstrations. Finally, we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.
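
For context, the referenced paper's first step groups demonstrated action sequences into "human types" with an unsupervised algorithm. The minimal sketch below is only an illustration of that idea: representing each demonstration by its action-frequency histogram and clustering with k-means are assumptions of this example, not the paper's specific method.

```python
# Illustrative sketch: cluster demonstrated action sequences into candidate
# human types. Action-frequency features and k-means are assumptions made
# here, not the referenced paper's unsupervised algorithm.
import numpy as np
from sklearn.cluster import KMeans

def cluster_demonstrations(action_sequences, n_actions, n_types):
    """Group demonstrations (lists of integer action indices) into types.

    Returns one cluster label per demonstrated sequence.
    """
    # Featurize each demonstration as a normalized action-frequency vector.
    features = np.zeros((len(action_sequences), n_actions))
    for i, seq in enumerate(action_sequences):
        for a in seq:
            features[i, a] += 1.0
        features[i] /= max(len(seq), 1)

    return KMeans(n_clusters=n_types, n_init=10).fit_predict(features)
```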

                Author and article information

Date: 29 March 2018
Article: 1803.11300
Record ID: d3023710-422e-4324-b378-d41e92279c77
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Custom metadata: cs.HC cs.RO
