\begin{abstract}
We propose a data-driven framework for modeling and optimizing
human-machine interaction processes, e.g., systems aimed at assisting
humans in decision-making or learning, workload allocation, and
interactive advertising.
This is a challenging problem for several reasons.
First, human behavior is hard to model or infer, as it may reflect
biases, long-term memory, and sensitivity to sequencing, i.e.,
transience and exponential complexity in the length of the interaction.
Second, due to the interactive nature of such processes, the machine
policy used to engage with the human may bias data-driven inferences.
Finally, in choosing machine policies that optimize interaction rewards,
one must, on the one hand, avoid being overly sensitive to
error/variability in the estimated human model and, on the other, avoid
being overly deterministic/predictable, which may result in poor human
`engagement' in the interaction.
To meet these challenges, we propose a robust approach, based on the
maximum entropy principle, which iteratively estimates human behavior
and optimizes the machine policy: the Alternating Entropy-Reward Ascent
(AREA) algorithm.
We characterize AREA in terms of its space and time complexity and its
convergence. We also provide an initial validation based on synthetic
data generated by an established noisy nonlinear model for human
decision-making.
\end{abstract}
