My core research interest is in machine learning for interactive systems that maximize a utility function
by taking actions, in contrast to prediction-oriented machine learning such as supervised learning.
My area of focus is reinforcement learning, including the important subclass known as contextual bandits;
I am also interested in related areas such as large-scale online learning with big data, active learning, and planning.
In the past, I have applied my work to recommendation, Web search, advertising, and conversation systems.
Most of my work can be grouped into several clusters:
More information can be found on my Google Scholar, DBLP, LinkedIn, and Google AI pages.