My core research interest is in machine learning for interactive systems that maximize a utility function by taking actions, in contrast to prediction-oriented machine learning such as supervised learning. My main focus is reinforcement learning, including the important subclass known as contextual bandits; I am also interested in related areas such as large-scale online learning with big data, active learning, and planning. In the past, I have applied my work to recommendation, Web search, advertising, conversation systems, and spam detection.
Most of my work can be grouped into several clusters:
More information can be found via Google Scholar, DBLP, LinkedIn, my Google AI page, and my full CV. Here is a short bio.