My core research interest is machine learning for interactive systems that maximize a utility function by taking actions, in contrast to prediction-oriented machine learning such as supervised learning. My main focus is reinforcement learning, including the important subclass known as contextual bandits; I am also interested in related areas such as large-scale online learning with big data, active learning, and planning. In the past, I have applied my work to recommendation, Web search, advertising, conversation systems, and spam detection.
More information can be found on Google Scholar, DBLP, LinkedIn, my Google AI page, and in my full CV. Here is a short bio.