Essays on Artificial Intelligence, Reinforcement Learning, and Structural Econometrics
The first chapter presents an empirical study of airline pricing. Airlines offer passengers different ways to fly from origin to destination via complex hub-and-spoke networks. Despite the increasing use of sophisticated revenue management pricing algorithms, I argue that computational complexity precludes optimal dynamic pricing over an airline's entire network of itineraries. Airlines must instead rely on heuristics that approximate optimal prices for individual flights while ignoring interactions across the network as a whole. I estimate a dynamic structural model of airline pricing with a novel high-frequency data set on flight prices and seat availability. I relax the assumption of optimality and resolve classic econometric issues of endogeneity, censoring, and truncation to recover the airline's beliefs about its stochastic demand process. I formulate and solve the optimal dynamic network pricing problem for simple networks and elucidate the channels through which pricing externalities are transmitted in the network. I compute counterfactual network-perfect pricing policies, under which airlines could increase revenue by 2-10%, and analyze implications for consumer welfare.

In the second chapter, I propose a distributed randomized policy iteration algorithm for infinite horizon dynamic programming problems in which the control at each stage is m-dimensional. The traditional policy iteration algorithm involves performing a minimization over an m-dimensional constraint set and has a computational complexity that increases exponentially in m, resulting in an intractable combinatorial search problem. In each iteration, the algorithm performs a series of sequential minimizations followed by policy evaluation and policy improvement using the policy that attains the minimum cost over the sequential minimizations. The algorithm is well-suited for parallel computation, has a complexity that increases linearly in m, and converges to an agent-by-agent optimal policy. I characterize sufficient conditions under which the algorithm generates a globally optimal policy that coincides with that obtained from standard policy iteration.
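
The contrast between one joint minimization over all m control components and a sequence of one-component minimizations can be illustrated with a small sketch. The Python code below is not the algorithm proposed in the thesis: it is a simplified, deterministic, single-process variant of agent-by-agent policy iteration on a randomly generated discounted-cost problem. The function names, the tabular arrays `P` and `G`, the discount factor 0.9, and the toy problem sizes are all assumptions made for illustration.

```python
import numpy as np

def evaluate(policy, P, G, dims, gamma):
    """Exact policy evaluation: solve (I - gamma * P_mu) J = g_mu."""
    n = G.shape[0]
    a = np.ravel_multi_index(tuple(policy.T), dims)  # joint-action index per state
    P_mu = P[a, np.arange(n)]                        # row s is P[a[s], s, :]
    g_mu = G[np.arange(n), a]
    return np.linalg.solve(np.eye(n) - gamma * P_mu, g_mu)

def agent_by_agent_improve(policy, J, P, G, dims, gamma):
    """Improve one control component at a time, holding the others fixed."""
    n, m = policy.shape
    new = policy.copy()
    for s in range(n):
        u = new[s].copy()
        for i in range(m):                           # sequential minimizations
            q = []
            for c in range(dims[i]):
                u[i] = c
                a = np.ravel_multi_index(tuple(u), dims)
                q.append(G[s, a] + gamma * P[a, s] @ J)
            u[i] = int(np.argmin(q))                 # keep the best choice for agent i
        new[s] = u
    return new

def agent_by_agent_policy_iteration(P, G, dims, gamma=0.9, max_iters=100):
    n = G.shape[0]
    policy = np.zeros((n, len(dims)), dtype=int)
    for _ in range(max_iters):
        J = evaluate(policy, P, G, dims, gamma)
        new = agent_by_agent_improve(policy, J, P, G, dims, gamma)
        if np.array_equal(new, policy):              # no single-agent change helps
            break
        policy = new
    return policy, evaluate(policy, P, G, dims, gamma)

# Hypothetical toy problem: 5 states, m = 3 agents with 2 choices each.
rng = np.random.default_rng(0)
n, dims = 5, (2, 2, 2)
A = int(np.prod(dims))
P = rng.random((A, n, n))
P /= P.sum(axis=2, keepdims=True)                    # each row is a distribution
G = rng.random((n, A))                               # stage cost g(s, u)
policy, J = agent_by_agent_policy_iteration(P, G, dims)
```

Each improvement step in this sketch evaluates `sum(dims)` candidate controls per state rather than the `prod(dims)` a joint minimization would require, which is the linear versus exponential dependence on m described above. The toy stores a full table over joint actions only for convenience. The policy it returns cannot be improved by changing any single component at any state, which need not make it globally optimal.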