Essays on Artificial Intelligence, Reinforcement Learning, and Structural Econometrics
Restricted Access
Creator
Zhang, Weipeng
Advisor
Rust, John
Abstract
The first chapter presents an empirical study of airline pricing. Airlines offer passengers different ways to fly from origin to destination via complex hub-and-spoke networks. Despite the increasing use of sophisticated revenue management pricing algorithms, I argue that computational complexity precludes optimal dynamic pricing over an airline's entire network of itineraries, forcing airlines to rely on heuristics that approximate optimal prices for individual flights but ignore interactions over the network as a whole. I estimate a dynamic structural model of airline pricing with a novel high-frequency data set on flight prices and seat availability. I relax the assumption of optimality and resolve classic econometric issues of endogeneity, censoring, and truncation to recover the airline's beliefs about its stochastic demand process. I formulate and solve the optimal dynamic network pricing problem for simple networks and elucidate the channels through which pricing externalities are transmitted through the network. I compute counterfactual network-perfect pricing policies, under which airlines could increase revenue by 2-10%, and analyze the implications for consumer welfare.
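To make the dynamic pricing problem concrete, the following is a minimal, hypothetical sketch (not the dissertation's estimated model) of single-leg pricing solved by backward induction. The horizon, capacity, arrival rate, price grid, and willingness-to-pay distribution are all illustrative assumptions.

```python
# Hypothetical illustration only: backward induction for single-leg dynamic
# pricing with Bernoulli customer arrivals and an exponential willingness-to-pay.
# All parameters (T, C, arrival_rate, price_grid, wtp_scale) are made up.
import math

T = 30                 # selling periods before departure
C = 10                 # seat capacity
arrival_rate = 0.8     # probability a customer arrives in a given period
price_grid = [100.0 + 25.0 * k for k in range(8)]
wtp_scale = 150.0      # scale of the assumed exponential willingness-to-pay

def purchase_prob(price):
    # An arriving customer buys if willingness-to-pay exceeds the posted price.
    return math.exp(-price / wtp_scale)

# V[t][c] = expected revenue-to-go with c seats left and t periods remaining.
V = [[0.0] * (C + 1) for _ in range(T + 1)]
policy = [[None] * (C + 1) for _ in range(T + 1)]

for t in range(1, T + 1):
    for c in range(1, C + 1):
        best_value, best_price = -math.inf, None
        for p in price_grid:
            sale = arrival_rate * purchase_prob(p)
            value = sale * (p + V[t - 1][c - 1]) + (1.0 - sale) * V[t - 1][c]
            if value > best_value:
                best_value, best_price = value, p
        V[t][c], policy[t][c] = best_value, best_price

print("opening price with full plane:", policy[T][C])
print("expected revenue:", round(V[T][C], 2))
```

A network version of this problem couples legs through shared capacity, which is the source of the pricing externalities discussed in the chapter.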
In the second chapter, I propose a distributed randomized policy iteration algorithm for infinite-horizon dynamic programming problems in which the control at each stage is m-dimensional. The traditional policy iteration algorithm involves performing a minimization over an m-dimensional constraint set and has a computational complexity that increases exponentially in m, resulting in an intractable combinatorial search problem. In each iteration, the proposed algorithm performs a series of sequential minimizations followed by policy evaluation and policy improvement using the policy that attains the minimum cost over the sequential minimizations. The algorithm is well suited for parallel computation, has a complexity that increases linearly in m, and converges to an agent-by-agent optimal policy. I characterize sufficient conditions under which the algorithm generates a globally optimal policy that coincides with the one obtained from standard policy iteration.
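The toy sketch below illustrates the agent-by-agent improvement idea under stated assumptions: each policy-improvement pass minimizes over one control coordinate at a time while holding the others fixed, so the per-state search grows linearly rather than exponentially in m. The MDP (states, costs, transitions) is invented for illustration and is not the chapter's algorithm as implemented.

```python
# Hypothetical sketch of agent-by-agent (coordinate-wise) policy iteration.
# Each of the m control components is improved in sequence, avoiding a search
# over all 2**m joint controls. The MDP below is illustrative only.
import numpy as np

N, m, gamma = 12, 4, 0.95          # states, control dimension, discount factor
rng = np.random.default_rng(0)
weights = rng.uniform(0.5, 2.0, size=m)

def stage_cost(x, u):
    # Per-component control costs plus a coupling penalty on the joint control.
    return float(weights @ u) + 0.1 * (x - sum(u)) ** 2

def next_state(x, u):
    return (x + sum(u)) % N        # deterministic transition for simplicity

def evaluate(policy, sweeps=300):
    # Iterative policy evaluation: repeatedly apply the fixed-policy Bellman map.
    J = np.zeros(N)
    for _ in range(sweeps):
        J = np.array([stage_cost(x, policy[x]) + gamma * J[next_state(x, policy[x])]
                      for x in range(N)])
    return J

def improve_agent_by_agent(J, policy):
    # Sequential minimizations: sweep agents 1..m, optimizing one coordinate of
    # the control at each state while the remaining coordinates stay fixed.
    new_policy = [list(u) for u in policy]
    for x in range(N):
        for i in range(m):
            best_val, best_ui = np.inf, new_policy[x][i]
            for ui in (0, 1):
                u = list(new_policy[x])
                u[i] = ui
                val = stage_cost(x, u) + gamma * J[next_state(x, u)]
                if val < best_val:
                    best_val, best_ui = val, ui
            new_policy[x][i] = best_ui
    return [tuple(u) for u in new_policy]

policy = [tuple([0] * m) for _ in range(N)]
for _ in range(20):
    J = evaluate(policy)
    new_policy = improve_agent_by_agent(J, policy)
    if new_policy == policy:       # converged to an agent-by-agent optimum
        break
    policy = new_policy
print("agent-by-agent optimal control at state 0:", policy[0])
```

Because each agent's minimization uses only the current value function and the other agents' most recent controls, the per-agent steps can be distributed across processors, which is what keeps the overall complexity linear in m.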
Description
Ph.D.
Permanent Link
http://hdl.handle.net/10822/1064599
Date Published
2022
Type
Embargo Lift Date
2024-06-21
Publisher
Georgetown University
Extent
159 leaves
Collections
Metadata
Related items
Showing items related by title, author, creator and subject.
The Effects of Social Reward on Reinforcement Learning
D'Mell, Anila (2012-05-01)
Learning and changing behavior based on feedback, referred to as reinforcement learning, is an important method by which people plan actions in order to maximize reward. This study aimed to examine the effects of social ...