My main interest is not just whether a learning algorithm works, but what kind of information the agent receives and what kind of world that information makes visible. In many of the problems I care about, what the agent observes and what the model assumes matter more than the particular update rule used to train it. That thread connects most of the work here, from cooperation problems in reinforcement learning to execution problems in quantitative finance and environmental systems.
Current Work
- Information design and credit assignment in multi-agent cooperation
- Optimal execution with predictive alpha signals
Selected Publications
Abstract
A full-factorial experiment in the Public Goods Game shows that information regime and incentive strength explain 85.8% of cooperation-rate variance, while algorithm choice accounts for just 3.8%. Agents with the least information cooperate most (83% vs 42% under full observation), an effect attributable to state-space compression. TreeSHAP and Shapley-variance decomposition confirm that information structure, not algorithm selection, is the primary design lever for cooperation.
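For readers unfamiliar with the setup, one round of the game can be sketched as below. This is a minimal illustration that assumes the standard linear Public Goods Game payoff; the abstract does not state the payoff form or parameters used in the experiment, so `endowment`, `multiplier`, and both function names are hypothetical.

```python
def public_goods_payoffs(contributions, endowment, multiplier):
    """Payoffs for one round of a linear Public Goods Game.

    Assumption (not from the paper): each agent keeps what it does not
    contribute, and the pooled contributions are multiplied and split
    equally among all agents.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n  # equal share of the scaled pool
    return [endowment - c + share for c in contributions]


def cooperation_rate(contributions, endowment):
    """Fraction of the total endowment that was contributed to the pool."""
    return sum(contributions) / (endowment * len(contributions))


# One full contributor and three free riders (hypothetical numbers).
contribs = [10.0, 0.0, 0.0, 0.0]
payoffs = public_goods_payoffs(contribs, endowment=10.0, multiplier=2.0)
rate = cooperation_rate(contribs, endowment=10.0)
```

In this example the free riders earn more than the contributor, which is the tension that makes cooperation a learning problem rather than a trivial optimum.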
Abstract
The execution of large portfolio transactions requires balancing market impact and adverse price drift. The Almgren-Chriss (2001) framework provides a mean-variance trade-off for martingale price processes, but practitioners often trade on short-term alpha signals that a martingale model cannot represent. This paper re-evaluates the optimal liquidation problem using stochastic optimal control. By incorporating a mean-reverting alpha signal into the price dynamics, we derive a closed-form solution using the Hamilton-Jacobi-Bellman (HJB) equation. The resulting optimal trading rate is an affine function of the current inventory and the predictive signal, which yields a trajectory that adjusts execution speed to capture transient alpha. This work provides a transparent and additive framework for institutional execution desks.
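The shape of that policy can be sketched in a few lines. This is a rough illustration only: the coefficients `a` and `b` are hypothetical constants rather than the paper's closed-form (generally time-dependent) solution, the alpha signal is assumed to follow an Ornstein-Uhlenbeck process discretised with an Euler step, and the sign convention (a seller slows down when alpha is positive) is also an assumption.

```python
import math
import random


def simulate_liquidation(q0, a, b, kappa, sigma, dt, n_steps, seed=0):
    """Simulate inventory under an affine sell rate v = a*q - b*alpha.

    Hypothetical parameters (not taken from the paper):
      a      sensitivity of the sell rate to remaining inventory
      b      sensitivity of the sell rate to the alpha signal
      kappa  mean-reversion speed of the alpha signal
      sigma  volatility of the alpha signal
    """
    rng = random.Random(seed)
    q, alpha = q0, 0.0
    path = [q]
    for _ in range(n_steps):
        rate = a * q - b * alpha  # affine in inventory and signal
        q -= rate * dt            # inventory falls at the sell rate
        # Euler step of the mean-reverting (OU) alpha signal.
        alpha += -kappa * alpha * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(q)
    return path


# With b = 0 the signal is ignored and inventory decays geometrically.
baseline = simulate_liquidation(100.0, a=0.5, b=0.0, kappa=1.0, sigma=0.2, dt=0.1, n_steps=10)
# With b > 0 the schedule tilts with the signal path.
tilted = simulate_liquidation(100.0, a=0.5, b=5.0, kappa=1.0, sigma=0.2, dt=0.1, n_steps=10)
```

Setting `b = 0` gives a signal-free schedule to compare against, so the difference between the two paths isolates the effect of the alpha term.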
If any of this is close to a problem you are working on, I am reachable by email, or see Work With Me.