\begin{abstract} % is it too wordy?
We extend the QLBS model by considering a large trader
whose transactions have a permanent impact on the evolution
of the exchange rate process and therefore affect the price
of contingent claims written on that process.
Using a hypothetical limit order book, we quantify how such
transactions alter the exchange rate.
This leads us to define the quoted exchange rate process,
for which we assume the existence of a postulated hedging strategy.
Given the quoted exchange rate and the postulated hedging strategy,
we find an optimal hedging strategy through
batch-mode reinforcement learning, accounting for the fact that
the trader alters the course of the exchange rate process.
We assume that the trader has its own notion of a fair price,
and we pose the problem as finding a hedging strategy
that incurs much lower transaction costs while delivering a price
that converges to the trader's fair price.
We show that our approach yields an optimal hedging strategy
with much lower transaction costs, and that convergence to the
fair price is obtained under sensible parameter choices.
\end{abstract}