1: \begin{abstract}
2: %{\hao (update the phrasing in abstract based on intro/conclusions)}
3: Effectively operating an electric vehicle charging station (EVCS) is crucial for enabling the rapid transition to electrified transportation. {\KB By utilizing the flexibility of EV charging needs, the EVCS can reduce the total electricity cost of meeting the EV demand.} When this problem is solved using reinforcement learning (RL), however, the dimension of the state/action spaces grows with the number of EVs, and thus becomes very large and time-varying. This dimensionality issue degrades the efficiency and convergence performance of generic RL algorithms. To address it, we advocate developing aggregation schemes for the state and action spaces according to the urgency of EV charging, as measured by its laxity. A least-laxity-first (LLF) rule is used so that only the total charging power of the EVCS needs to be considered, while ensuring the feasibility of individual EV schedules. In addition, we propose an equivalent state aggregation that is guaranteed to attain the same optimal policy. Using the proposed aggregation scheme, the policy gradient method is applied to find the best parameters of a linear Gaussian policy. Numerical tests demonstrate that the proposed representation approaches improve both the total reward and the policy efficiency over an existing approximation-based method.
4: %estimates Q-function with binary feature functions, since the equivalence of proposed MDP and the original MDP leads to less approximation. In addition, the RL parameter results imply that we can further aggregate the state by aggregating large numbers of laxity levels with very small loss of optimality.
5: \end{abstract}
6: