\begin{abstract}
Most current distributed machine learning systems scale up model training with a data-parallel architecture that divides the computation of gradients for different samples among workers. We study distributed machine learning from a different motivation, where the information about the same samples, e.g., users and objects, is owned by several parties that wish to collaborate but do not want to share raw data with each other.
We propose an asynchronous stochastic gradient descent (SGD) algorithm for this feature-distributed machine learning (FDML) problem, which jointly learns from distributed features and has theoretical convergence guarantees under bounded asynchrony. Our algorithm does not require parties to share the original features or even local model parameters, thus preserving data locality. The system can also easily incorporate differential privacy mechanisms to provide a higher level of privacy. We implement the FDML system in a parameter server architecture and compare it with fully centralized learning (which violates data locality) and with learning based on only local features, through extensive experiments on both a public dataset, \emph{a9a}, and a large dataset of $5{,}000{,}000$ records and $8700$ decentralized features from three collaborating apps at Tencent: \emph{Tencent MyApp}, \emph{Tencent QQ Browser}, and \emph{Tencent Mobile Safeguard}. Experimental results demonstrate that the proposed FDML system can significantly enhance app recommendation in Tencent MyApp by leveraging user and item features from other apps, while preserving the locality and privacy of features in each individual app to a high degree.
\end{abstract}
