KDD 2017 | LinkedIn Engineering

LinkedIn Tutorials at KDD 2017

See the full list on the KDD 2017 website

Deep Learning for Personalized Search and Recommender Systems

Nadia Fawaz, Saurabh Kataria, Benjamin Le, Ganesh Venkataraman, Liang Zhang (LinkedIn)

Deep learning has been widely successful in solving complex tasks such as image recognition (ImageNet), speech recognition, and machine translation. In the area of personalized recommender systems, deep learning has started showing promising advances in recent years. The key to the success of deep learning in personalized recommender systems is its ability to learn distributed representations of users’ and items’ attributes in a low-dimensional dense vector space and combine these to recommend relevant items to users. To address scalability, the implementation of a recommendation system at web scale often leverages components from information retrieval systems, such as inverted indexes, where a query is constructed from a user’s attributes and context, and learning-to-rank techniques. Additionally, it relies on machine learning models, such as collaborative filtering, to predict the relevance of items. In this tutorial, we present ways to leverage deep learning to improve recommender systems. The tutorial is divided into three parts: (1) In the first part, we will present an overview of concepts in deep learning that are pertinent to recommender systems, including sequence modeling, word embedding, and named entity recognition. (2) In the second part, we will present how these fundamental building blocks can be used to improve a recommender system at scale. (3) The third part presents a few case studies from large-scale recommender systems at LinkedIn and some of the challenges we faced while getting deep learning to work in production.

LinkedIn Workshops at KDD 2017

See the full list on the KDD 2017 website

2017 KDD WISDOM Workshop

Yongzheng Zhang (LinkedIn), Erik Cambria (NTU, Singapore), Bing Liu (UIC), Yunqing Xia (Microsoft)

The KDD WISDOM (Workshop on Issues of Sentiment Discovery and Opinion Mining) series aims to explore how the wisdom of the crowds is affecting (and will affect) the evolution of the Web and of the businesses gravitating around it. It provides an international forum for researchers in the field of machine learning for opinion mining and sentiment analysis to share information on their latest investigations in social information retrieval and their applications in both academic research and industry. The broader context of the workshop encompasses opinion mining, social media marketing, information retrieval, and natural language processing. The 6th KDD WISDOM workshop features invited talks plus peer-reviewed papers on the latest research and applications in this area.

Read more: http://sentic.net/wisdom/.

2017 AdKDD & TargetAd Workshop

Kun Liu (LinkedIn), Abraham Bagherjeiran (A9), Mihajlo Grbovic (Airbnb), Kuang-Chih Lee (Yahoo), Vladan Radosavljevic (Uber), Suju Rajan (Criteo)

There have been a total of 10 AdKDD and TargetAd workshops to date, organized every year since 2007 and focused on highlighting state-of-the-art advances in computational advertising. All the workshops were well attended, often with standing room only, and very well received by both the academic community and the advertising industry. Motivated by these successes, for 2017 we are happy to announce a joint edition of AdKDD and TargetAd, which we believe will bring an even stronger program than in past years. We expect to host 12 paper presentations and 4 featured invited talks by Susan Athey (Stanford University), Thorsten Joachims (Cornell University), Randall Lewis (Netflix), and Alex Smola (Amazon). We look forward to seeing you in Halifax to discuss the past, present, and future of computational advertising!

Workshop on Advancing Education with Data

Andrew Lan (Princeton University), Christopher G. Brinton (Zoomi), Jiquan Ngiam (Coursera), Mung Chiang (Princeton University), Richard Baraniuk (Rice University), Roshan Sumbaly (Coursera), Shivani Rao (LinkedIn)

Online education has gained popularity in recent years, reaching learners from K-12 to lifelong learning. The data collected through these technologies presents a golden opportunity to develop data-driven methods to improve education. This workshop aims to bring together experts in education, machine learning, data mining, natural language processing, and HCI to lay out the challenges and opportunities in the use of data science tools in online learning. In particular, we aim to bridge the gap between academic research on machine learning algorithms for educational data analysis and the challenges encountered in real-world industry applications, especially in the context of lifelong learning.

LinkedIn Research Papers at KDD 2017

See the full list on the KDD 2017 website

LiJAR: A System for Job Application Redistribution towards Efficient Career Marketplace

Fedor Borisyuk (Facebook), Liang Zhang, Krishnaram Kenthapadi (LinkedIn)

LinkedIn's job recommendations product is a key vehicle for efficient matching between potential candidates and job postings. We have observed in practice that a subset of job postings receive too many applications (due to several reasons such as the popularity of the company, nature of the job, etc.), while some other job postings receive too few applications. Both cases can result in job poster dissatisfaction and may lead to discontinuation of the associated job posting contracts. At the same time, if too many job seekers compete for the same job posting, each job seeker's chance of getting this job will be reduced. In the long term, this reduces the chance of users finding jobs that they really like on the site.

In this paper, we propose LiJAR, LinkedIn's Job Applications Forecasting and Redistribution system, with the goal of ensuring that job postings do not receive too many or too few applications, while still providing job recommendations to users with the same level of relevance. Our production deployment of this system as part of LinkedIn's job recommendation engine has resulted in a significant increase in user engagement for underserved jobs (6.5%) without affecting user engagement in terms of the total number of job applications, thereby addressing the needs of job seekers and job providers simultaneously.

Detecting Network Effects: Randomizing Over Randomized Experiments

Martin Saveski (MIT), Jean Pouget-Abadie (Harvard), Guillaume Saint-Jacques (MIT), Weitao Duan, Souvik Ghosh, Ya Xu (LinkedIn), Edo Airoldi (Harvard)

Randomized experiments, or A/B tests, are the standard approach for evaluating the causal effects of new product features, i.e., treatments. The validity of these tests rests on the “stable unit treatment value assumption” (SUTVA), which implies that the treatment only affects the behavior of treated users, and does not affect the behavior of others. Violations of SUTVA, common in features that exhibit network effects, result in inaccurate estimates of the causal effect of treatment. In this paper, we leverage a new experimental design for testing whether SUTVA holds, without making any assumptions on how treatment effects may spill over between the treatment and the control group. To achieve this, we simultaneously run both a completely randomized and a cluster-based randomized experiment, and then we compare the resulting estimates. We present a statistical test for measuring the significance of their difference and offer theoretical bounds on the Type I error rate. We provide practical guidelines for implementing our methodology on large-scale experimentation platforms. Importantly, the proposed methodology can be applied to settings in which a network is not necessarily observed but, if available, can be used in the analysis. Finally, we deploy this design to LinkedIn’s experimentation platform and apply it to two online experiments, highlighting the presence of network effects and bias in standard A/B testing approaches in a real-world setting.
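
The comparison of the two designs described above can be illustrated with a minimal sketch. It is not the paper's test: it assumes a plain difference-in-means estimator in each arm, treats the two arms as independent, and uses a normal approximation for the p-value. The function names `diff_in_means` and `compare_designs` are hypothetical, and for the cluster-based arm the inputs would need to be cluster-level aggregates rather than per-user outcomes.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def diff_in_means(treated: Sequence[float], control: Sequence[float]) -> Tuple[float, float]:
    """Return the difference-in-means estimate and an estimate of its variance.

    For a cluster-based arm, pass one aggregated outcome per cluster (assumption).
    """
    estimate = mean(treated) - mean(control)
    var = variance(treated) / len(treated) + variance(control) / len(control)
    return estimate, var


def compare_designs(
    completely_randomized: Tuple[float, float],
    cluster_randomized: Tuple[float, float],
) -> Tuple[float, float, float]:
    """Compare two (estimate, variance) pairs from the two experimental arms.

    Returns (difference, z statistic, two-sided p-value). Summing the variances
    assumes the arms are independent; the p-value assumes approximate normality.
    """
    (est_cr, var_cr), (est_cbr, var_cbr) = completely_randomized, cluster_randomized
    delta = est_cr - est_cbr
    z = delta / math.sqrt(var_cr + var_cbr)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return delta, z, p_value
```

Under SUTVA both designs estimate the same effect, so a large standardized difference between them is evidence of interference between users.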

BDT: Boosting Decision Tables for High Accuracy and Scoring Efficiency

Yin Lou (Airbnb), Mikhail Obukhov (LinkedIn)

In this work we present gradient boosted decision tables (BDTs). A d-dimensional decision table is essentially a mapping from a sequence of d boolean tests to a real value. We present novel algorithms to fit decision tables and develop novel data structures to support fast scoring. In our experiments, we observe that BDT improves both accuracy and scoring efficiency. We complement our experimental evaluation with a bias-variance analysis that explains how different weak models influence the predictive power of the boosted ensemble. Our experiments suggest gradient boosting with randomly backfitted decision tables distinguishes itself as the most accurate method on a number of classification and regression problems. We have deployed a BDT model to the LinkedIn news feed system and achieved a significant lift in key metrics.
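
To make the definition of a decision table concrete, here is a minimal scoring sketch. It covers only lookup, not the paper's fitting algorithms or data structures. The `DecisionTable` class, the threshold form of each boolean test (`x[feature] <= threshold`), and the bit ordering are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


class DecisionTable:
    """A d-dimensional decision table: d boolean tests index into 2**d real values."""

    def __init__(self, tests: Sequence[Tuple[int, float]], values: Sequence[float]):
        # Each test is (feature_index, threshold) and evaluates x[feature_index] <= threshold.
        # This threshold form is an assumption made for the sketch.
        if len(values) != 2 ** len(tests):
            raise ValueError("need exactly 2**d values for d tests")
        self.tests = list(tests)
        self.values = list(values)

    def score(self, x: Sequence[float]) -> float:
        index = 0
        for feature, threshold in self.tests:
            # Shift in one bit per test; the first test becomes the most significant bit.
            index = (index << 1) | int(x[feature] <= threshold)
        return self.values[index]


def ensemble_score(tables: List[DecisionTable], x: Sequence[float]) -> float:
    """Score a boosted ensemble as the sum of its tables' outputs."""
    return sum(table.score(x) for table in tables)
```

Because every example evaluates the same d tests and then performs a single array lookup, scoring one table involves no data-dependent branching through a tree structure.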

Data-Driven Reserve Prices for Social Advertising Auctions at LinkedIn

Tingting Cui (Houzz), Lijun Peng, David Pardoe, Kun Liu, Deepak Agarwal, Deepak Kumar (LinkedIn)

Online advertising auctions constitute an important source of revenue for search engines such as Google and Bing, as well as social networks such as Facebook, LinkedIn, and Twitter. We study the problem of setting the optimal reserve price in a Generalized Second Price auction, guided by auction theory with suitable adaptations to social advertising at LinkedIn. Two types of reserve prices are deployed: one at the user level, which is kept private by the publisher, and the other at the audience segment level, which is made public to advertisers. We demonstrate through field experiments the effectiveness of this reserve price mechanism in promoting demand growth, increasing ad revenue, and improving advertiser experience.
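
For readers unfamiliar with the mechanism, the following sketch shows how a reserve price enters a textbook Generalized Second Price auction. It is a simplification, not LinkedIn's production auction: it ranks by bid alone (no quality or click-through adjustment), uses a single reserve rather than the user-level and segment-level reserves described above, and the function name `gsp_with_reserve` is hypothetical.

```python
from typing import Dict, List, Tuple


def gsp_with_reserve(
    bids: Dict[str, float], num_slots: int, reserve: float
) -> List[Tuple[str, float]]:
    """Allocate ad slots by descending bid and charge each winner a GSP price.

    Bids below the reserve are dropped. Each winner pays the larger of the next
    eligible bid and the reserve, so the reserve acts as a price floor.
    """
    eligible = sorted(
        ((name, bid) for name, bid in bids.items() if bid >= reserve),
        key=lambda item: item[1],
        reverse=True,
    )
    results = []
    for position, (name, _bid) in enumerate(eligible[:num_slots]):
        if position + 1 < len(eligible):
            next_bid = eligible[position + 1][1]
        else:
            next_bid = 0.0  # no eligible bidder below: only the reserve applies
        results.append((name, max(next_bid, reserve)))
    return results
```

With bids of 5.0, 3.0, and 1.0, two slots, and a reserve of 2.0, the lowest bid is excluded; the top bidder pays 3.0 and the second pays the reserve of 2.0 rather than 1.0, which is how a reserve can raise revenue.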