Our data scientists and researchers work to unlock the potential in our data to change the world by empowering professionals to become more productive and successful.

LinkedIn operates the world’s largest professional network on the Internet with more than 433 million members in over 200 countries and territories. This highly structured dataset gives our data scientists and researchers the ability to conduct applied research that fuels LinkedIn’s data-driven products, including search, social graph, and machine learning systems. As a members-first organization, LinkedIn keeps the privacy and security of our members at the forefront of all our research.

LinkedIn’s data scientists and researchers work with huge amounts of data, solve real problems for our members around the world, and publish at major conferences. They work to improve the relevance of our products, contribute to the open source community, and actively pursue research in a number of areas, including:

  • Computational advertising
  • Data & graph mining
  • Machine learning & infrastructure
  • Recommender systems
  • Online experimentation and A/B testing
  • Text mining and sentiment analysis
  • Machine translation and cross-language text analysis
  • Security & spam
  • Information extraction
  • Content understanding
  • Scalable computing paradigms (MapReduce, Spark, etc.)
  • Network visualization

Featured Projects

Jobs

Many of our Talent Solutions products are built around mathematical models that try to answer a simple question: is this opportunity of interest to this member at this time? Answering that question requires a multidisciplinary approach, drawing on tools from machine learning and data mining, and often on insights from psychological and sociological research.

 

Experimentation

We built our internal end-to-end A/B testing platform, called XLNT, to quickly quantify the impact of any A/B test in a scientific and controlled manner across our sites and mobile apps. XLNT not only allows for easy design and deployment of experiments, it also provides automatic analysis, which is crucial in popularizing A/B tests.
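
To make "quantifying the impact" of an A/B test concrete, here is a minimal sketch of one common analysis, a two-sided two-proportion z-test on conversion counts. This page does not describe how XLNT performs its analysis, so the choice of test, the function name two_proportion_z_test, and the sample numbers are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and treatment (B).

    Hypothetical helper for illustration; not part of XLNT.
    Returns (absolute lift, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Made-up counts: 1,000 members per variant.
lift, z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be due to chance alone. A production platform would typically also handle concerns this sketch ignores, such as multiple metrics and repeated looks at the data.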

 

 

Feed

Whether you are catching up on trending news, getting updates from your network, or following a thought leader from our Influencer program, the professional information and insights in the LinkedIn Feed have become central to our member experience.

 

 

 

 

NLP

The NLP team provides natural language processing (NLP) tools, analyses and features for use throughout the entire company. Our mission is to take unstructured text, analyze it along with information from our structured and semi-structured sources, and produce useful structured representations for all of LinkedIn's current and future product areas. 

 

 

Distributed Data Systems

The Distributed Data Systems (DDS) team builds horizontally scalable data storage and streaming systems that serve LinkedIn applications, drawing on a diverse portfolio of technology solutions.

 

 

 

 

Data Network & Analytics (DNA)

Data processing is central to creating many of the data products on linkedin.com, as well as to understanding the performance of our products and businesses. Our analytics platforms, data pipelines, and data applications allow data consumers at LinkedIn to understand the data and create insights that power both internal and external user experiences.

 

Featured Publications

Ambry: LinkedIn's Scalable Geo-Distributed Object Store

Authors: Shadi Abdollahian Noghabi (University of Illinois), Sriram Subramanian (LinkedIn Corp), Priyesh Narayanan (LinkedIn Corp), Sivabalan Narayanan (LinkedIn Corp), Gopalakrishna Holla (LinkedIn Corp), Mammad Zadeh (LinkedIn Corp), Tianwei Li (LinkedIn Corp), Indranil Gupta (University of Illinois at Urbana-Champaign), Roy Campbell (University of Illinois at Urbana-Champaign)

 

Published: SIGMOD 2016

 

Abstract: The infrastructure beneath a worldwide social network has to continually serve billions of variable-sized media objects such as photos, videos, and audio clips....

Truss Decomposition of Probabilistic Graphs: Semantics and Algorithms

Authors: Xin Huang (University of British Columbia), Wei Lu (University of British Columbia, LinkedIn Corp), Laks Lakshmanan (University of British Columbia)

 

Published: SIGMOD 2016

 

Abstract: A key operation in network analysis is the discovery of cohesive subgraphs. The notion of k-truss has gained considerable popularity in this regard, based on its rich structure and efficient computability. However, many complex networks such as social, biological and communication networks feature uncertainty, best modeled using probabilities....