Applied Research & Productivity
Our mission is to push the boundaries of what we can do with our data, building and incubating new capabilities for insights and enabling data-driven decisions. Our applied research team conducts cutting-edge research in multiple areas, including AI/ML, computational social science, experimentation, privacy, and time series/forecasting. Our strategic reporting team identifies and communicates business insights drawn from all our data, including building new data visualizations that communicate complex ideas and insights more efficiently. Our productivity team builds the data tools that support the broader data organization and beyond, making sure all capabilities are scalable and easy to use for all of LinkedIn. Our incubation team is currently focused on building out data science capabilities for our various infrastructure teams, helping them plan and maintain LinkedIn’s infrastructure.
Using Ego-Clusters to Measure Network Effects at LinkedIn
This paper outlines a simple and scalable solution for measuring network effects using ego-network randomization, where a cluster comprises an "ego" (a focal individual) and her "alters" (the individuals she is immediately connected to).
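The clustering idea can be sketched in a few lines: pick an ego, bundle her with her not-yet-assigned alters, and randomize treatment at the cluster level so the ego's outcomes reflect the treatment status of her whole neighborhood. This is a minimal illustration under assumed names (`ego_clusters`, `treat_fraction`), not the paper's production design, which must also handle overlapping neighborhoods (e.g. by sampling non-adjacent egos).

```python
import random

def ego_clusters(adjacency, treat_fraction=0.5, seed=0):
    """Partition members into ego-clusters and assign treatment per cluster.

    Each cluster is an ego plus her immediate alters; treatment is
    assigned to the whole cluster, so the ego experiences a fully
    treated (or fully untreated) neighborhood. Illustrative sketch only.
    """
    rng = random.Random(seed)
    assigned = set()
    clusters = []
    for ego in adjacency:
        if ego in assigned:
            continue  # this member already belongs to an earlier cluster
        alters = [a for a in adjacency[ego] if a not in assigned]
        members = [ego] + alters
        assigned.update(members)
        clusters.append({
            "ego": ego,
            "members": members,
            "treated": rng.random() < treat_fraction,
        })
    return clusters
```

In an analysis, network-effect metrics would then be computed on the egos only, comparing egos in treated clusters against egos in control clusters.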
PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn
Preserving privacy of users is a key requirement of web-scale analytics and reporting applications, and has witnessed a renewed focus in light of recent data breaches and new regulations. We focus on the problem of computing robust, reliable analytics in a privacy-preserving manner, while satisfying product requirements.
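A core building block for this kind of privacy-preserving analytics is the Laplace mechanism: perturb each reported count with calibrated noise, and seed the noise deterministically per query so that repeated identical queries return consistent answers. The sketch below assumes that design; the function name, seeding scheme, and post-processing are illustrative, not the production implementation.

```python
import hashlib
import math

def private_count(true_count, query_key, epsilon=1.0, secret="demo-secret"):
    """Return a differentially private count via the Laplace mechanism.

    Noise is derived deterministically from a keyed hash of the query,
    so asking the same question twice yields the same answer (avoiding
    averaging attacks). Illustrative sketch only.
    """
    digest = hashlib.sha256((secret + query_key).encode()).digest()
    # Map the first 8 bytes of the digest to a uniform value in (0, 1).
    u = (int.from_bytes(digest[:8], "big") + 0.5) / 2**64
    # Inverse-CDF sampling of Laplace(0, 1/epsilon).
    p = u - 0.5
    noise = -math.copysign(1.0, p) * math.log(1 - 2 * abs(p)) / epsilon
    # Round and clamp: reported counts should stay non-negative integers.
    return max(0, round(true_count + noise))
```

Smaller `epsilon` means stronger privacy and noisier counts; rounding and clamping are post-processing steps that do not weaken the guarantee.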
Practical Differentially Private Top-k Selection with Pay-What-You-Get Composition
Differential privacy has become the gold standard for rigorous privacy guarantees in data analytics. One of the primary benefits of differential privacy is that the privacy loss of a computation on a dataset can be quantified. For this work, we hope to extend the use of differential privacy in practical systems to allow analysts to compute the K most frequent elements in a given dataset.
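One way to realize private top-k selection is to perturb each count with Gumbel noise and release the k largest noisy values, which is equivalent to peeling off elements one at a time with the exponential mechanism. The sketch below uses a noise scale of `k/epsilon` so the full release costs roughly `epsilon` under basic composition; it is a simplified illustration, not the paper's exact calibrated mechanism or its pay-what-you-get accounting.

```python
import math
import random

def private_top_k(counts, k, epsilon=1.0, seed=0):
    """Return the k items with the largest Gumbel-perturbed counts.

    One-shot Gumbel top-k matches k sequential draws of the exponential
    mechanism; scale k/epsilon gives total budget ~epsilon under basic
    composition. Simplified sketch only.
    """
    rng = random.Random(seed)
    scale = k / epsilon
    noisy = {}
    for item, count in counts.items():
        u = rng.random() or 1e-12  # guard against log(0)
        # Sample Gumbel(0, scale) by inverse-CDF transform.
        noisy[item] = count + (-scale * math.log(-math.log(u)))
    return sorted(noisy, key=noisy.get, reverse=True)[:k]
```

When the true counts are well separated relative to the noise scale, the private answer coincides with the exact top-k with high probability.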
This Week in Machine Learning & AI with Burcu Baran
In this episode of the TWiML&AI Strata Data conference series, Burcu Baran, senior data scientist, outlines how we manage our machine learning production process.
Detecting Interference: An A/B Test of A/B Tests
The validity of randomized experiments rests on the “stable unit treatment value assumption” (SUTVA), which tends to be violated in the presence of network effects. We leverage a new experimental design for testing whether SUTVA holds, without making any assumptions about how treatment effects may spill over between the treatment and the control group.
SQR: Balancing Speed, Quality and Risk in Online Experiments
In this paper, we build a ramping framework that effectively balances Speed, Quality, and Risk (SQR). We start by identifying the most common mistakes experimenters make, and then introduce the four SQR principles corresponding to the four ramp phases of an experiment.
Influence Maximization with Spontaneous User Adoption
In this paper, we incorporate the realistic scenario of spontaneous user adoption (also referred to as self-activation) into influence propagation, propose the self-activation independent cascade (SAIC) model, and study the related influence maximization problems.
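The model described above can be sketched as a small Monte Carlo simulation: beyond the chosen seed set, every node may activate spontaneously with some probability, and activation then spreads along each edge once, as in the standard independent cascade model. The function and parameter names below are ours, and this is an illustrative sketch of the model rather than the paper's formulation.

```python
import random

def simulate_saic(adjacency, edge_prob, self_prob, seeds, rng=None):
    """One Monte Carlo run of a self-activation independent cascade.

    Nodes in `seeds` start active; every other node self-activates with
    probability `self_prob`. Each newly active node then tries, once, to
    activate each out-neighbor with probability `edge_prob`.
    """
    rng = rng or random.Random()
    active = set(seeds)
    # Spontaneous (self-) activation of non-seed nodes.
    for node in adjacency:
        if node not in active and rng.random() < self_prob:
            active.add(node)
    # Standard independent-cascade propagation from all active nodes.
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for v in adjacency[u]:
                if v not in active and rng.random() < edge_prob:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active
```

Averaging the size of the returned active set over many runs estimates the expected influence of a seed set, the quantity an influence maximization algorithm would optimize.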