• Detecting and Preventing Abuse on LinkedIn Using Isolation Forests

    August 13, 2019

    The Anti-Abuse AI Team at LinkedIn creates, deploys, and maintains models that detect and prevent various types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers. There are several unique challenges we face when using machine learning to prevent abuse on a large professional network, including:...

  • Who Depends On Me? Serving Dependency Queries at Scale

    August 8, 2019

    Co-authors: Yu Li, Szymon Gizecki, and Chinmaya Dattathri

    To ensure we have significant flexibility in how our teams collaborate, our trunk-based engineering development workflow manages dependencies at the binary level instead of the source level. This requires very efficient and sophisticated management of the resulting dependency graph, and we discussed our...

  • Fairness, Privacy, and Transparency by Design in AI/ML Systems

    July 26, 2019

    Co-authors: Stuart Ambler, Ahsan Chudhary, Mark Dietz, Sahin Cem Geyik, Krishnaram Kenthapadi, Ian Koeppe, Varun Mithal, Guillaume Saint-Jacques, Amir Sepehri, Thanh Tran, and Sriram Vasudevan

    Editor’s note: A shorter version of this article was originally posted by Krishnaram Kenthapadi on LinkedIn.

    Introduction

    How do we take fairness and transparency into...

  • Building the next version of our infrastructure

    July 23, 2019

    The pursuit of our mission to connect the world’s professionals to make them more productive and successful is deeply dependent on the...

  • Open Sourcing Brooklin: Near Real-Time Data Streaming at...

    July 16, 2019

    Brooklin, a distributed service for streaming data in near real time and at scale, has been running in production at LinkedIn since...

  • Auto-Tuning Pinot Real-Time Consumption

    July 11, 2019

    Pinot, a scalable distributed columnar OLAP data store developed at LinkedIn, delivers real-time analytics for site-facing use cases...