Data Articles

  • Solving for the cardinality of set intersection at scale with Pinot and Theta Sketches

    April 16, 2021

    Co-authors: Vincent Wang, Siddharth Teotia, Manoj Thakur, and Mayank Shrivastava

    As our LinkedIn Marketing Solutions Blog recently noted, companies and marketers “are once again peering ahead, setting their plans for success in a reshaped business environment.” Among the items businesses rely on to do this are insights, including the estimated reach of an...

  • Using the LinkedIn Fairness Toolkit in large-scale AI systems

    February 8, 2021

    Co-authors: Preetam Nandy, Yunsong Meng, Cyrus DiCiccio, Heloise Logan, Amir Sepehri, Divya Venugopalan, Kinjal Basu, and Noureddine El Karoui

    Introduction: LinkedIn’s vision to create economic opportunity for every member of the global workforce would be impossible to realize without leveraging AI at scale. We use AI in our core product offerings to: highlight...

  • Budget-split testing: A trustworthy and powerful approach to marketplace A/B testing

    January 21, 2021

    Co-authors: Min Liu, Vangelis Dimopoulos, Elise Georis, Jialiang Mao, Di Luo, and Kang Kang

    The LinkedIn ecosystem drives member and customer value through a series of marketplaces (e.g., the ads marketplace and the talent marketplace). We maximize that value by making data-informed product decisions via A/B testing. Traditional A/B tests on our marketplaces,...

  • FastIngest: Low-latency Gobblin with Apache Iceberg and...

    January 6, 2021

    Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das

    Data analytics and AI power many business-critical use cases at...

  • Coral: A SQL translation, analysis, and rewrite engine for ...

    December 10, 2020

    Co-authors: Walaa Eldin Moustafa, Wenye Zhang, Sushant Raikar, Raymond Lam, Ron Hu, Shardul Mahadik, Laura Chen, Khai Tran, Chris Chen...

  • DataHub: Popular metadata architectures explained

    December 7, 2020

    When I started my journey at LinkedIn ten years ago, the company was just beginning to experience extreme growth in the volume,...