Performance Articles

  • Rethinking site capacity projections with Capacity Analyzer

    March 16, 2021

    While site outages are inevitable, it’s our job to minimize both the duration of outages and the likelihood that one occurs. One of our preemptive measures lies in how we determine overall site capacity and health every day: we load-test in production. There’s an elegant system to bucket and route members to specific data centers from...

  • Taming memory fragmentation in Venice with Jemalloc

    January 28, 2021

    Sometimes, an engineering problem arises that might make us feel like maybe we don't know what we're doing, or at the very least, forces us out of the comfort zone of our area of expertise. That day came for the Venice team at LinkedIn when we began to notice that some Venice processes would consume all available memory and crash if given enough time to run....

  • Fixing Linux filesystem performance regressions

    October 16, 2020

    As companies grow, adapt, morph, and mature, one item remains the same: the need for reinvention. Technical infrastructure is no exception. As our member community grew, our priorities were to keep up with that growth, or as we say, ensure continuous “site up.” (Read: adding servers to scale from hundreds to hundreds of thousands.) We ran into challenges about...

  • Building a better and faster Beam Samza runner

    October 1, 2020

    Co-authors: Yixing Zhang, Bingfeng Xia, Ke Wu, and Xinyu Liu. Since the Beam Samza runner was developed in 2018 at LinkedIn, we now have...

  • Theory vs. Practice: Learnings from a recent Hadoop incident

    August 6, 2020

    Co-authors: Sandhya Ramu and Vasanth Rajamani. For companies and organizations, failure tends to be far more illuminating than success...

  • The impact of slow NFS on data systems

    June 23, 2020

    Espresso is LinkedIn's de facto NoSQL database solution. It is an online, distributed, fault-tolerant database that powers most of...