Co-authors: Shu Wang, Biao He, and Minchu Yang
At LinkedIn, Apache Spark is our primary compute engine for offline data analytics such as data warehousing, data science, machine learning, A/B testing, and metrics reporting. We execute nearly 100,000 Spark applications daily in our Apache Hadoop YARN (more on how we scaled YARN clusters here). These applications...
infrastructure Articles
- Topics: Spark, infrastructure
Co-authors: Michele Ursino and Joe Xue
Introduction
At LinkedIn, we believe that an opportunity can arise from just one conversation, so having reliable and powerful messaging capabilities to enable people to have those meaningful and professional conversations is crucial. Over the years, we have evolved our messaging platform to meet the needs of our 900...
- Topics: infrastructure, Architecture, Product Design
As teams and applications grow, it’s critical to adopt architectures that optimize for clear code ownership, build isolation, and efficient delivery of code. While many projects start small with just one or two repositories (for example, frontend and backend), this approach often becomes difficult to maintain as the codebases expand. At...
- Topics: infrastructure, optimization, Code, serving infrastructure
Co-authors: Kenneth Tay and Xiaofeng Wang
At LinkedIn, we constantly evaluate the value our products and services deliver, so that we...
- Topics: infrastructure, scale, A/B Testing
Originally from Argentina, systems & infrastructure engineering leader Federico was a founding member of the Media Infrastructure team...
Co-authors: Lily Wittle and Cathy Ji
Introduction
The Gateway-as-a-Platform (GaaP) team at LinkedIn builds infrastructure to support...
- Topics: infrastructure