We are open sourcing Feathr – the feature store we built to simplify machine learning (ML) feature management and improve developer productivity. At LinkedIn, dozens of applications use Feathr to define features, compute them for training, deploy them in production, and share them across teams. With Feathr, users reported significantly reduced time required to...
Open Source Articles
-
At LinkedIn, Apache Kafka is used heavily for many kinds of data, including member activity, logs, metrics, and a multitude of inter-service messages. LinkedIn maintains multiple data centers with multiple Kafka clusters per data center, each of which contains an independent set of data. Mirroring (i.e., replicating) Kafka topics across the...
- Topics: Pinot, Kafka, Open Source
-
Co-authors: Jilei Yang, Humberto Gonzalez, Parvez Ahammad
In this blog post, we introduce and announce the open sourcing of the FastTreeSHAP package, a Python package based on the paper "Fast TreeSHAP: Accelerating SHAP Value Computation for Trees" (presented at the NeurIPS 2021 XAI4Debugging Workshop). FastTreeSHAP enables an efficient interpretation of tree-based...
-
Co-authors: Jaewon Yang, Minji Yoon, Sufeng Niu, Dash Shi, and Qi He
Graphs are a universal way to represent relationships among...
-
Co-authors: Venkata Krishnan Sowrirajan and Min Shen
We are excited to announce that push-based shuffle (codenamed Project Magnet) is...
- Topics: Spark, Open Source
-
Co-authors: Keqiu Hu, Jonathan Hung, Haibo Chen, and Sriram Rao
At LinkedIn, we use Hadoop as our backbone for big data analytics and...
- Topics: Hadoop, Data, Open Source