In recent years, we’ve been fortunate to see a growing number of excellent machine learning tools, such as TensorFlow, PyTorch, DeepLearning4J, and CNTK for neural networks, Spark and Kubeflow for very-large-scale pipelines, and scikit-learn, ML.NET, and the recent Tribuo for a wide variety of common models. However, models are typically part of an integrated...

Posts by Jeff Pasternack
- Topics:
- machine learning,
- artificial intelligence,
- Data
MiGz for Compression and Decompression
Jeff Pasternack February 20, 2019
Compressing and decompressing files with GZip normally uses a single thread. For large files, this can bottleneck dependent tasks like data processing, data analysis, and machine learning. Although there are several alternatives supporting multithreaded compression, such as pigz (command-line tool) and ParallelGZip (Java library), no Java library (for any...
- Topics:
- Open Source