Notes:
A machine learning pipeline is the ordered series of stages used to train and deploy a machine learning model. The exact stages vary with the problem being solved and the algorithms being used, but a typical pipeline includes the following:
- Data collection and preparation: This stage involves collecting and organizing the data that will be used to train the machine learning model. This may include cleaning and preprocessing the data, splitting it into training and validation sets, and generating additional data, if needed.
- Feature engineering: This stage involves selecting and transforming the data features (the variables or attributes that the model will use to make predictions) in order to improve the model’s performance. This may include selecting the most relevant features, scaling or normalizing the data, or applying transformations to extract additional information from the data.
- Model selection and training: This stage involves selecting the machine learning algorithm or algorithms that will be used to build the model, and training the model on the prepared data. This may involve trying different algorithms and hyperparameter settings to find the best-performing model, and using techniques such as cross-validation to detect overfitting and estimate how well the model generalizes to unseen data.
- Model evaluation: This stage involves measuring the trained model's accuracy, precision, and other performance metrics. This may involve scoring the model on the held-out validation set and comparing the result with the performance of other models.
- Model deployment: This stage involves deploying the trained model in a production environment, where it can be used to make predictions or take other actions based on new data. This may involve integrating the model into a larger system or application, and ensuring that it is scalable and performant.
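The stages above can be sketched end to end in a few dozen lines. The following is a minimal illustration using only the Python standard library, not a production recipe: the synthetic two-cluster data, the nearest-centroid classifier, and every function name (`make_data`, `split`, `fit_scaler`, `train`, `predict`, `accuracy`) are assumptions chosen for brevity. A real project would more likely rely on a library such as scikit-learn or Spark ML for these steps.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Data collection: synthesize two Gaussian clusters (hypothetical data)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        # The second feature lives on a much larger scale than the first.
        features = [rng.gauss(center, 1.0), rng.gauss(center * 10, 10.0)]
        rows.append((features, label))
    rng.shuffle(rows)
    return rows


def split(rows, train_fraction=0.8):
    """Data preparation: hold out the tail of the shuffled rows for validation."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def fit_scaler(rows):
    """Feature engineering: learn per-feature mean and std from training rows only."""
    columns = list(zip(*(features for features, _ in rows)))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def scale(features, scaler):
    """Standardize one feature vector with a previously fitted scaler."""
    means, stds = scaler
    return [(x - m) / s for x, m, s in zip(features, means, stds)]


def train(rows, scaler):
    """Model training: nearest-centroid classifier, one centroid per label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(scale(features, scaler))
    return {
        label: [statistics.fmean(col) for col in zip(*points)]
        for label, points in by_label.items()
    }


def predict(features, scaler, centroids):
    """Deployment entry point: classify one new, unscaled feature vector."""
    x = scale(features, scaler)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def accuracy(rows, scaler, centroids):
    """Model evaluation: fraction of held-out rows classified correctly."""
    hits = sum(predict(features, scaler, centroids) == label for features, label in rows)
    return hits / len(rows)


rows = make_data()
train_rows, valid_rows = split(rows)
scaler = fit_scaler(train_rows)
centroids = train(train_rows, scaler)
valid_accuracy = accuracy(valid_rows, scaler, centroids)
print(f"validation accuracy: {valid_accuracy:.2f}")
```

Note that the scaler is fitted on the training rows only and then reused unchanged for validation and for `predict`. Keeping the preprocessing and the model together in this way is the main reason pipelines are treated as a single deployable unit: new data passes through exactly the same transformations the model saw during training.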
See also:
100 Best Data Pipeline Videos | 100 Best Deep Learning Tutorial Videos
- 05 Machine Learning Pipeline
- Building a Reproducible Machine Learning Pipeline
- Intelligent pipeline monitoring at Avangrid; leveraging SAP Leonardo’s IoT and Machine Learning
- #CONVERGE NY 2018: Agile Data Pipelines to Enable Machine Learning
- Categorizing Docker Hub’s Public Images: End-to-End Machine Learning Pipeline with Docker Enterprise
- Deploying Machine Learning Pipeline at Scale In the Cloud
- End to End Machine learning pipelines for Python driven organizations – Nick Harvey
- Machine Learning Model Serving and Pipeline Using KNative – Animesh Singh & Tommy Li, IBM
- EECS 495 – Optimization of Machine Learning Project: Sklearn Sentiment Analysis Pipeline
- Machine Learning Pipelines at Google
- AWS re:Invent 2018: CI/CD for Your Machine Learning Pipeline with Amazon SageMaker (DVC303)
- Andrew Ng’s Machine Learning Course | Ceiling Analysis What Part of the Pipeline to Work on Next
- Andrew Ng’s Machine Learning Course | Photo OCR Problem Description and Pipeline
- A Machine Learning Data Pipeline
- Industrial Machine Learning Pipelines with Python & Airflow | PyConHK 2018
- Machine Learning pipeline walk through with Mwenda Mugendi Raini
- Continuous deployment of Machine Learning pipelines – Luca Palmieri & Christos Dimitroulas
- YOW! Perth 2018 – Cameron Joannidis – Building a centralised Machine Learning Pipeline #YOWPerth
- Automating Machine Learning Pipelines for Real Time Scoring (David Crespi)
- Mesosphere DC/OS Machine Learning Pipeline with Jupyterlab and CI/CD
- Hands-on Scikit-learn for Machine Learning: Processing Pipelines with Scikit-learn|packtpub.com
- Alejandro Saucedo – Industrial Machine Learning Pipelines with Python & Airflow
- Executing Open Source Code in Machine Learning Pipelines
- Data Pipelines for Factory IoT and Machine Learning
- Industrial Machine Learning Building Scalable Distributed ML Pipelines
- The Evolving Media Pipeline Aided by Machine Learning | Scott Spector
- End-to-end Machine Learning pipeline – Jan Wiegelmann | M-AI-Summit-2018
- Designing data pipelines for analytics and machine learning in industrial settings
- A machine learning and data science pipeline for real companies
- Create Your First Machine Learning Pipeline in ML.NET
- Using Gradient Boosted Trees with Apache SparkML Pipelines – Coursera Advanced Machine Learning
- Mesosphere DC/OS Secure (Kerberos & TLS) Machine Learning Pipeline with Apache Kafka, HDFS and Spark
- Scalable Machine Learning Pipelines for Click Predictions – Moussa Taifi (Appnexus)
- Bringing Your Data Pipeline into The Machine Learning Era – Chris Gaun & Jörg Schad, Mesosphere
- Modernize Your Data Pipeline for the Machine Learning Age
- Lightning Talks: Dan Whitenack: Pachyderm – Machine Learning and Data Pipelines
- Natural Language Identification Machine Learning Pipeline with Python and Scikit-Learn
- Recruit Lifestyle: Building Elastic Machine Learning Pipelines on AWS [Japanese]
- Accelerate the Machine Learning Pipeline on Very Large Datasets
- Accelerate the Machine Learning Pipeline on Very Large Datasets – Aaron Williams
- Intro into Azure Databricks plus Machine Learning Pipelines and Structured Streaming
- Extending Spark Machine Learning Pipelines Going beyond wordcount with Spark ML
- DataMass 2017 Jakub Nowacki – Real-Time Machine Learning in Streaming Data Pipelines
- How to Automate Machine Learning Pipeline by Axel De Romblay
- Machine Learning with caret: Building a pipeline
- Jerry Zhu: Debugging the Machine Learning Pipeline
- scale.bythebay.io: Chris Rupley & Till Bergmann, Complex Machine Learning Pipelines Made Easy
- Building Distributed Machine Learning Pipeline – Vojtech Juranek
- PyCon.DE 2017 Alexander Bauer – Large-scale machine learning pipelines using Luigi,…n
- Build your first Machine Learning pipeline using Luigi !!
- How to Build Machine Learning Pipelines in a Breeze with Docker
- Win a Kaggle competition with Apache Spark and SparkML Machine Learning Pipelines
- How we make building machine learning pipelines simple
- Machine Learning Pipeline
- FOSDEM 2017 – Extending Spark Machine Learning Pipelines.mp4
- Let’s Write a Pipeline – Machine Learning Recipes #4
- Designing Machine Learning Pipelines for Mining Transactional SMS Messages: Paul Meinshausen
- Machine Learning Pipelines for IoT platform -1
- MAchine Learning: Understanding the PipeLine from SKlearn
- Music Genre Classifier Pipeline using Machine Learning
- 199 Building Machine Learning Pipelines
- 36 Building Machine Learning Pipelines
- Machine Learning w/the Spark ML Library 5 — Spark Pipelines Contd and Text Processing
- Machine Learning with the Spark ML Library 4 — Spark Pipelines and Automated Cross Validation
- Kevin Goetsch | Deploying Machine Learning using sklearn pipelines
- The Machine Learning Pipeline | Kadenze
- sfspark.org: Peng Ye, Building Machine Learning Pipeline Using Aerosolve
- Machine Learning Pipeline using Luigi and Scikit Learn – PyConSG 2016
- Scalable Machine Learning Pipeline For Meta Data Discovery From eBay Listings
- Optimizing Terascale Machine Learning Pipelines With KeystoneML
- Jose Quesada – A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons
- Machine learning pipelines with Spark ML
- A Machine Learning Data Pipeline – PyData SG
- Machine Learning Photo OCR Example 1 Problem Description and Pipeline
- Creating an End-to-End Machine Learning Data Pipeline with Databricks [DEMO] – Spark Summit 2015
- Practical Distributed Machine Learning Pipelines on Hadoop
- PhillyETE2015 #26-Deep Learning Revolution: Rethinking Machine Learning Pipelines – Soumith Chintala
- Building Large Scale Machine Learning Applications with Pipelines – Evan Sparks (UC Berkeley AMPLAB)
- Building, Debugging, and Tuning Spark Machine Learning Pipelines – Joseph Bradley (Databricks)
- Bugra Akyildiz – A Thorough Machine Learning Pipeline via Scikit Learn
- Practical Machine Learning Pipelines with Mllib- Joseph Bradley (Databricks)
- Machine learning W11 4 Ceiling Analysis What Part of the Pipeline to Work on Next
- Machine learning W11 1 Problem Description and Pipeline
- Bugra Akyildiz – A Machine Learning Pipeline with Scikit-Learn