In the context of a data pipeline, pipeline scheduling refers to automating some or all of a pipeline's components to run at fixed times, dates, or intervals. Typical frequencies for scheduled runs, or 'jobs', range from every few minutes to daily and well beyond, depending on the purpose of the pipeline and the systems, applications, or reports it feeds.
Pipeline scheduling should not be confused with data streaming, which involves a constant, real-time feed of data from one or more sources passing through the processes specified in the pipeline.
Data Pipelines makes pipeline scheduling easy with a simple scheduling tool that lets you write any pipeline component to any data connection at highly granular intervals. Data Pipelines supports cron-style component scheduling, so you can either enter cron expressions directly or use the interface provided.
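If you're new to cron expressions, the sketch below shows how the five fields (minute, hour, day of month, month, day of week) map onto a timestamp. This is a minimal illustration, not part of the Data Pipelines platform; `matches_cron` is a hypothetical helper that supports only `*` and plain integers, and it uses Python's `weekday()` convention (0 = Monday) rather than classic cron's 0 = Sunday.

```python
from datetime import datetime

def matches_cron(expr: str, dt: datetime) -> bool:
    """Return True if dt matches a simplified five-field cron expression.

    Fields: minute hour day-of-month month day-of-week.
    Only '*' and plain integers are supported in this sketch;
    day-of-week follows datetime.weekday() (0 = Monday), unlike
    classic cron, where 0 = Sunday.
    """
    fields = expr.split()
    values = [dt.minute, dt.hour, dt.day, dt.month, dt.weekday()]
    for field, value in zip(fields, values):
        if field != "*" and int(field) != value:
            return False
    return True

# "0 6 * * *" means: run every day at 06:00.
print(matches_cron("0 6 * * *", datetime(2023, 5, 1, 6, 0)))  # True
print(matches_cron("0 6 * * *", datetime(2023, 5, 1, 7, 0)))  # False
```

A real scheduler evaluates expressions like these against the clock to decide when each job fires; the interface simply builds the same five-field string for you.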
The video below demonstrates a simple pipeline build and schedule between an AWS S3 bucket and Google Sheets.
Thanks for reading our introduction to pipeline scheduling. We'll update this post as new scheduling features are added to the platform.