Spark Parallel Job Execution
A pretty common use case for Spark is to run many jobs in parallel. Spark is excellent at running stages in parallel once it has constructed the job DAG, but this doesn't help us run two entirely independent jobs in the same Spark application at the same time. One use case I can think of for parallel job execution is an ETL pipeline in which we pull data from several remote sources and land each of them in an HDFS cluster.
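One way to achieve this is to submit Spark actions from multiple threads: Spark's scheduler is thread-safe, so each action becomes an independent job that can run concurrently. Below is a minimal sketch using Python's `ThreadPoolExecutor`; the `land_source` function and the source names are hypothetical placeholders, and the commented-out line shows where a real Spark action (e.g. a read-then-write to HDFS) would go.

```python
from concurrent.futures import ThreadPoolExecutor

def land_source(source: str) -> str:
    # Placeholder for a real Spark action, e.g.:
    #   spark.read.jdbc(url, source).write.parquet(f"hdfs:///landing/{source}")
    # Actions submitted from separate threads are scheduled by Spark
    # as independent jobs and can execute at the same time.
    return f"landed {source}"

# Hypothetical remote sources to pull in parallel.
sources = ["orders_db", "clicks_api", "inventory_feed"]

# One thread per source; each thread's action becomes its own Spark job.
with ThreadPoolExecutor(max_workers=len(sources)) as pool:
    results = list(pool.map(land_source, sources))

print(results)
```

Note that concurrency here is between jobs, not within them; each job still gets its own stage-level parallelism from Spark as usual.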