Rerun changed tasks
Consider the following pipeline, which we assume lives in the package example_project:
import luisy


@luisy.raw
@luisy.csv_output()
class RawFileA(luisy.ExternalTask):
    def get_file_name(self):
        return 'file_a'


@luisy.raw
@luisy.csv_output()
class RawFileB(luisy.ExternalTask):
    def get_file_name(self):
        return 'file_b'


@luisy.interim
@luisy.requires(RawFileA)
class InterimFile(luisy.Task):
    def run(self):
        # some processing
        pass  # placeholder so the example is valid Python


@luisy.final
@luisy.requires(InterimFile, RawFileB)
class FinalFile(luisy.Task):
    def run(self):
        # some processing
        pass  # placeholder so the example is valid Python
Assume the working directory looks like this:
/projects/example_project/raw/RawFileA.csv
/projects/example_project/raw/RawFileB.csv
and we invoke
luisy --module example_project FinalFile
and afterwards, the following files exist:
/projects/example_project/raw/RawFileA.csv
/projects/example_project/raw/RawFileB.csv
/projects/example_project/interim/InterimFile.csv
/projects/example_project/final/FinalFile.csv
/projects/example_project/.luisy.hash
Now, if the code of InterimFile changes and the user runs
luisy --module example_project FinalFile
again, the tasks InterimFile and FinalFile are both executed again: InterimFile because its code changed, and FinalFile because it requires InterimFile.
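The rerun decision described above can be illustrated with a small, self-contained sketch. It is not luisy's implementation: the hashing scheme, the SOURCES and REQUIRES dictionaries, and the helper names code_hash and tasks_to_rerun are all assumptions made for illustration. The sketch only shows the general idea of comparing stored code hashes with current ones and propagating a change to every downstream task.

```python
import hashlib

# Hypothetical stand-ins for the source code of each task; how luisy
# actually hashes task code is not described here and may differ.
SOURCES = {
    "RawFileA": "return 'file_a'",
    "RawFileB": "return 'file_b'",
    "InterimFile": "# some processing",
    "FinalFile": "# some processing",
}

# Dependencies as declared with @luisy.requires in the pipeline above.
REQUIRES = {
    "RawFileA": [],
    "RawFileB": [],
    "InterimFile": ["RawFileA"],
    "FinalFile": ["InterimFile", "RawFileB"],
}


def code_hash(source):
    """Return a hex digest of a task's source code."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def tasks_to_rerun(sources, stored_hashes, requires):
    """Return tasks whose code changed, plus everything downstream of them."""
    changed = {
        name for name, src in sources.items()
        if stored_hashes.get(name) != code_hash(src)
    }
    rerun = set(changed)
    # Propagate: a task must rerun if any of its requirements reruns.
    grew = True
    while grew:
        grew = False
        for name, deps in requires.items():
            if name not in rerun and any(dep in rerun for dep in deps):
                rerun.add(name)
                grew = True
    return rerun


# State after the first run: the hash of every task is stored.
stored = {name: code_hash(src) for name, src in SOURCES.items()}

# The code of InterimFile changes.
edited = dict(SOURCES, InterimFile="# some other processing")

print(sorted(tasks_to_rerun(edited, stored, REQUIRES)))
# prints ['FinalFile', 'InterimFile']
```

In this sketch RawFileA and RawFileB are not selected, since neither their code nor any of their requirements changed.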
Note
To only check which tasks would be executed, without running them, we can add the --dry-run flag:
luisy --module example_project FinalFile --dry-run