Running the Pipeline

Import data sources:

crowdastro import_data

By default, this uses the SWIRE catalogue. To use the WISE catalogue instead:

crowdastro import_data --ir wise

Process the consensuses:

crowdastro consensuses

Generate the training data:

crowdastro generate_training_data

The training data contains raw_features and labels. Note that the image features have not been processed at this stage; they are raw pixels.
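To check what the training file holds at this point, its datasets can be listed with h5py. This is only a minimal sketch: the helper names are illustrative, the path `training.h5` in the usage comment is a placeholder (the real location depends on your crowdastro configuration), and it assumes raw_features and labels are stored as top-level datasets.

```python
def summarise(group):
    """Map each dataset name in an HDF5 group to its (shape, dtype)."""
    return {
        name: (tuple(item.shape), str(item.dtype))
        for name, item in group.items()
        if hasattr(item, "shape")  # skip subgroups, which have no shape
    }


def summarise_file(path):
    """Open an HDF5 file read-only and summarise its top-level datasets."""
    import h5py  # third-party; imported here so summarise() works without it

    with h5py.File(path, "r") as f:
        return summarise(f)


# Usage (hypothetical path, replace with the real training file):
#     for name, (shape, dtype) in summarise_file("training.h5").items():
#         print(f"{name}: shape={shape}, dtype={dtype}")
```

If the assumptions hold, the output should include entries for raw_features and labels.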

Generate the training and testing sets:

crowdastro generate_test_sets

Compile the CNN model:

crowdastro compile_cnn

Train the CNN:

crowdastro train_cnn

Generate the CNN outputs:

crowdastro generate_cnn_outputs

This adds the features dataset to the training HDF5 file.
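The steps up to this point can be chained in a small driver script. The sketch below assumes the crowdastro command is on your PATH and that each subcommand exits with a non-zero status on failure. The function names and the dry_run option are illustrative and not part of crowdastro. Only import_data is documented here as accepting --ir, so the flag is passed to that step alone.

```python
import subprocess

# Subcommands in the order given above.
STEPS = [
    "import_data",
    "consensuses",
    "generate_training_data",
    "generate_test_sets",
    "compile_cnn",
    "train_cnn",
    "generate_cnn_outputs",
]


def build_commands(ir=None):
    """Return the argument list for each pipeline step, in order.

    ir: optional catalogue name passed to import_data via --ir (e.g. "wise").
    """
    commands = [["crowdastro", step] for step in STEPS]
    if ir is not None:
        commands[0] += ["--ir", ir]
    return commands


def run_pipeline(ir=None, dry_run=False):
    """Print and run each step in turn, stopping at the first failure."""
    for command in build_commands(ir):
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(command, check=True)
```

Calling `run_pipeline(ir="wise", dry_run=True)` prints the commands without executing them, which is a cheap way to confirm the order before starting a long training run.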

The experiments in crowdastro.experiment can now be run as command-line scripts, replacing experiment_name with the name of the experiment module:

python3 -m crowdastro.experiment.experiment_name
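Since experiment_name is a placeholder, it can help to list the modules that the package actually contains. The following is a sketch using only the standard library. The helper name list_submodules is illustrative, and not every module it returns is necessarily a runnable experiment.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of modules directly inside a package."""
    package = importlib.import_module(package_name)
    return sorted(info.name for info in pkgutil.iter_modules(package.__path__))


# Usage, assuming crowdastro is installed:
#     for name in list_submodules("crowdastro.experiment"):
#         print(f"python3 -m crowdastro.experiment.{name}")
```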

The files generated by the pipeline can also be used directly to train and test classifiers. See the files in the crowdastro.experiment module for examples.