Tutorial


Input Format

Name          SMILES          Activation Status
Methanethiol  CS              0
Ethanethiol   C(C)S           0
Benzenethiol  C1(=CC=CC=C1)S  1
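
As an illustration, an input file with the three columns above could be checked before upload with Python's standard csv module. This is only a sketch: the comma delimiter, the read_input helper, and the rule that the label must be 0 or 1 are assumptions made for the example, not confirmed deepGraphh requirements.

```python
import csv
import io

# Hypothetical CSV rendering of the example rows; the comma delimiter is an assumption.
SAMPLE = """Name,SMILES,Activation Status
Methanethiol,CS,0
Ethanethiol,C(C)S,0
Benzenethiol,C1(=CC=CC=C1)S,1
"""

EXPECTED_COLUMNS = ["Name", "SMILES", "Activation Status"]


def read_input(text):
    """Parse the text and return a list of (name, smiles, label) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        label = int(row["Activation Status"])
        if label not in (0, 1):
            raise ValueError(f"line {line_no}: label must be 0 or 1, got {label}")
        if not row["SMILES"].strip():
            raise ValueError(f"line {line_no}: empty SMILES")
        rows.append((row["Name"], row["SMILES"], label))
    return rows


records = read_input(SAMPLE)
print(records)
```

The check only looks at the column layout and the labels; it does not verify that each SMILES string is chemically valid.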


Parameters


GCN model parameters

S.No. Parameter Type What it means
1 n_tasks Integer Number of tasks
2 graph_conv_layers List of Integer Channel widths of the GCN layers. graph_conv_layers[i] gives the channel width of the i-th GCN layer
3 activation String The activation function to apply to the output of each GCN layer
4 residual Boolean Whether to add a residual connection within each GCN layer
5 batchnorm Boolean Whether to apply batch normalization to the output of each GCN layer
6 dropout Float The dropout probability for the output of each GCN layer
7 predictor_hidden_feats Integer The size of the hidden representations in the output MLP predictor
8 predictor_dropout Float The dropout probability in the output MLP predictor
9 mode String The model type, ‘classification’ or ‘regression’
10 number_atom_features Integer The length of the initial atom feature vectors
11 n_classes Integer The number of classes to predict per task (only used when mode is ‘classification’)
12 self_loop Boolean Whether to add self loops to the nodes, i.e. edges from nodes to themselves. When input graphs have isolated nodes, self loops preserve the original features of those nodes during message passing.
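
The types and constraints in the GCN table can be captured in a small configuration object. The following is a minimal sketch, not deepGraphh's own code: the GCNParams class, its validate method, and every default value shown are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GCNParams:
    """Hypothetical container mirroring the GCN parameter table.

    All default values are illustrative placeholders, not deepGraphh defaults.
    """
    n_tasks: int = 1
    graph_conv_layers: List[int] = field(default_factory=lambda: [64, 64])
    activation: str = "relu"
    residual: bool = True
    batchnorm: bool = False
    dropout: float = 0.0
    predictor_hidden_feats: int = 128
    predictor_dropout: float = 0.0
    mode: str = "classification"
    number_atom_features: int = 30
    n_classes: int = 2
    self_loop: bool = True

    def validate(self):
        """Raise ValueError if a value contradicts the table; return self otherwise."""
        if self.n_tasks < 1:
            raise ValueError("n_tasks must be at least 1")
        if not self.graph_conv_layers or any(w < 1 for w in self.graph_conv_layers):
            raise ValueError("graph_conv_layers must list one positive width per layer")
        for name in ("dropout", "predictor_dropout"):
            p = getattr(self, name)
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"{name} is a probability and must lie in [0, 1]")
        if self.mode not in ("classification", "regression"):
            raise ValueError("mode must be 'classification' or 'regression'")
        # n_classes is only consulted in classification mode.
        if self.mode == "classification" and self.n_classes < 2:
            raise ValueError("n_classes must be at least 2 for classification")
        return self


params = GCNParams(graph_conv_layers=[32, 32, 16], dropout=0.2).validate()
print(len(params.graph_conv_layers), "GCN layers")
```

Here graph_conv_layers=[32, 32, 16] describes three GCN layers, the last one with 16 channels.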

GAT model parameters

S.No. Parameter Type What it means
1 n_tasks Integer Number of tasks
2 graph_attention_layers List of Integer Channel widths per attention head for the GAT layers. graph_attention_layers[i] gives the channel width of each attention head in the i-th GAT layer. If both graph_attention_layers and aggregation_modes are specified, they must have equal length
3 n_attention_heads Integer Number of attention heads in each GAT layer
4 residual Boolean Whether to add a residual connection within each GAT layer
5 aggregation_modes List of String How to aggregate the multi-head attention results of each GAT layer: ‘flatten’ concatenates the results of all heads, ‘mean’ averages them. aggregation_modes[i] gives the aggregation mode for the i-th GAT layer
6 dropout Float The dropout probability for the output of each GAT layer
7 alpha Float A hyperparameter in LeakyReLU, which is the slope for negative values
8 predictor_hidden_feats Integer The size for hidden representations in the output MLP predictor
9 predictor_dropout Float The dropout probability in the output MLP predictor
10 mode String The model type, ‘classification’ or ‘regression’
11 number_atom_features Integer The length of the initial atom feature vectors
12 n_classes Integer The number of classes to predict per task (only used when mode is ‘classification’)
13 self_loop Boolean Whether to add self loops to the nodes, i.e. edges from nodes to themselves. When input graphs have isolated nodes, self loops preserve the original features of those nodes during message passing.
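
To see how the per-head width, the number of heads, and the aggregation mode interact, the sketch below computes the width of the node representation leaving each GAT layer. The helper gat_output_widths and the example values are hypothetical; the rule it applies follows the table: ‘flatten’ concatenates the heads, ‘mean’ averages them.

```python
def gat_output_widths(graph_attention_layers, n_attention_heads, aggregation_modes):
    """Return the node representation width after each GAT layer."""
    if len(graph_attention_layers) != len(aggregation_modes):
        raise ValueError(
            "graph_attention_layers and aggregation_modes must have equal length"
        )
    widths = []
    for width, mode in zip(graph_attention_layers, aggregation_modes):
        if mode == "flatten":
            # Concatenating all heads multiplies the width by the head count.
            widths.append(width * n_attention_heads)
        elif mode == "mean":
            # Averaging the heads keeps the per-head width.
            widths.append(width)
        else:
            raise ValueError(f"unknown aggregation mode: {mode!r}")
    return widths


widths = gat_output_widths([8, 8], n_attention_heads=8,
                           aggregation_modes=["flatten", "mean"])
print(widths)  # [64, 8]
```

With 8 heads of width 8, a ‘flatten’ layer outputs 64 features per node, while a ‘mean’ layer outputs 8.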

DAG model parameters

S.No. Parameter Type What it means
1 n_tasks Integer Number of tasks
2 n_atom_feat Integer Number of features per atom
3 n_graph_feat Integer Number of features for each atom in the graph
4 n_outputs Integer Number of features for each molecule
5 layer_sizes List of Integer List of hidden layer size(s) in the propagation step: length of this list represents the number of hidden layers, and each element is the width of corresponding hidden layer
6 dropout Float Dropout probability, applied after each propagation step and gather step
7 layer_sizes_gather List of Integer List of hidden layer size(s) in the gather step
8 n_classes Integer The number of classes to predict (only used in classification mode)
9 uncertainty Boolean If True, include extra outputs and loss terms to enable the uncertainty in outputs to be predicted
10 mode String The model type, ‘classification’ or ‘regression’

AttentiveFP model parameters

S.No. Parameter Type What it means
1 n_tasks Integer Number of tasks
2 num_layers Integer Number of graph neural network layers, i.e. number of rounds of message passing
3 num_timesteps Integer Number of time steps for updating graph representations with a GRU
4 graph_feat_size Integer Size of the graph representations
5 number_atom_features Integer The length of the initial atom feature vectors
6 dropout Float Dropout probability
7 n_classes Integer The number of classes to predict per task (only used when mode is ‘classification’)
8 self_loop Boolean Whether to add self loops to the nodes, i.e. edges from nodes to themselves. When input graphs have isolated nodes, self loops preserve the original features of those nodes during message passing.
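
The effect of self_loop on isolated nodes can be shown with a toy round of message passing. This is a simplified sketch under stated assumptions: features are single numbers, messages are combined by a plain sum, and the helpers add_self_loops and sum_messages are hypothetical. Real graph layers also apply learned weights and normalization.

```python
def add_self_loops(num_nodes, edges):
    """Return the edge list with one (i, i) edge added for every node."""
    existing = set(edges)
    loops = [(i, i) for i in range(num_nodes) if (i, i) not in existing]
    return list(edges) + loops


def sum_messages(num_nodes, edges, features):
    """One round of message passing: each node sums the features
    sent along its incoming edges (src -> dst)."""
    out = [0.0] * num_nodes
    for src, dst in edges:
        out[dst] += features[src]
    return out


# Nodes 0 and 1 are connected in both directions; node 2 is isolated.
edges = [(0, 1), (1, 0)]
features = [1.0, 2.0, 5.0]

without_loops = sum_messages(3, edges, features)
with_loops = sum_messages(3, add_self_loops(3, edges), features)

print(without_loops)  # [2.0, 1.0, 0.0]
print(with_loops)     # [3.0, 3.0, 5.0]
```

Without self loops the isolated node receives no messages and its feature is lost; with self loops it keeps its original value of 5.0.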


Output Format

SMILES              Activation Status  Probability of Class 0  Probability of Class 1
CC(CCC=C(C)C)CC=O   1                  0.1934605               0.8065395
C1CCCCCC(=O)CCCC1   0                  0.9994111               0.0005889
CCC(=O)C1=CC=CC=C1  1                  0.4913514               0.5086485
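
One way to read the output is that the reported Activation Status is the class with the higher probability, and that the two probabilities sum to 1. The sketch below checks this reading against the three example rows. It is an interpretation consistent with these rows, not a documented guarantee, and the helper names are hypothetical.

```python
# Rows copied from the example output: (SMILES, status, P(class 0), P(class 1)).
ROWS = [
    ("CC(CCC=C(C)C)CC=O", 1, 0.1934605, 0.8065395),
    ("C1CCCCCC(=O)CCCC1", 0, 0.9994111, 0.0005889),
    ("CCC(=O)C1=CC=CC=C1", 1, 0.4913514, 0.5086485),
]


def predicted_class(p0, p1):
    """Return the class with the higher probability."""
    return 1 if p1 > p0 else 0


def check_row(status, p0, p1, tol=1e-6):
    """True if the probabilities sum to 1 (within tol) and the status
    equals the more probable class."""
    sums_to_one = abs(p0 + p1 - 1.0) <= tol
    status_matches = predicted_class(p0, p1) == status
    return sums_to_one and status_matches


results = [check_row(status, p0, p1) for _, status, p0, p1 in ROWS]
print(results)  # [True, True, True]
```

The third row shows why the probabilities are worth reporting alongside the status: class 1 wins by a margin of less than 0.02, so that prediction is far less certain than the other two.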

How to run pre-trained models

There are two ways users can use saved models to get the predictions on query data:

1. Users can use GetPrediction_From_Checkpoint_deepGraphh.ipynb in the Get Prediction folder on our GitHub.

2. Users can use the Google Colaboratory notebook.



FAQs



What is the prediction based on?

Predictions are made by graph-based deep learning models, such as the GCN and GAT models described under Parameters.




Can I run two prediction models from different tabs of the same browser?

No.




Can I navigate away from the loading screen?

Yes, you can switch to other browser tabs, but do not close the tab in which the job is running. The models are computationally intensive, so a run can take some time; please be patient.




What is the Datasets tab used for?

The Datasets tab provides example datasets for users’ reference.