gauche.representations
Fingerprint Representations
Contains methods to generate fingerprint representations of molecules, chemical reactions and proteins.
- gauche.representations.fingerprints.drfp(reaction_smiles: List[str], nBits: int | None = 2048) -> ndarray
Builds reaction representations as binary DRFP fingerprints (https://github.com/reymond-group/drfp).
- Parameters:
reaction_smiles (list) – list of reaction SMILES
nBits (int, optional) – length of the fingerprint bit vector (default 2048)
- Returns:
array of shape [len(reaction_smiles), nBits] with DRFP-featurised reactions
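As described by the DRFP project, the fingerprint is derived from the symmetric difference between the substructure sets found on the two sides of the reaction arrow, which is then hashed and folded to a fixed length. The sketch below only illustrates that idea and is not the DRFP or GAUCHE implementation: the helpers `ngrams` and `toy_reaction_fingerprint` are hypothetical, character n-grams of the SMILES strings stand in for real molecular substructures, and the input is assumed to use the `reactants>>products` form.

```python
import hashlib
from typing import List, Set


def ngrams(text: str, max_len: int = 3) -> Set[str]:
    """Hypothetical stand-in for substructures: substrings of length 1..max_len."""
    return {
        text[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(text) - n + 1)
    }


def toy_reaction_fingerprint(reaction_smiles: str, n_bits: int = 2048) -> List[int]:
    """Fold the features that differ between the two sides of a reaction."""
    # Assumption: "reactants>>products", with "." separating molecules.
    left, right = reaction_smiles.split(">>")
    left_features = set().union(*(ngrams(m) for m in left.split(".")))
    right_features = set().union(*(ngrams(m) for m in right.split(".")))
    changed = left_features ^ right_features  # symmetric difference

    bits = [0] * n_bits
    for feature in changed:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:8], "big") % n_bits] = 1
    return bits


reactions = ["CC(=O)O.OCC>>CC(=O)OCC.O", "CCO>>CCO"]
fingerprints = [toy_reaction_fingerprint(r, n_bits=2048) for r in reactions]

# One row per reaction and n_bits columns, mirroring [len(reaction_smiles), nBits].
print(len(fingerprints), len(fingerprints[0]))
# A "reaction" that changes nothing has an empty difference, so no bits are set.
print(sum(fingerprints[1]))
```

Because only the differing features are encoded, parts of the molecules that are unchanged by the reaction contribute nothing to the vector in this toy version.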
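- gauche.representations.fingerprints.ecfp_fingerprints(smiles: List[str], bond_radius: int | None = 3, nBits: int | None = 2048) -> ndarray
Builds molecular representations as binary ECFP fingerprints.
- Parameters:
smiles (list) – list of molecular SMILES
bond_radius (int, optional) – bond radius of the circular (Morgan) fingerprint (default 3)
nBits (int, optional) – length of the fingerprint bit vector (default 2048)
- Returns:
array of shape [len(smiles), nBits] with ECFP-featurised molecules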
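To make the role of nBits concrete, here is a minimal sketch of how a hashed fingerprint folds feature identifiers into a fixed-length bit vector, and why a shorter vector causes more collisions. It is not the GAUCHE or RDKit implementation: `toy_features` and `fold_to_bits` are hypothetical helpers, and substrings of the SMILES string stand in for the circular atom environments that real ECFP fingerprints enumerate.

```python
import hashlib
from typing import List, Set


def toy_features(smiles: str, max_len: int = 3) -> Set[str]:
    """Hypothetical stand-in for atom environments: substrings of length 1..max_len."""
    return {
        smiles[i:i + n]
        for n in range(1, max_len + 1)
        for i in range(len(smiles) - n + 1)
    }


def fold_to_bits(features: Set[str], n_bits: int = 2048) -> List[int]:
    """Hash each feature to an integer and set the bit at (hash mod n_bits)."""
    bits = [0] * n_bits
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_bits
        bits[index] = 1  # features that collide simply share a bit
    return bits


features = toy_features("CC(=O)Oc1ccccc1C(=O)O")
wide = fold_to_bits(features, n_bits=2048)
narrow = fold_to_bits(features, n_bits=8)

# The vector length always equals n_bits, whatever the molecule size.
print(len(wide), len(narrow))
# Fewer bits means more collisions, so the narrow vector never has more set bits.
print(sum(narrow) <= sum(wide) <= len(features))
```

In this sketch, a larger n_bits costs memory but keeps distinct features in distinct positions more often, which is the usual trade-off when choosing a fingerprint length.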
- gauche.representations.fingerprints.fragments(smiles: List[str]) -> ndarray
Builds molecular representations as vectors of fragment counts.
- Parameters:
smiles (list) – list of molecular SMILES
- Returns:
array of shape [len(smiles), 85] with fragment featurised molecules
- gauche.representations.fingerprints.mqn_features(smiles: List[str]) -> ndarray
Builds molecular representations as vectors of Molecular Quantum Numbers (MQN).
- Parameters:
smiles (list) – list of molecular SMILES
- Returns:
array of mqn featurised molecules
- gauche.representations.fingerprints.one_hot(df: DataFrame) -> ndarray
Builds reaction representation as a bit vector which indicates whether a certain condition, reagent, reactant etc. is present in the reaction.
- Parameters:
df (pandas DataFrame) – pandas DataFrame with columns representing different parameters of the reaction (e.g. reactants, reagents, conditions).
- Returns:
array of shape [len(df), total number of unique values across the columns of df] with one-hot encoded reactions
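The output width described above can be checked with a small, dependency-free sketch. It is an illustration of one-hot encoding in general, not the GAUCHE implementation: the reaction table, its column names, and the helper `one_hot_rows` are hypothetical, and a list of dicts stands in for the pandas DataFrame.

```python
from typing import Dict, List

# Hypothetical reaction table: one dict per reaction, one key per column.
reactions: List[Dict[str, str]] = [
    {"ligand": "XPhos", "base": "K2CO3", "solvent": "DMF"},
    {"ligand": "SPhos", "base": "K2CO3", "solvent": "THF"},
    {"ligand": "XPhos", "base": "Cs2CO3", "solvent": "DMF"},
]


def one_hot_rows(rows: List[Dict[str, str]]) -> List[List[int]]:
    """One-hot encode every column and concatenate the blocks for each row."""
    columns = list(rows[0].keys())
    # Sorted unique values give each column a fixed, reproducible bit order.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    encoded = []
    for row in rows:
        bits: List[int] = []
        for col in columns:
            bits.extend(1 if row[col] == value else 0 for value in categories[col])
        encoded.append(bits)
    return encoded


encoded = one_hot_rows(reactions)

# 2 ligands + 2 bases + 2 solvents = 6 columns, one row per reaction.
print(len(encoded), len(encoded[0]))
# Each row sets exactly one bit per original column.
print([sum(row) for row in encoded])
```

For string-valued DataFrame columns, `pandas.get_dummies` produces a comparable encoding, although its column ordering may differ from this sketch.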
- gauche.representations.fingerprints.rxnfp(reaction_smiles: List[str]) -> ndarray
Builds reaction representations as continuous RXNFP fingerprints (https://rxn4chemistry.github.io/rxnfp/).
- Parameters:
reaction_smiles (list) – list of reaction SMILES
- Returns:
array of shape [len(reaction_smiles), 256] with RXNFP-featurised reactions
Graph Representations
Contains methods to generate graph representations of molecules, chemical reactions and proteins.
- gauche.representations.graphs.molecular_graphs(smiles: List[str], graphein_config: bool | None = None) -> List[Graph]
Converts a list of SMILES strings into molecular graphs using the featurisation utilities of graphein.
- Parameters:
smiles (list) – list of molecular SMILES
graphein_config (graphein/config/graphein_config) – graphein configuration object
- Returns:
list of molecular graphs
String Representations
Contains methods to generate string representations of molecules, chemical reactions and proteins.