gauche.kernels#
Fingerprint Kernels#
Tanimoto Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.tanimoto_kernel.TanimotoKernel(**kwargs)[source]#
Computes a covariance matrix based on the Tanimoto kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Tanimoto}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle}{\left\lVert\mathbf{x}\right\rVert^2 + \left\lVert\mathbf{x'}\right\rVert^2 - \langle\mathbf{x}, \mathbf{x'}\rangle} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(TanimotoKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(TanimotoKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.tanimoto_kernel.batch_tanimoto_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Tanimoto similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added. Tanimoto similarity is proportional to:
\(\langle x, y \rangle / (\lVert x \rVert^2 + \lVert y \rVert^2 - \langle x, y \rangle)\)
where x and y may be bit or count vectors, or in set notation:
\(|A \cap B| / (|A| + |B| - |A \cap B|)\)
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Tanimoto similarity.
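The sketch below shows how TanimotoKernel is typically dropped into a GPyTorch exact-GP regression model; it is standard GPyTorch boilerplate rather than library-provided code, and the training data are hypothetical placeholders.

# A minimal sketch (assumption: standard GPyTorch exact-GP boilerplate, not
# library-provided code) of GP regression over molecular fingerprints with
# the Tanimoto kernel.
import torch
import gpytorch
from gauche.kernels.fingerprint_kernels.tanimoto_kernel import TanimotoKernel

class TanimotoGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # ScaleKernel supplies the outputscale parameter TanimotoKernel lacks
        self.covar_module = gpytorch.kernels.ScaleKernel(TanimotoKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

# Hypothetical data: 10 molecules as 5-bit fingerprints with scalar labels
train_x = torch.randint(0, 2, (10, 5)).float()
train_y = torch.randn(10)
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = TanimotoGP(train_x, train_y, likelihood)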
Braun-Blanquet Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.braun_blanquet_kernel.BraunBlanquetKernel(**kwargs)[source]#
Computes a covariance matrix based on the Braun-Blanquet kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Braun-Blanquet}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle}{\max(\left\lVert\mathbf{x}\right\rVert_1, \left\lVert\mathbf{x'}\right\rVert_1)} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(BraunBlanquetKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(BraunBlanquetKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.braun_blanquet_kernel.batch_braun_blanquet_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Braun-Blanquet similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle / \max(\lVert x_1 \rVert_1, \lVert x_2 \rVert_1)\)
where \(\lVert \cdot \rVert_1\) is the L1 norm and \(\langle \cdot, \cdot \rangle\) is the inner product.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Braun-Blanquet similarity.
Dice Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.dice_kernel.DiceKernel(**kwargs)[source]#
Computes a covariance matrix based on the Dice kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Dice}}(\mathbf{x}, \mathbf{x'}) = \frac{2\langle\mathbf{x}, \mathbf{x'}\rangle}{\left\lVert\mathbf{x}\right\rVert + \left\lVert\mathbf{x'}\right\rVert} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(DiceKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(DiceKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.dice_kernel.batch_dice_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Dice similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(2\langle x_1, x_2 \rangle / (\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1)\)
where \(\lVert \cdot \rVert_1\) is the L1 norm and \(\langle \cdot, \cdot \rangle\) is the inner product.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Dice similarity.
Faith Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.faith_kernel.FaithKernel(**kwargs)[source]#
Computes a covariance matrix based on the Faith kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Faith}}(\mathbf{x}, \mathbf{x'}) = \frac{2\langle\mathbf{x}, \mathbf{x'}\rangle + d}{2n} \end{equation*}\]
where \(d\) is the number of common zeros and \(n\) is the dimension of the input vectors.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(FaithKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(FaithKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.faith_kernel.batch_faith_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Faith similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\((2\langle x_1, x_2 \rangle + d) / 2n\)
where \(\langle \cdot, \cdot \rangle\) is the inner product, \(d\) is the number of common zeros and \(n\) is the dimension of the input vectors.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Faith similarity.
Forbes Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.forbes_kernel.ForbesKernel(**kwargs)[source]#
Computes a covariance matrix based on the Forbes kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Forbes}}(\mathbf{x}, \mathbf{x'}) = \frac{n\langle\mathbf{x}, \mathbf{x'}\rangle}{\left\lVert\mathbf{x}\right\rVert_1 + \left\lVert\mathbf{x'}\right\rVert_1} \end{equation*}\]
where \(n\) is the dimension of the input vectors.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(ForbesKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(ForbesKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.forbes_kernel.batch_forbes_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Forbes similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(n\langle x_1, x_2 \rangle / (\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1)\)
where \(\langle \cdot, \cdot \rangle\) is the inner product, \(\lVert \cdot \rVert_1\) is the L1 norm, and \(n\) is the dimension of the input vectors.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Forbes similarity.
Inner Product Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.inner_product_kernel.InnerProductKernel(**kwargs)[source]#
Computes a covariance matrix based on the inner product kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Inner Product}}(\mathbf{x}, \mathbf{x'}) = \langle\mathbf{x}, \mathbf{x'}\rangle \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(InnerProductKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(InnerProductKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.inner_product_kernel.batch_inner_product_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Inner product similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle\)
where \(\langle \cdot, \cdot \rangle\) is the inner product.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the inner product similarity.
Intersection Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.intersection_kernel.IntersectionKernel(**kwargs)[source]#
Computes a covariance matrix based on the Intersection kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Intersection}}(\mathbf{x}, \mathbf{x'}) = \langle\mathbf{x}, \mathbf{x'}\rangle + \langle\mathbf{1} - \mathbf{x}, \mathbf{1} - \mathbf{x'}\rangle \end{equation*}\]
where \(\mathbf{1} - \mathbf{x}\) denotes the bit-flipped vector; defined for binary-valued vectors only.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(IntersectionKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(IntersectionKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.intersection_kernel.batch_intersection_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Intersection similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added. Must be used with binary-valued vectors only.
\(\langle x_1, x_2 \rangle + \langle x_1', x_2' \rangle\)
where \(\langle \cdot, \cdot \rangle\) is the inner product and \(x_1'\) and \(x_2'\) denote the bit-flipped vectors, with ones and zeros interchanged. A toy check of this construction appears after the parameter list below.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Intersection similarity.
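As a toy illustration of the bit-flip construction (plain PyTorch, not library code): the similarity counts the positions where two binary vectors agree, whether in their ones or in their zeros.

# Toy check of the bit-flip construction for two 4-bit vectors.
import torch

x1 = torch.tensor([1.0, 0.0, 1.0, 0.0])
x2 = torch.tensor([1.0, 0.0, 0.0, 0.0])

common_ones = torch.dot(x1, x2)            # <x1, x2> = 1
common_zeros = torch.dot(1 - x1, 1 - x2)   # <x1', x2'> = 2
similarity = common_ones + common_zeros    # 3: the vectors agree in 3 of 4 positions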
MinMax Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.minmax_kernel.MinMaxKernel(**kwargs)[source]#
Computes a covariance matrix based on the MinMax kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{MinMax}}(\mathbf{x}, \mathbf{x'}) = \frac{\sum_i \min(x_i, x'_i)}{\sum_i \max(x_i, x'_i)} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(MinMaxKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(MinMaxKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.minmax_kernel.batch_minmax_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
MinMax similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\((\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1 - \lVert x_1 - x_2 \rVert_1) / (\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1 + \lVert x_1 - x_2 \rVert_1)\)
where \(\lVert \cdot \rVert_1\) is the L1 norm. A numeric check of this identity appears after the parameter list below.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the MinMax similarity.
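For non-negative count vectors the L1-norm form above agrees with the sum-of-minima over sum-of-maxima expression in the class docstring; a quick standalone check (plain PyTorch, not library code):

# Verify that the L1-norm form equals sum(min) / sum(max) for count vectors.
import torch

x1 = torch.tensor([3.0, 0.0, 2.0, 1.0])
x2 = torch.tensor([1.0, 4.0, 2.0, 0.0])

l1 = lambda v: v.abs().sum()
norm_form = (l1(x1) + l1(x2) - l1(x1 - x2)) / (l1(x1) + l1(x2) + l1(x1 - x2))
minmax_form = torch.minimum(x1, x2).sum() / torch.maximum(x1, x2).sum()
assert torch.isclose(norm_form, minmax_form)  # both equal 3/10 here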
Otsuka Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.otsuka_kernel.OtsukaKernel(**kwargs)[source]#
Computes a covariance matrix based on the Otsuka kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Otsuka}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle}{\sqrt{\left\lVert\mathbf{x}\right\rVert_1 + \left\lVert\mathbf{x'}\right\rVert_1}} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(OtsukaKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(OtsukaKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.otsuka_kernel.batch_otsuka_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Otsuka similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle / \sqrt{\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1}\)
where \(\lVert \cdot \rVert_1\) is the L1 norm and \(\langle \cdot, \cdot \rangle\) is the inner product.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Otsuka similarity.
Rand Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.rand_kernel.RandKernel(**kwargs)[source]#
Computes a covariance matrix based on the Rand kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Rand}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle + d}{n} \end{equation*}\]
where \(d\) is the number of common zeros and \(n\) is the dimension of the input vectors.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RandKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RandKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.rand_kernel.batch_rand_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Rand similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\((\langle x_1, x_2 \rangle + d) / n\)
where \(\langle \cdot, \cdot \rangle\) is the inner product, \(d\) is the number of common zeros and \(n\) is the dimensionality.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Rand similarity.
Rogers-Tanimoto Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.rogers_tanimoto_kernel.RogersTanimotoKernel(**kwargs)[source]#
Computes a covariance matrix based on the Rogers-Tanimoto kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Rogers-Tanimoto}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle + d}{2\left\lVert\mathbf{x}\right\rVert_1 + 2\left\lVert\mathbf{x'}\right\rVert_1 - 3\langle\mathbf{x}, \mathbf{x'}\rangle + d} \end{equation*}\]
where \(d\) is the number of common zeros.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RogersTanimotoKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RogersTanimotoKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.rogers_tanimoto_kernel.batch_rogers_tanimoto_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Rogers-Tanimoto similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\((\langle x_1, x_2 \rangle + d) / (2\lVert x_1 \rVert_1 + 2\lVert x_2 \rVert_1 - 3\langle x_1, x_2 \rangle + d)\)
where \(\lVert \cdot \rVert_1\) is the L1 norm, \(\langle \cdot, \cdot \rangle\) is the inner product and \(d\) is the number of common zeros.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Rogers-Tanimoto similarity.
Russell-Rao Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.russell_rao_kernel.RussellRaoKernel(**kwargs)[source]#
Computes a covariance matrix based on the Russell and Rao kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Russell-Rao}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle}{n} \end{equation*}\]
where \(n\) is the dimension of the input vectors.
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RussellRaoKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(RussellRaoKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.russell_rao_kernel.batch_russell_rao_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Russell-Rao similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle / n\)
where \(\langle \cdot, \cdot \rangle\) is the inner product and \(n\) is the dimension of the vectors \(x_1\) and \(x_2\).
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Russell and Rao similarity.
Sogenfrei Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.sogenfrei_kernel.SogenfreiKernel(**kwargs)[source]#
Computes a covariance matrix based on the Sogenfrei kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Sogenfrei}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle^2}{\left\lVert\mathbf{x}\right\rVert_1 + \left\lVert\mathbf{x'}\right\rVert_1} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(SogenfreiKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(SogenfreiKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.sogenfrei_kernel.batch_sogenfrei_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Sogenfrei similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle^2 / (\lVert x_1 \rVert_1 + \lVert x_2 \rVert_1)\)
where \(\langle \cdot, \cdot \rangle\) is the inner product and \(\lVert \cdot \rVert_1\) is the L1 norm.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Sogenfrei similarity.
Sokal-Sneath Kernel. Operates on representations including bit vectors (e.g. Morgan/ECFP6 fingerprints) and count vectors (e.g. RDKit fragment features).
- class gauche.kernels.fingerprint_kernels.sokal_sneath_kernel.SokalSneathKernel(**kwargs)[source]#
Computes a covariance matrix based on the Sokal-Sneath kernel between inputs \(\mathbf{x_1}\) and \(\mathbf{x_2}\):
\[\begin{equation*} k_{\text{Sokal-Sneath}}(\mathbf{x}, \mathbf{x'}) = \frac{\langle\mathbf{x}, \mathbf{x'}\rangle}{2\left\lVert\mathbf{x}\right\rVert_1 + 2\left\lVert\mathbf{x'}\right\rVert_1 - 3\langle\mathbf{x}, \mathbf{x'}\rangle} \end{equation*}\]
Note
This kernel does not have an outputscale parameter. To add a scaling parameter, decorate this kernel with a gpytorch.kernels.ScaleKernel.
- Example:
>>> x = torch.randint(0, 2, (10, 5))
>>> # Non-batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(SokalSneathKernel())
>>> covar = covar_module(x)  # Output: LazyTensor of size (10 x 10)
>>>
>>> batch_x = torch.randint(0, 2, (2, 10, 5))
>>> # Batch: Simple option
>>> covar_module = gpytorch.kernels.ScaleKernel(SokalSneathKernel())
>>> covar = covar_module(batch_x)  # Output: LazyTensor of size (2 x 10 x 10)
- __init__(**kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- covar_dist(x1, x2, last_dim_is_batch=False, **params)[source]#
This is a helper method for computing the bit vector similarity between all pairs of points in x1 and x2.
- Parameters:
x1 – First set of data. Tensor of shape n x d or b1 x … x bk x n x d.
x2 – Second set of data. Tensor of shape m x d or b1 x … x bk x m x d.
last_dim_is_batch – Is the last dimension of the data a batch dimension or not? (bool, optional)
- Returns:
(Tensor, Tensor) corresponding to the distance matrix between x1 and x2. The shape depends on the kernel’s mode:
* diag=False
* diag=False and last_dim_is_batch=True: (b x d x n x n)
* diag=True
* diag=True and last_dim_is_batch=True: (b x d x n)
- forward(x1, x2, diag=False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
- gauche.kernels.fingerprint_kernels.sokal_sneath_kernel.batch_sokal_sneath_sim(x1: Tensor, x2: Tensor, eps: float = 1e-06) Tensor [source]#
Sokal-Sneath similarity between two batched tensors, across the last two dimensions. The eps argument ensures numerical stability if all-zero tensors are added.
\(\langle x_1, x_2 \rangle / (2\lVert x_1 \rVert_1 + 2\lVert x_2 \rVert_1 - 3\langle x_1, x_2 \rangle)\)
where \(\langle \cdot, \cdot \rangle\) is the inner product and \(\lVert \cdot \rVert_1\) is the L1 norm.
- Parameters:
x1 – [b x n x d] Tensor where b is the batch dimension
x2 – [b x m x d] Tensor
eps – Float for numerical stability. Default value is 1e-6
- Returns:
Tensor denoting the Sokal-Sneath similarity.
Graph Kernels#
- class gauche.kernels.graph_kernels.EdgeHistogramKernel(edge_label, dtype=torch.float32)[source]#
A GraKel wrapper for the edge histogram kernel. This kernel requires edge labels to be specified.
See https://ysig.github.io/GraKeL/0.1a8/kernels/edge_histogram.html for more details.
- class gauche.kernels.graph_kernels.GraphletSamplingKernel(dtype=torch.float32)[source]#
A GraKel wrapper for the graphlet sampling kernel. This kernel only works on unlabelled graphs.
See https://ysig.github.io/GraKeL/0.1a8/kernels/graphlet_sampling.html for more details.
- class gauche.kernels.graph_kernels.NeighborhoodHashKernel(node_label: str, dtype=torch.float32)[source]#
A GraKel wrapper for the neighborhood hash kernel. This kernel requires node labels to be specified.
See https://ysig.github.io/GraKeL/0.1a8/kernels/neighborhood_hash.html for more details.
- class gauche.kernels.graph_kernels.RandomWalkKernel(dtype=torch.float32)[source]#
A GraKel wrapper for the random walk kernel. This kernel only works on unlabelled graphs. See RandomWalkLabeledKernel for labelled graphs.
See https://ysig.github.io/GraKeL/0.1a8/kernels/random_walk.html for more details.
- class gauche.kernels.graph_kernels.RandomWalkLabeledKernel(node_label: str, dtype=torch.float32)[source]#
A GraKel wrapper for the random walk kernel. This kernel requires node labels to be specified.
See https://ysig.github.io/GraKeL/0.1a8/kernels/random_walk.html for more details.
- class gauche.kernels.graph_kernels.ShortestPathKernel(dtype=torch.float32)[source]#
A GraKel wrapper for the shortest path kernel. This kernel only works on unlabelled graphs. See ShortestPathLabeledKernel for labelled graphs.
See https://ysig.github.io/GraKeL/0.1a8/kernels/shortest_path.html for more details.
- class gauche.kernels.graph_kernels.ShortestPathLabeledKernel(node_label: str, dtype=torch.float32)[source]#
A GraKel wrapper for the shortest path kernel. This kernel requires node labels to be specified.
See https://ysig.github.io/GraKeL/0.1a8/kernels/shortest_path.html for more details.
- class gauche.kernels.graph_kernels.VertexHistogramKernel(node_label: str, dtype=torch.float32)[source]#
A GraKel wrapper for the vertex histogram kernel. This kernel requires node labels to be specified.
See https://ysig.github.io/GraKeL/0.1a8/kernels/vertex_histogram.html for more details.
- class gauche.kernels.graph_kernels.WeisfeilerLehmanKernel(node_label: str, edge_label: str | None = None, dtype=torch.float32)[source]#
A GraKel wrapper for the Weisfeiler-Lehman kernel. This kernel needs node labels to be specified and can optionally use edge labels for the base kernel.
See https://ysig.github.io/GraKeL/0.1a8/kernels/weisfeiler_lehman.html for more details.
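A minimal instantiation sketch for the GraKel wrappers above, following their signatures; the label names "element" and "bond" are hypothetical and must match the node and edge labels actually present in your graphs.

# Instantiating a labelled and an unlabelled graph kernel wrapper.
from gauche.kernels.graph_kernels import (
    GraphletSamplingKernel,
    WeisfeilerLehmanKernel,
)

wl_kernel = WeisfeilerLehmanKernel(node_label="element", edge_label="bond")
gs_kernel = GraphletSamplingKernel()  # works on unlabelled graphs only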
String Kernels#
- class gauche.kernels.string_kernels.SubsequenceStringKernel(embds, index, alphabet=[], maxlen=80, batch_size=1000, _gap_decay=0.5, _match_decay=0.2, _order_coefs=[1.0, 0.5, 0.25, 0.125, 0.0625], normalize=True, **kwargs)[source]#
- __init__(embds, index, alphabet=[], maxlen=80, batch_size=1000, _gap_decay=0.5, _match_decay=0.2, _order_coefs=[1.0, 0.5, 0.25, 0.125, 0.0625], normalize=True, **kwargs)[source]#
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(X1: Tensor, X2: Tensor, diag: bool = False, **params)[source]#
Computes the covariance between \(\mathbf x_1\) and \(\mathbf x_2\). This method should be implemented by all Kernel subclasses.
- Parameters:
x1 – First set of data (… x N x D).
x2 – Second set of data (… x M x D).
diag – Should the Kernel compute the whole kernel, or just the diag? If True, it must be the case that x1 == x2. (Default: False.)
last_dim_is_batch – If True, treat the last dimension of x1 and x2 as another batch dimension. (Useful for additive structure over the dimensions). (Default: False.)
- Returns:
The kernel matrix or vector. The shape depends on the kernel’s evaluation mode:
full_covar: … x N x M
full_covar with last_dim_is_batch=True: … x K x N x M
diag: … x N
diag with last_dim_is_batch=True: … x K x N
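A construction sketch for SubsequenceStringKernel following the signature above; the one-hot embeddings and the character-to-row index mapping are assumptions about the expected inputs, not documented behaviour.

# Hypothetical construction of the subsequence string kernel for a small
# alphabet; embds and index follow the constructor signature above.
import torch
from gauche.kernels.string_kernels import SubsequenceStringKernel

alphabet = list("ACGT")                           # hypothetical alphabet
embds = torch.eye(len(alphabet))                  # one-hot character embeddings
index = {ch: i for i, ch in enumerate(alphabet)}  # character -> embedding row

kernel = SubsequenceStringKernel(
    embds, index, alphabet=alphabet, maxlen=80, normalize=True
)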