model_signing

Universal model signing library.

The API is split into 3 main components (and a glue model_signing.manifest module for data types used in public interfaces):

  • model_signing.hashing: responsible for generating a list of hashes for every component of the model. A component could be a file, a file shard, a tensor, etc., depending on the method used. We currently support only files and file shards. The result of hashing is a manifest, a listing of hashes for every object in the model.
  • model_signing.signing: responsible for taking the manifest and generating a signature, based on a signing configuration. The signing configuration can select the method used to sign as well as the parameters.
  • model_signing.verifying: responsible for taking a signature and verifying it. If the cryptographic parts of the signature can be validated, the verification layer returns an expanded manifest which can then be compared against a manifest obtained from hashing the existing model. If the two manifests don't match, the model's integrity has been compromised and the model_signing package has detected it.

The first two of these components allow configuration but can also be used directly, with a default configuration. The verification component is the exception: the verification method must always be configured, since there are no sensible defaults that can be used.
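The manifest comparison described above can be illustrated with a minimal, standard-library sketch. This is not the `model_signing` implementation: `build_manifest` and `diff_manifests` are hypothetical helpers, and the sketch assumes file-level SHA-256 hashing with a plain dictionary standing in for the manifest.

```python
import hashlib
import pathlib


def build_manifest(model_dir):
    """Map each file's POSIX-style relative path to its SHA-256 hex digest.

    Hypothetical helper: a plain dict stands in for the library's manifest.
    """
    root = pathlib.Path(model_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(expected, actual):
    """Return the sorted paths that were added, removed, or changed."""
    return sorted(
        name
        for name in expected.keys() | actual.keys()
        if expected.get(name) != actual.get(name)
    )
```

An empty result from `diff_manifests` corresponds to the case where the manifest recovered from the signature matches the manifest of the model on disk; any listed path indicates a file that no longer matches what was signed.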

Signing can be done using the default configuration:

model_signing.signing.sign("finbert", "finbert.sig")

This example generates the signature using Sigstore.

Alternatively, a custom configuration can be selected, for both signing and hashing:

model_signing.signing.Config().use_elliptic_key_signer(
    private_key="key"
).set_hashing_config(
    model_signing.hashing.Config().set_ignored_paths(
        paths=["README.md"], ignore_git_paths=True
    )
).sign("finbert", "finbert.sig")

This example generates a signature using a private key based on elliptic curve cryptography. When hashing the model, it also ignores README.md and any git-related files present in the model directory.
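To make the effect of ignored paths concrete, the following sketch shows one way such filtering could work when collecting the files to hash. It is an illustration only: `files_to_hash` and the `GIT_NAMES` set are assumptions for this example, and the names the library actually treats as git-related may differ.

```python
import pathlib

# Assumed set of git-related names; the library's actual list may differ.
GIT_NAMES = {".git", ".gitignore", ".gitattributes", ".github"}


def files_to_hash(model_dir, ignored_paths=(), ignore_git_paths=True):
    """List relative paths of files that would be hashed (hypothetical helper)."""
    root = pathlib.Path(model_dir)
    ignored = [pathlib.PurePosixPath(p) for p in ignored_paths]
    kept = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = pathlib.PurePosixPath(path.relative_to(root).as_posix())
        # Skip anything that is, or lives under, a git-related name.
        if ignore_git_paths and GIT_NAMES.intersection(rel.parts):
            continue
        # Skip explicitly ignored files and everything below ignored directories.
        if any(rel == p or p in rel.parents for p in ignored):
            continue
        kept.append(rel.as_posix())
    return kept
```

Because ignored files never enter the manifest, changing them later does not affect verification, provided the same paths are ignored when the model is hashed again.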

We also support signing with signing certificates, using an API similar to the key-based example above.

When verifying, we need to set the cryptographic configuration so that the code knows how to parse the signature.

For the Sigstore example, the simplest verification example would be:

model_signing.verifying.Config().use_sigstore_verifier(
    identity=identity, oidc_issuer=oidc_provider
).verify("finbert", "finbert.sig")

Here, identity and oidc_provider are the values obtained from the OIDC flow during signing.

To verify the private key example, we could use the following:

model_signing.verifying.Config().use_elliptic_key_verifier(
    public_key="key.pub"
).set_hashing_config(
    model_signing.hashing.Config().use_shard_serialization()
).verify("finbert", "finbert.sig")

Alternatively, we also support automatic detection of the hashing configuration during the verification process. So, the following should also work:

model_signing.verifying.Config().use_elliptic_key_verifier(
    public_key="key.pub"
).verify("finbert", "finbert.sig")

Note that we still need to set the verification configuration: it sets up the cryptographic primitives used to verify the signature and determines how the signature file is parsed.

For any signing method, the signature is a Sigstore bundle which contains the verification material (the information needed to verify the signature) and the payload. The verification material depends on the method used for signing.

The payload in the signature is a DSSE envelope which contains an in-toto statement. The in-toto statement contains the actual metadata that gets signed, and in our case is a custom predicate that identifies all the components of the model.
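The nesting of statement and envelope can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: the `predicateType` URI and predicate body are placeholders, `sign` is a caller-supplied callable standing in for a real signer, and the surrounding Sigstore bundle with its verification material is not shown. The statement fields and the pre-authentication encoding (PAE) follow the public in-toto v1 and DSSE specifications.

```python
import base64
import json

PAYLOAD_TYPE = "application/vnd.in-toto+json"


def make_statement(file_digests):
    """Build an in-toto v1 statement listing each file as a subject."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": name, "digest": {"sha256": digest}}
            for name, digest in sorted(file_digests.items())
        ],
        # Placeholder predicate type and body, for illustration only.
        "predicateType": "https://example.com/hypothetical-model-predicate/v1",
        "predicate": {"serialization": "files"},
    }


def pae(payload_type, payload):
    """DSSE pre-authentication encoding: the exact bytes that get signed."""
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes), type_bytes, len(payload), payload
    )


def make_envelope(statement, sign):
    """Wrap a statement in a DSSE envelope; `sign` maps bytes to bytes."""
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    signature = sign(pae(PAYLOAD_TYPE, payload))
    return {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": PAYLOAD_TYPE,
        "signatures": [{"sig": base64.b64encode(signature).decode("ascii")}],
    }
```

A verifier reverses these steps: it base64-decodes the payload, recomputes the PAE bytes, checks the signature over them, and only then trusts the digests listed in the statement.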

Read more on the repository's README.md. The CLI that maps over the API is also documented there.

# Copyright 2024 The Sigstore Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Universal model signing library.

The API is split into 3 main components (and a glue `model_signing.manifest`
module for data types used in public interfaces):

- `model_signing.hashing`: responsible for generating a list of hashes for
  every component of the model. A component could be a file, a file shard, a
  tensor, etc., depending on the method used. We currently support only files
  and file shards. The result of hashing is a manifest, a listing of hashes for
  every object in the model.
- `model_signing.signing`: responsible for taking the manifest and generating a
  signature, based on a signing configuration. The signing configuration can
  select the method used to sign as well as the parameters.
- `model_signing.verifying`: responsible for taking a signature and verifying
  it. If the cryptographic parts of the signature can be validated, the
  verification layer returns an expanded manifest which can then be compared
  against a manifest obtained from hashing the existing model. If the two
  manifests don't match, the model's integrity has been compromised and the
  `model_signing` package has detected it.

The first two of these components allow configuration but can also be used
directly, with a default configuration. The verification component is the
exception: the verification method must always be configured, since there are
no sensible defaults that can be used.

Signing can be done using the default configuration:

```python
model_signing.signing.sign("finbert", "finbert.sig")
```

This example generates the signature using Sigstore.

Alternatively, a custom configuration can be selected, for both signing and
hashing:

```python
model_signing.signing.Config().use_elliptic_key_signer(
    private_key="key"
).set_hashing_config(
    model_signing.hashing.Config().set_ignored_paths(
        paths=["README.md"], ignore_git_paths=True
    )
).sign("finbert", "finbert.sig")
```

This example generates a signature using a private key based on elliptic curve
cryptography. When hashing the model, it also ignores `README.md` and any
git-related files present in the model directory.

We also support signing with signing certificates, using an API similar to the
key-based example above.

When verifying, we need to set the cryptographic configuration so that the
code knows how to parse the signature.

For the Sigstore example, the simplest verification example would be:

```python
model_signing.verifying.Config().use_sigstore_verifier(
    identity=identity, oidc_issuer=oidc_provider
).verify("finbert", "finbert.sig")
```

Here, `identity` and `oidc_provider` are the values obtained from the OIDC flow
during signing.

To verify the private key example, we could use the following:

```python
model_signing.verifying.Config().use_elliptic_key_verifier(
    public_key="key.pub"
).set_hashing_config(
    model_signing.hashing.Config().use_shard_serialization()
).verify("finbert", "finbert.sig")
```

Alternatively, we also support automatic detection of the hashing configuration
during the verification process. So, the following should also work:

```python
model_signing.verifying.Config().use_elliptic_key_verifier(
    public_key="key.pub"
).verify("finbert", "finbert.sig")
```

Note that we still need to set the verification configuration: it sets up the
cryptographic primitives used to verify the signature and determines how the
signature file is parsed.

For any signing method, the signature is a
[Sigstore bundle](https://docs.sigstore.dev/about/bundle/) which contains the
verification material (the information needed to verify the signature) and the
payload. The verification material depends on the method used for signing.

The payload in the signature is a
[DSSE envelope](https://github.com/secure-systems-lab/dsse) which contains an
[in-toto statement](https://github.com/in-toto/attestation/tree/main/spec/v1).
The in-toto statement contains the actual metadata that gets signed, and in our
case is a custom predicate that identifies all the components of the model.

Read more [on the repository's `README.md`][repo]. The CLI that maps over the
API is also documented there.

[repo]: https://github.com/sigstore/model-transparency/blob/main/README.md
"""

from model_signing import hashing
from model_signing import manifest
from model_signing import signing
from model_signing import verifying


__version__ = "1.0.0"


__all__ = ["hashing", "signing", "verifying", "manifest"]
130
131__all__ = ["hashing", "signing", "verifying", "manifest"]