Security & Compliance

Scribble is committed to the security of client data. This is achieved through a combination of industry best practices followed by Scribble and Enrich, and is also governed by the client's security policies. Scribble operates transparently and in collaboration with clients to achieve their security and privacy goals.

A few observations about the service:

  1. Private deployment. The Enrich data preparation platform operates behind the customer's firewall, on machines that are not accessible from the public internet.
  2. Restricted access. Enrich is not a general-purpose system accessible to end customers or to all employees of the client organization. The client determines who has access to the data, how, and why. Typically, access is restricted to ML engineers.
  3. Security features. The service has a number of security-enhancing features:

    (a) All access uses SSL/HTTPS.
    (b) All access requires login, except API access, which uses an API key.
    (c) All credentials are obfuscated or encrypted as needed on the server.
    (d) Most user activity is logged.
    (e) Downloads of data are encrypted by default, and can be disabled.
    (f) Upgrades are allowed only for users with 'staff' permissions. Typically only one person at the client has staff permission.
    (g) Security-enhancing data prep modules, such as libraries and transforms, are available.
    (h) The metadata and admin interfaces include elements for handling sensitive data.

  4. Enrich has a compliance 'app' to help with compliance tasks outside Enrich itself.
  5. Scribble is often tasked with generating datasets as a service. The server is also upgraded regularly with new features because the space moves fast. As a result, Scribble usually has access to the server running Enrich. As a security-conscious company, we take a number of precautions to ensure security and compliance:

    1. Only one staff member, usually the CEO, has access to the SSH keys. Keys and other configuration files are automatically encrypted with PGP and stored in GitHub. They are unlocked only when machine access is required.
    2. Staff laptops use encrypted filesystems.
    3. Any access follows a protocol developed in conjunction with the client, including the use of jump servers and VPN.
    4. Data is often required for local testing of the data prep pipelines. A cap is placed on the data transferred (say, 0.01% of the volume) and/or the transfer is coordinated with the client. In most cases, we request anonymized data for the data prep, since most data prep doesn't require PII.
    5. The client, if required, can perform a full code review of the Enrich server at any point in time.
    6. All client-specific data preparation modules are kept completely separate from the Enrich codebase and are stored only in the client's GitHub repository. None of the Enrich modules are client-specific.
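
To make the transfer cap and the preference for data without PII more concrete, here is a minimal Python sketch. It is illustrative only and does not describe how Enrich or Scribble actually implement these controls. The function names (`capped_row_count`, `pseudonymize`, `build_test_extract`), the parts-per-million representation of the 0.01% cap, and the use of a keyed hash to mask fields are all assumptions made for this example. Note that keyed hashing is pseudonymization, which is weaker than full anonymization.

```python
import hashlib
import hmac

# Hypothetical cap: 0.01% of volume, expressed as parts per million so the
# arithmetic stays in exact integers (0.01% == 100 ppm).
DEFAULT_CAP_PPM = 100


def capped_row_count(total_rows: int, cap_ppm: int = DEFAULT_CAP_PPM) -> int:
    """Return the maximum number of rows allowed in a local test extract."""
    if total_rows < 0 or cap_ppm < 0:
        raise ValueError("total_rows and cap_ppm must be non-negative")
    return total_rows * cap_ppm // 1_000_000


def pseudonymize(value: str, secret: bytes) -> str:
    """Replace a PII value with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()


def build_test_extract(rows, pii_fields, secret, cap_ppm=DEFAULT_CAP_PPM):
    """Take at most the capped number of rows and mask the listed PII fields."""
    limit = capped_row_count(len(rows), cap_ppm)
    extract = []
    for row in rows[:limit]:  # first rows only; a real extract might sample
        masked = dict(row)  # copy so the source rows are not modified
        for field in pii_fields:
            if masked.get(field) is not None:
                masked[field] = pseudonymize(str(masked[field]), secret)
        extract.append(masked)
    return extract


if __name__ == "__main__":
    # Made-up demo data: 20,000 rows with one PII column.
    rows = [{"id": i, "email": f"user{i}@example.com"} for i in range(20_000)]
    sample = build_test_extract(rows, pii_fields=["email"], secret=b"demo-key")
    print(len(sample), sample[0]["id"], len(sample[0]["email"]))
```

With a 100 ppm cap, a 20,000-row dataset yields at most 2 rows for local testing, and the `email` values are replaced by hex digests. In practice the cap, the list of sensitive fields, and the masking method would be agreed with the client.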

The details of compliance vary by client. A detailed compliance checklist can be found on the resources page of the server.