
Installation

This Python-based SDK is still experimental. The distribution will be freely available once the SDK reaches general availability.

Supported Platforms

  1. Ubuntu 20.04 or above (other Linux distributions should work as well, but may need manual adjustments)
  2. Python 3.8 or above
  3. Cloud Platforms: AWS, GCP, and Azure (minimal)
  4. On-prem services: VMware vSphere, RHEL 8.7+
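
As a quick pre-install check, the Python requirement above can be verified from the interpreter itself. This is a minimal sketch using only the standard library; the helper name `meets_minimum` is hypothetical and not part of the SDK.

```python
import platform
import sys

# Minimum interpreter version stated in the Supported Platforms list.
MIN_PYTHON = (3, 8)


def meets_minimum(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if (major, minor) of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # Report what is running; this does not check the Linux distribution.
    print("Python", platform.python_version(), "on", platform.system())
    if not meets_minimum():
        print("Python 3.8 or above is required")
```

Run it with the same interpreter that will host the virtual environment, since several Python versions are often installed side by side.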

Download

EnrichSDK is distributed via PyPI. First, make sure that the OS dependencies are installed.

$ sudo apt-get update
$ sudo apt-get install python3.8-dev python3.8-venv # or any other version of python3
$ workon scribble # either this (virtualenvwrapper) or the next two lines
$ python3 -m venv venv
$ source venv/bin/activate

Then, install:

$ pip3 install wheel enrichsdk
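
To confirm that the install landed in the active environment, the package metadata can be queried from Python. The sketch below assumes the distribution name is `enrichsdk`, as in the pip command above; `installed_version` is a hypothetical helper built on the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "enrichsdk" is the distribution name used in the pip command above.
    found = installed_version("enrichsdk")
    if found is None:
        print("enrichsdk is not installed in this environment")
    else:
        print("enrichsdk", found)
```

If this reports that the package is missing right after a successful install, the usual cause is that pip3 and the interpreter belong to different environments.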

Contents

The SDK is a typical Python package with a growing set of modules. Check with Scribble about which modules are supported:

enrichsdk
├── app
│   ├── bases
│   │   ├── __init__.py
│   │   ├── policyapp
│   │   │   ├── forms.py
│   │   │   ├── __init__.py
│   │   │   ├── models.py
│   │   │   └── views.py
│   │   └── singlepageapp
│   │       ├── forms.py
│   │       ├── __init__.py
│   │       ├── models.py
│   │       └── views.py
│   ├── __init__.py
│   └── utils.py
├── contrib
│   ├── __init__.py
│   ├── lib
│   │   ├── assets
│   │   │   ├── anonymizer.py
│   │   │   ├── changepoints.py
│   │   │   ├── __init__.py
│   │   │   ├── llmtextgen.py
│   │   │   ├── profilespec.py
│   │   │   └── syndata.py
│   │   ├── catalog.py
│   │   ├── __init__.py
│   │   ├── logprocessor.py
│   │   └── transforms
│   │       ├── anomalies
│   │       ├── changepoints
│   │       ├── data_quality
│   │       ├── feature_compute
│   │       ├── filebased_query_executor
│   │       │   ├── __init__.py
│   │       │   └── lib.py
│   │       ├── fileops
│   │       ├── inmemory_query_executor
│   │       ├── metrics
│   │       ├── notebook_executor
│   │       ├── observability
│   │       └── synthetic_data_generator
│   └── transforms
│       ├── fileops
│       ├── jsonsink
│       ├── jsonsource
│       ├── pqexport
│       ├── sqlexport
│       ├── tablesink
│       └── tablesource
├── core
│   ├── frames.py
│   ├── __init__.py
│   ├── mixins.py
│   ├── node.py
│   ├── patch.py
│   ├── render.py
│   ├── res.py
│   ├── run.py
│   ├── sdk.py
│   ├── state.py
│   └── widgets.py
├── datasets
│   ├── discover.py
│   ├── doodle.py
│   ├── generators.py
│   └── __init__.py
├── extractors
├── feature_compute
├── featurestore
│   └── schema.py
├── lib
│   ├── context.py
│   ├── customer.py
│   ├── exceptions.py
│   ├── integration.py
│   ├── misc.py
│   ├── permissions.py
│   ├── renderlib.py
│   └── resources.py
├── quality
│   ├── base.py
│   ├── exceptions.py
│   ├── expectations.py
│   ├── __init__.py
│   ├── reconciliation.py
│   └── transforms.py
├── scripts
│   ├── enrichpkg.py
│   └── __init__.py
├── tasks
│   ├── dummy_task
│   │   └── __init__.py
│   ├── __init__.py
│   └── sdk.py
├── templates
│   ├── airflow
│   │   └── contrib-pipeline-v1.py
│   ├── assets
│   │   └── datasets.py
│   ├── dashboard
│   │   ├── apps.py
│   │   ├── catalog.py
│   │   ├── custom.py
│   │   ├── __init__.py
│   │   ├── persona.py
│   │   ├── tasks.py
│   │   ├── urls.py
│   │   └── views.py
│   ├── metrics
│   │   ├── __init__.py
│   │   └── profilespec.py
│   ├── prefect
│   │   └── default.py
│   └── spark
│       ├── dependencies
│       │   ├── enrich.py
│       │   ├── __init__.py
│       │   ├── logging.py
│       │   └── spark.py
│       └── jobs
│           └── run_spark.py
└── utils
    ├── excel.py
    ├── __init__.py
    ├── redis.py
    └── sample.py
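
Because the set of modules changes between releases, it may help to list what the installed copy actually ships. The following is a sketch using the standard library only; `list_submodules` is a hypothetical helper. It reports only immediate submodules, and directories without an `__init__.py` are not listed by `pkgutil`.

```python
import importlib.util
import pkgutil


def list_submodules(package_name):
    """Return sorted names of a package's immediate submodules ([] if not a package)."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a plain module rather than a package.
        return []
    return sorted(
        module.name
        for module in pkgutil.iter_modules(spec.submodule_search_locations)
    )


if __name__ == "__main__":
    # Assumes the import name matches the top of the tree shown above.
    for name in list_submodules("enrichsdk"):
        print(name)
```

Comparing this output with the tree above shows which top-level modules are present in the installed version.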

Check Installation

After creating a virtual environment and installing the SDK, a command called enrichpkg becomes available. It supports several actions:

$ enrichpkg
Usage: enrichpkg [OPTIONS] COMMAND [ARGS]...

  init/test/install Enrich modules and access server

  Getting started:
     version:  Version of this sdk
     start:    First time instructions
     env:      Setup/check the setup

  Development:
     init:   Bootstrap modules including transforms*
     test:   Test transforms, manage datasets
     doodle: Access Doodle metadata server
     manage: Manage services such as mongo

  Server:
     api:       Access the server API

  Utils:
     sample:    Sample data for sharing

  Helpers:
     show-log:  Pretty print log output

  *Command used to be called bootstrap

Options:
  --help  Show this message and exit.
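
The same check can be scripted, for example in a provisioning job. This sketch looks for `enrichpkg` on the PATH and runs it with `--help`, the one option shown in the usage text above. `run_cli` is a hypothetical helper, and the exit code and output are printed as-is rather than assumed.

```python
import shutil
import subprocess


def run_cli(command, *args):
    """Run a command found on PATH; return (returncode, output) or None if missing."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, *args], capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    outcome = run_cli("enrichpkg", "--help")
    if outcome is None:
        print("enrichpkg not found; is the virtual environment active?")
    else:
        code, output = outcome
        print("exit code:", code)
        print(output)
```

A missing command usually means the virtual environment where the SDK was installed is not activated in the current shell.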