Installation
This Python-based SDK is still experimental. The distribution will be freely available once the SDK reaches general availability.
Supported Platforms
- Ubuntu 20.04 or above (other Linux distributions should also work, but may need manual adjustments)
- Python 3.8 or above only
- Cloud Platforms: AWS, GCP, and Azure (minimal)
- On-prem services: VMware vSphere, RHEL 8.7+
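
The requirements above can be checked before installing. The following is a minimal sketch, not part of the SDK: the helper names (`check_python`, `parse_os_release`, `is_supported_ubuntu`) are hypothetical, and it assumes the host exposes the standard `/etc/os-release` file.

```python
import pathlib
import sys

MIN_PYTHON = (3, 8)   # minimum Python version from the list above
MIN_UBUNTU = (20, 4)  # Ubuntu 20.04 from the list above


def check_python(version_info=None, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def parse_os_release(text):
    """Parse KEY=VALUE lines (os-release format) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def is_supported_ubuntu(fields, minimum=MIN_UBUNTU):
    """Return True for Ubuntu releases at or above the minimum."""
    if fields.get("ID") != "ubuntu":
        return False
    try:
        major, minor = fields.get("VERSION_ID", "").split(".")[:2]
        return (int(major), int(minor)) >= tuple(minimum)
    except ValueError:
        # Missing or unexpected VERSION_ID format
        return False


if __name__ == "__main__":
    print("Python OK:", check_python())
    os_release = pathlib.Path("/etc/os-release")
    if os_release.exists():
        fields = parse_os_release(os_release.read_text())
        print("Ubuntu 20.04+:", is_supported_ubuntu(fields))
```

A `False` result for the Ubuntu check does not mean the SDK cannot run; it only flags a platform outside the tested list.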
Download
EnrichSDK is distributed via PyPI. First, make sure the OS dependencies are installed.
$ sudo apt-get update
$ sudo apt-get install python3.8-dev python3.8-venv # or any other version of python3
$ workon scribble # use either this line or the next
$ python3 -m venv venv
Then, install:
Contents
The SDK is a typical Python package with a growing set of modules. Check with Scribble about which modules are supported:
enrichsdk
├── app
│   ├── bases
│   │   ├── __init__.py
│   │   ├── policyapp
│   │   │   ├── forms.py
│   │   │   ├── __init__.py
│   │   │   ├── models.py
│   │   │   └── views.py
│   │   └── singlepageapp
│   │       ├── forms.py
│   │       ├── __init__.py
│   │       ├── models.py
│   │       └── views.py
│   ├── __init__.py
│   └── utils.py
├── contrib
│   ├── __init__.py
│   ├── lib
│   │   ├── assets
│   │   │   ├── anonymizer.py
│   │   │   ├── changepoints.py
│   │   │   ├── __init__.py
│   │   │   ├── llmtextgen.py
│   │   │   ├── profilespec.py
│   │   │   └── syndata.py
│   │   ├── catalog.py
│   │   ├── __init__.py
│   │   ├── logprocessor.py
│   │   └── transforms
│   │       ├── anomalies
│   │       ├── changepoints
│   │       ├── data_quality
│   │       ├── feature_compute
│   │       ├── filebased_query_executor
│   │       │   ├── __init__.py
│   │       │   └── lib.py
│   │       ├── fileops
│   │       ├── inmemory_query_executor
│   │       ├── metrics
│   │       ├── notebook_executor
│   │       ├── observability
│   │       └── synthetic_data_generator
│   └── transforms
│       ├── fileops
│       ├── jsonsink
│       ├── jsonsource
│       ├── pqexport
│       ├── sqlexport
│       ├── tablesink
│       └── tablesource
├── core
│   ├── frames.py
│   ├── __init__.py
│   ├── mixins.py
│   ├── node.py
│   ├── patch.py
│   ├── render.py
│   ├── res.py
│   ├── run.py
│   ├── sdk.py
│   ├── state.py
│   └── widgets.py
├── datasets
│   ├── discover.py
│   ├── doodle.py
│   ├── generators.py
│   └── __init__.py
├── extractors
├── feature_compute
├── featurestore
│   └── schema.py
├── lib
│   ├── context.py
│   ├── customer.py
│   ├── exceptions.py
│   ├── integration.py
│   ├── misc.py
│   ├── permissions.py
│   ├── renderlib.py
│   └── resources.py
├── quality
│   ├── base.py
│   ├── exceptions.py
│   ├── expectations.py
│   ├── __init__.py
│   ├── reconciliation.py
│   └── transforms.py
├── scripts
│   ├── enrichpkg.py
│   └── __init__.py
├── tasks
│   ├── dummy_task
│   │   └── __init__.py
│   ├── __init__.py
│   └── sdk.py
├── templates
│   ├── airflow
│   │   └── contrib-pipeline-v1.py
│   ├── assets
│   │   └── datasets.py
│   ├── dashboard
│   │   ├── apps.py
│   │   ├── catalog.py
│   │   ├── custom.py
│   │   ├── __init__.py
│   │   ├── persona.py
│   │   ├── tasks.py
│   │   ├── urls.py
│   │   └── views.py
│   ├── metrics
│   │   ├── __init__.py
│   │   └── profilespec.py
│   ├── prefect
│   │   └── default.py
│   └── spark
│       ├── dependencies
│       │   ├── enrich.py
│       │   ├── __init__.py
│       │   ├── logging.py
│       │   └── spark.py
│       └── jobs
│           └── run_spark.py
└── utils
    ├── excel.py
    ├── __init__.py
    ├── redis.py
    └── sample.py
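
Because the set of modules is still growing, it can help to compare the listing above with what is actually installed. The sketch below is an illustration rather than an SDK feature: `top_level_modules` is a hypothetical helper, and it assumes the import name is `enrichsdk`, matching the root of the tree. It uses only the standard library and does not import the package itself.

```python
import importlib.util
import pkgutil


def top_level_modules(package_name):
    """List (name, is_package) pairs directly under an installed package.

    Returns an empty list when the package is missing or is a plain module.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.submodule_search_locations is None:
        return []
    locations = list(spec.submodule_search_locations)
    return sorted(
        (info.name, info.ispkg) for info in pkgutil.iter_modules(locations)
    )


if __name__ == "__main__":
    # Assumption: the SDK is importable as "enrichsdk"
    for name, is_pkg in top_level_modules("enrichsdk"):
        kind = "package" if is_pkg else "module"
        print(f"{name} ({kind})")
```

Directories that lack an `__init__.py` are not reported by `pkgutil.iter_modules`, so the output may be shorter than the tree.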
Check Installation
After creating a virtual environment and installing the SDK, you will
have access to an Enrich command called enrichpkg. It supports several
actions:
$ enrichpkg
Usage: enrichpkg [OPTIONS] COMMAND [ARGS]...

  init/test/install Enrich modules and access server

  Getting started:
    version: Version of this sdk
    start: First time instructions
    env: Setup/check the setup

  Development:
    init: Bootstrap modules including transforms*
    test: Test transforms, manage datasets
    doodle: Access Doodle metadata server
    manage: Manage services such as mongo

  Server:
    api: Access the server API

  Utils:
    sample: Sample data for sharing

  Helpers:
    show-log: Pretty print log output

  *Command used to be called bootstrap

Options:
  --help  Show this message and exit.
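
The same check can be scripted, for example in a CI job. This is a minimal sketch under stated assumptions: `find_cli`, `in_virtualenv`, and `run_cli` are hypothetical helpers, and the only enrichpkg command used is `version`, taken from the help text above. Its output format is not assumed, so the script just prints whatever is returned.

```python
import shutil
import subprocess
import sys


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def find_cli(name="enrichpkg"):
    """Return the full path of the command if it is on PATH, else None."""
    return shutil.which(name)


def run_cli(argv, timeout=30):
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    path = find_cli()
    if path is None:
        print("enrichpkg not found on PATH; is the virtual environment active?")
    else:
        code, output = run_cli([path, "version"])
        print("exit code:", code)
        print(output)
```

If `find_cli` returns `None`, the most likely cause is that the virtual environment in which the SDK was installed is not active in the current shell.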