Installation
This Python-based SDK is still experimental. The distribution will be freely available once the SDK reaches general availability.
Supported Platforms
- Ubuntu 20.04 or above (other Linux distributions should work as well, but may require manual workarounds)
- Python 3.8 or above
- Cloud Platforms: AWS, GCP, and Azure (minimal)
- On-prem services: VMware vSphere, RHEL 8.7+
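The Python requirement above can be checked before going further. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of the SDK.

```python
import platform
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the list above


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def describe_platform():
    """Return a short, human-readable description of the host."""
    return (
        f"{platform.system()} {platform.release()}, "
        f"Python {platform.python_version()}"
    )


if __name__ == "__main__":
    print(describe_platform())
    if not python_is_supported():
        print("Python 3.8 or above is required")
```

This only verifies the interpreter version; it does not check the OS distribution or any cloud or on-prem prerequisites.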
Download
EnrichSDK is distributed via PyPI. First, make sure the OS dependencies are installed.
$ sudo apt-get update
$ sudo apt-get install python3.8-dev python3.8-venv # or any other version of python3
$ workon scribble # use either this line or the next
$ python3 -m venv venv
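Before installing anything, it can help to confirm that the interpreter is actually running inside a virtual environment. A minimal sketch, assuming an environment created with `python3 -m venv` (other tools may behave differently); the helper name is illustrative:

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv-style environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix points at the base Python installation.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

Note that creating an environment with `python3 -m venv venv` does not activate it; the check above reports False until the environment's own interpreter is used.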
Then, install:
Contents
The SDK is a typical Python package with a growing set of modules. Check with Scribble about support for individual modules:
enrichsdk
├── app
│   ├── bases
│   │   ├── __init__.py
│   │   ├── policyapp
│   │   │   ├── forms.py
│   │   │   ├── __init__.py
│   │   │   ├── models.py
│   │   │   └── views.py
│   │   └── singlepageapp
│   │       ├── forms.py
│   │       ├── __init__.py
│   │       ├── models.py
│   │       └── views.py
│   ├── __init__.py
│   └── utils.py
├── contrib
│   ├── __init__.py
│   ├── lib
│   │   ├── assets
│   │   │   ├── anonymizer.py
│   │   │   ├── changepoints.py
│   │   │   ├── __init__.py
│   │   │   ├── llmtextgen.py
│   │   │   ├── profilespec.py
│   │   │   └── syndata.py
│   │   ├── catalog.py
│   │   ├── __init__.py
│   │   ├── logprocessor.py
│   │   └── transforms
│   │       ├── anomalies
│   │       ├── changepoints
│   │       ├── data_quality
│   │       ├── feature_compute
│   │       ├── filebased_query_executor
│   │       │   ├── __init__.py
│   │       │   └── lib.py
│   │       ├── fileops
│   │       ├── inmemory_query_executor
│   │       ├── metrics
│   │       ├── notebook_executor
│   │       ├── observability
│   │       └── synthetic_data_generator
│   └── transforms
│       ├── fileops
│       ├── jsonsink
│       ├── jsonsource
│       ├── pqexport
│       ├── sqlexport
│       ├── tablesink
│       └── tablesource
├── core
│   ├── frames.py
│   ├── __init__.py
│   ├── mixins.py
│   ├── node.py
│   ├── patch.py
│   ├── render.py
│   ├── res.py
│   ├── run.py
│   ├── sdk.py
│   ├── state.py
│   └── widgets.py
├── datasets
│   ├── discover.py
│   ├── doodle.py
│   ├── generators.py
│   └── __init__.py
├── extractors
├── feature_compute
├── featurestore
│   └── schema.py
├── lib
│   ├── context.py
│   ├── customer.py
│   ├── exceptions.py
│   ├── integration.py
│   ├── misc.py
│   ├── permissions.py
│   ├── renderlib.py
│   └── resources.py
├── quality
│   ├── base.py
│   ├── exceptions.py
│   ├── expectations.py
│   ├── __init__.py
│   ├── reconciliation.py
│   └── transforms.py
├── scripts
│   ├── enrichpkg.py
│   └── __init__.py
├── tasks
│   ├── dummy_task
│   │   └── __init__.py
│   ├── __init__.py
│   └── sdk.py
├── templates
│   ├── airflow
│   │   └── contrib-pipeline-v1.py
│   ├── assets
│   │   └── datasets.py
│   ├── dashboard
│   │   ├── apps.py
│   │   ├── catalog.py
│   │   ├── custom.py
│   │   ├── __init__.py
│   │   ├── persona.py
│   │   ├── tasks.py
│   │   ├── urls.py
│   │   └── views.py
│   ├── metrics
│   │   ├── __init__.py
│   │   └── profilespec.py
│   ├── prefect
│   │   └── default.py
│   └── spark
│       ├── dependencies
│       │   ├── enrich.py
│       │   ├── __init__.py
│       │   ├── logging.py
│       │   └── spark.py
│       └── jobs
│           └── run_spark.py
└── utils
    ├── excel.py
    ├── __init__.py
    ├── redis.py
    └── sample.py
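Since the set of modules is growing, the installed copy may differ from the listing above. The sketch below lists the top-level submodules of an installed package using only the standard library. It assumes the import name is `enrichsdk`, as the tree suggests; `list_submodules` is an illustrative helper, not an SDK function.

```python
import importlib.util
import pkgutil


def list_submodules(package_name):
    """Return sorted top-level submodule names of an installed package.

    Returns an empty list when the package is not installed or when the
    name refers to a plain module rather than a package.
    """
    try:
        # For a top-level name this locates the package without importing it.
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return []
    if spec is None or spec.submodule_search_locations is None:
        return []
    return sorted(
        info.name
        for info in pkgutil.iter_modules(spec.submodule_search_locations)
    )


if __name__ == "__main__":
    # Assumption: the SDK is importable as "enrichsdk".
    for name in list_submodules("enrichsdk"):
        print(name)
```

Only directories that contain an `__init__.py` and individual `.py` files are reported, so the output may be shorter than the full directory tree.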
Check Installation
After creating a virtual environment and installing the SDK, a command called enrichpkg becomes available. It supports several actions:
$ enrichpkg
Usage: enrichpkg [OPTIONS] COMMAND [ARGS]...
init/test/install Enrich modules and access server
Getting started:
version: Version of this sdk
start: First time instructions
env: Setup/check the setup
Development:
init: Bootstrap modules including transforms*
test: Test transforms, manage datasets
doodle: Access Doodle metadata server
manage: Manage services such as mongo
Server:
api: Access the server API
Utils:
sample: Sample data for sharing
Helpers:
show-log: Pretty print log output
*Command used to be called bootstrap
Options:
--help Show this message and exit.
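The same check can be scripted, for example in a setup script. The sketch below looks for `enrichpkg` on the PATH and, if found, runs the `version` command listed in the help output above. The helper names are illustrative, and the exact output of `version` is not assumed.

```python
import shutil
import subprocess


def command_available(name):
    """Return the full path of an executable on PATH, or None if missing."""
    return shutil.which(name)


def run_subcommand(executable, *args, timeout=30):
    """Run an executable with arguments; return (exit code, combined output)."""
    result = subprocess.run(
        [executable, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    path = command_available("enrichpkg")
    if path is None:
        print("enrichpkg not found on PATH; is the virtual environment active?")
    else:
        code, output = run_subcommand(path, "version")
        print(output.strip())
```

If the command is not found, the most likely cause is that the virtual environment in which the SDK was installed is not active in the current shell.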