Security & Upgrade
Scribble is committed to the security of client data. This is achieved through a combination of industry best practices followed by Scribble and the Enrich platform, and is further governed by the security policies of each client. Scribble operates transparently and in collaboration with clients to achieve their security and privacy goals.
Deployment→
A few observations about the service:
- Private deployment. The Enrich data preparation platform operates behind the customer's firewall on machines that are not accessible from the public internet.
- Restricted access. It is not a general-purpose system accessible to end-customers or to all employees of the client organization. The client determines who has access to data, how, and why. Typically, access is restricted to ML engineers.
- The service has a number of security-enhancing features: (a) All accesses use SSL/HTTPS. (b) All accesses require login, except API accesses via an API key. (c) All credentials are obfuscated or encrypted as needed on the server. (d) Most user activity is logged. (e) Downloads of data are encrypted by default, and downloads can be disabled. (f) Upgrades are allowed only by users with 'staff' permissions; typically only one person at the client has staff permission. (g) Security-enhancing data prep modules such as libraries and transforms. (h) The metadata and admin interfaces have elements to handle sensitive data.
- Enrich has a compliance 'app' to help with compliance tasks outside Enrich itself.
Scribble is often tasked with generating the datasets as a service. The server is also upgraded regularly with new features, given the fast-moving space. As a result, Scribble usually has access to the server running Enrich. As a security-conscious company, we take a number of precautions to cover security and compliance aspects:
- Only one staff member has access to the SSH keys - usually the CEO. Keys and other configuration files are encrypted using PGP and stored in GitHub. They are unlocked only when machine access is required.
- Staff laptops use encrypted filesystems.
- Any access follows a protocol developed in conjunction with the client, including the use of jump servers and access over VPN.
- Data is often required for local testing of the data prep pipelines. A cap is placed on the data transferred (say 0.01% of volume) and/or the transfer is coordinated with the client. In most cases, we request anonymized data for the data prep; most data prep doesn't require PII. A minimal sketch of how such a capped, anonymized sample might be drawn is shown after this list.
- The client, if required, can perform a full code review of the Enrich server at any point in time.
- All client-specific data preparation modules are completely separated from the Enrich codebase and stored only in the client's GitHub repository. None of the Enrich modules are client-specific.
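As an illustration of the sampling and anonymization step, the sketch below draws a capped random sample and hashes PII-like columns before data leaves the client environment. The file path, cap fraction, and column names are hypothetical; the actual procedure is agreed with each client.

# Illustrative only: cap the transferred volume and hash PII columns.
# File path, cap fraction, and column names are hypothetical.
import hashlib
import pandas as pd

CAP_FRACTION = 0.0001  # e.g., 0.01% of the volume
PII_COLUMNS = ["customer_id", "sender_name", "receiver_name"]

df = pd.read_csv("transactions.csv")

# Draw a small, reproducible sample
sample = df.sample(frac=CAP_FRACTION, random_state=42)

# One-way hash so original PII values cannot be recovered
def anonymize(value):
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:16]

for col in PII_COLUMNS:
    if col in sample.columns:
        sample[col] = sample[col].map(anonymize)

sample.to_csv("transactions_sample_anonymized.csv", index=False)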
Depending on the client, the details of compliance vary. A very detailed compliance checklist can be found on the resources page of the server.
Sudo Access→
Service | Explanation | Specifics |
---|---|---|
OS Packages | OS/Library dependencies that need to be updated from time to time | Run apt-get, apt-update, apt-search, dpkg |
Nginx | Reverse proxy to the dashboard, metadata and other services | Start/stop/reload nginx; write access to /etc/nginx |
Supervisor | Various services (~ 10) including LLM microservices, dashboard | Start/stop/reload supervisord; write permissions to /var/run/supervisor.sock; write permissions to /etc/supervisor |
Redis | Store state, background tasks | start/stop redis |
Cron | Schedule workflows using crontab | Write to /var/spool/cron/crontabs/<username> |
$ sudo -l
(ALL) NOPASSWD: /usr/bin/systemctl start nginx
(ALL) NOPASSWD: /usr/bin/systemctl stop nginx
(ALL) NOPASSWD: /usr/bin/systemctl restart nginx
(ALL) NOPASSWD: /usr/bin/systemctl start supervisor
(ALL) NOPASSWD: /usr/bin/systemctl stop supervisor
(ALL) NOPASSWD: /usr/bin/systemctl restart supervisor
(ALL) NOPASSWD: /usr/bin/systemctl start redis
(ALL) NOPASSWD: /usr/bin/systemctl stop redis
(ALL) NOPASSWD: /usr/bin/systemctl restart redis
(ALL) NOPASSWD: /usr/sbin/service supervisor *
(ALL) NOPASSWD: /usr/sbin/service nginx *
(ALL) NOPASSWD: /usr/sbin/service redis *
(ALL) NOPASSWD: /usr/bin/apt-get *
(ALL) NOPASSWD: /usr/bin/dpkg *
(ALL) NOPASSWD: /usr/bin/vi /var/spool/cron/crontabs/<username>
(ALL) NOPASSWD: /usr/bin/crontab -u <username> -e
(ALL) NOPASSWD: /usr/bin/vi /etc/nginx/*
(ALL) NOPASSWD: /usr/bin/vi /etc/supervisor/*
(ALL) NOPASSWD: /usr/bin/cat /var/log/syslog*
(ALL) NOPASSWD: /usr/bin/cat /var/log/auth.log
(ALL) NOPASSWD: /usr/bin/cat /var/log/nginx/*
(ALL) NOPASSWD: /usr/bin/cat /var/log/supervisor/*
(ALL) NOPASSWD: /usr/bin/add-apt-repository *
(ALL) NOPASSWD: /usr/bin/pip3 *
# For agent registration. Admin can run it once and/or give permission
# to scribble
(ALL) NOPASSWD: /home/scrib/myagent/svc.sh
# supervisorctl need read/write access to work
$ sudo chmod o+rw /var/run/supervisor.sock
$ ls -l /var/run/supervisor.sock
srwxrw-rw- 1 root root 0 Aug 7 17:14 /var/run/supervisor.sock
# Make global Python libraries world-readable
$ sudo chmod -R o+r /usr/lib/python3.9
Credentials→
The nature of the credentials required:
Name | Explanation | Details |
---|---|---|
Storage | Ingest data/Backup | Read/write credentials to SharePoint/Azure Blob/S3 based on use case |
DB | Ingest data/Write output | Read/write database credentials based on use case |
API | Ingest data/Write output | Read/write API credentials based on use case |
SMTP | Notifications | Server, username/password, port, SSL |
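For example, SMTP credentials are used only to send notifications. The sketch below shows how such credentials might be supplied via environment variables and exercised with Python's standard smtplib; the variable names and addresses are illustrative, not part of Enrich.

# Illustrative only: send a test notification using SMTP credentials
# supplied through environment variables (names are hypothetical).
import os
import smtplib
from email.message import EmailMessage

host = os.environ["NOTIFY_SMTP_HOST"]
port = int(os.environ.get("NOTIFY_SMTP_PORT", "587"))
user = os.environ["NOTIFY_SMTP_USER"]
password = os.environ["NOTIFY_SMTP_PASSWORD"]

msg = EmailMessage()
msg["From"] = user
msg["To"] = "ml-team@example.com"
msg["Subject"] = "Enrich pipeline notification (test)"
msg.set_content("This is a test notification.")

with smtplib.SMTP(host, port) as server:
    server.starttls()  # SSL/TLS as noted in the table
    server.login(user, password)
    server.send_message(msg)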
Network Access→
Service | Explanation | Specifics |
---|---|---|
VPN | Access to privately deployed server | OpenVPN/other |
GUI | Scribble’s GUI served on the HTTPS port | Port 443, Inbound |
SSH | Maintenance/upgrades | Port 22, Inbound, VPN only |
Pypi | Python package index service | pypi.org, could be through a proxy |
Github | Some custom Python packages not in PyPI; self-serve upgrades | github.com, could be through a proxy |
Ubuntu | OS package distribution service | *.ubuntu.com |
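When outbound access goes through a proxy, it is worth verifying connectivity from the server before installs or upgrades. A minimal sketch using only the Python standard library; the proxy URL is a placeholder.

# Illustrative only: check outbound access to the package services above,
# optionally through an HTTP(S) proxy. The proxy URL is a placeholder.
import urllib.request

ENDPOINTS = ["https://pypi.org", "https://github.com", "http://archive.ubuntu.com"]
PROXY = None  # e.g., "http://proxy.internal:3128" if the client requires one

handlers = []
if PROXY:
    handlers.append(urllib.request.ProxyHandler({"http": PROXY, "https": PROXY}))
opener = urllib.request.build_opener(*handlers)

for url in ENDPOINTS:
    try:
        with opener.open(url, timeout=10) as resp:
            print(f"{url}: HTTP {resp.status}")
    except Exception as exc:
        print(f"{url}: FAILED ({exc})")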
Log Distribution→
Nature | Location | Notes |
---|---|---|
System Log | enrich/logs/app/app.json.log | JSON format, rotated regularly, most detailed |
Pipeline & Tasks | enrich/data/<owner>/<organization>/outputs/<name>/<run-id>/log.json | Pipeline and task-specific log for each run in JSON format |
Taskblade | enrich/llm-agents/logs/<name>/app.json.log | Microservice request-level log in JSON format |
Workflows | enrich/logs/workflows/*.log | One for each workflow (in cron) |
Doodle | enrich/logs/doodle/doodle.log | Metadata server log |
Dashboard | enrich/logs/gunicorn/gunicorn_supervisor.log | Minimal log. Useful mainly for exceptions |
Thirdparty | enrich/logs/netdata, enrich/logs/jupyterlab | Optional third-party services, when deployed |
System→
This is the most important log, capturing the health of the system.
Column | Details |
---|---|
logid | Unique ID for this log entry |
message | Main text |
levelname | Log level (DEBUG/ERROR etc) |
name | Name of the log ('app' is default) |
asctime | Timestamp |
funcName | Method from where the log has been generated |
lineno | Line of call |
pathname | Filename of the source module |
module | Python module name (derived from the pathname) |
created | Timestamp in seconds since epoch (1970-01-01) |
data | Extra application context to help with debugging |
exc_info | Stacktrace if there is an exception |
{"message": "Task result", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-30 00:13:45,400", "funcName": "txnsearch_result", "lineno": 407, "pathname": "/home/scribble/enrich/customers/acme/Compliance/dashboard/compapp/views.py", "module": "views", "created": 1690656225.4009845, "data": "\ne3be563f-57fa-4240-924b-27b43f2e0e0b -> PENDING [{}]", "transform": "", "logid": "1690654664.21"}
{"message": "Task result", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-30 00:13:48,102", "funcName": "txnsearch_result", "lineno": 407, "pathname": "/home/scribble/enrich/customers/acme/Compliance/dashboard/compapp/views.py", "module": "views", "created": 1690656228.1029518, "data": "\nf5c8794f-c67e-4cf9-8501-13199b9ffc43 -> SUCCESS [{'start_date': '2023-06-01', 'end_date': '2023-07-01', 'referrer': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=ALPHA1234', 'source': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=ALPHA1234', 'kyc_txn_ids': ['ALPHA1234'], 'name': 'txnsearch-Customer KYCs-Search-ALPHA1234-2023-07-30'}]", "transform": "", "logid": "1690654664.22"}
{"message": "Task result", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-30 00:14:01,821", "funcName": "txnsearch_result", "lineno": 407, "pathname": "/home/scribble/enrich/customers/acme/Compliance/dashboard/compapp/views.py", "module": "views", "created": 1690656241.8213236, "data": "\ne3be563f-57fa-4240-924b-27b43f2e0e0b -> SUCCESS [{'start_date': '2023-06-01', 'end_date': '2023-07-01', 'referrer': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=BETA1234', 'source': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=BETA1234', 'kyc_txn_ids': ['BETA1234'], 'name': 'txnsearch-Customer KYCs-Search-BETA1234-2023-07-30'}]", "transform": "", "logid": "1690654664.27"}
{"message": "Task result", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-30 00:14:03,359", "funcName": "txnsearch_result", "lineno": 407, "pathname": "/home/scribble/enrich/customers/acme/Compliance/dashboard/compapp/views.py", "module": "views", "created": 1690656243.3595865, "data": "\n7e284ef1-6080-441b-87cd-bdfa97c6ac9c -> PENDING [{}]", "transform": "", "logid": "1690654664.28"}
{"message": "Task result", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-30 00:14:06,662", "funcName": "txnsearch_result", "lineno": 407, "pathname": "/home/scribble/enrich/customers/acme/Compliance/dashboard/compapp/views.py", "module": "views", "created": 1690656246.6628153, "data": "\nbd23220e-350b-49f4-97f2-5334d0f1a1d5 -> SUCCESS [{'start_date': '2023-06-01', 'end_date': '2023-07-01', 'referrer': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=GAMMA1234', 'source': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+KYCs&table=Search&query=GAMMA1234', 'kyc_txn_ids': ['GAMMA1234'], 'name': 'txnsearch-Customer KYCs-Search-GAMMA1234-2023-07-30'}]", "transform": "", "logid": "1690654664.29"}
Pipelines & Tasks→
The pipelines also use the 'app' logger configuration. In addition to the columns mentioned above, there are columns that help in understanding the performance and correctness of the computation.
Column | Details |
---|---|
customer/usecase | Usecase Group to which this pipeline belongs |
transform | Transformation module that is the source of this log entry |
conf | Pipeline filename |
runid | Run to which this log belongs |
application | Name of this pipeline/task |
{"asctime": "2023-03-09T12:04:48+0530", "name": "app", "levelname": "DEBUG", "message": "FileOperations - process", "transform": "FileOperations", "runid": "daily-20230309-120433", "conf": "test.py", "usecase": "Contrib", "customer": "Contrib", "application": "TestPy", "ts": "2023-03-09T12:04:48+05:30", "data": "", "logid": 125}
{"asctime": "2023-03-09T12:04:48+0530", "name": "app", "levelname": "DEBUG", "message": "FileOperations - Completed", "transform": "FileOperations", "data": "\nCopy: scribble/Contrib/output/TestPy/daily-20230309-120433/cars1.csv => scribble/Contrib/shared/cars/2022-11-02/cars.csv\nCopy: scribble/Contrib/output/TestPy/daily-20230309-120433/hello.json => scribble/Contrib/shared/cars/2022-11-02/hello.json\nCopy: scribble/Contrib/output/TestPy/daily-20230309-120433/searchmeta.json => scribble/Contrib/shared/cars/2022-11-02/searchmeta.json\nCopy: scribble/Contrib/output/TestPy/daily-20230309-120433/viz/cars1.pickle => scribble/Contrib/shared/cars/2022-11-02/cars.pickle\nCopy: scribble/Contrib/output/TestPy/daily-20230309-120433/viz/cars1.pickle => scribble/Contrib/shared/cars/cars.pickle\n", "runid": "daily-20230309-120433", "conf": "test.py", "usecase": "Contrib", "customer": "Contrib", "application": "TestPy", "ts": "2023-03-09T12:04:48+05:30", "logid": 126}
{"asctime": "2023-03-09T12:04:48+0530", "name": "app", "levelname": "DEBUG", "message": "Process completed", "transform": "FileOperations", "runid": "daily-20230309-120433", "conf": "test.py", "usecase": "Contrib", "customer": "Contrib", "application": "TestPy", "ts": "2023-03-09T12:04:48+05:30", "data": "", "logid": 127}
{"asctime": "2023-03-09T12:04:48+0530", "name": "app", "levelname": "DEBUG", "message": "Validated results", "transform": "FileOperations", "runid": "daily-20230309-120433", "conf": "test.py", "usecase": "Contrib", "customer": "Contrib", "application": "TestPy", "ts": "2023-03-09T12:04:48+05:30", "data": "", "logid": 128}
TaskBlade Audit→
The taskblades also use the 'app' logger configuration. In addition to the columns mentioned above, there are columns that allow us to trace a request through the entire system.
Column | Details |
---|---|
source | Source of the log entry; could be a pipeline or a (micro)service |
request_id | UUID for each customer request |
dataset | Dataset being queried |
username | User who triggered the query |
{"message": "[datagpt] Returning existing instance", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-28 16:54:53,016", "funcName": "get_agent_details", "lineno": 351, "pathname": "/home/pingali/Code/scribble-llmsdk/llmsdk/services/lib.py", "module": "lib", "created": 1690543493.016718, "source": "service", "user": "venkata", "dataset": "acme-retail"}
{"message": "Query Status: pending", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-28 16:54:53,351", "funcName": "qna_status", "lineno": 301, "pathname": "/home/pingali/Code/scribble-llmsdk/llmsdk/services/datagpt.py", "module": "datagpt", "created": 1690543493.3518183, "source": "service", "request_id": "7386fcfe-273e-4a1d-80e8-8b1848114362", "dataset": "acme-retail", "user": "venkata", "data": "{\n \"query\": \"how many rows are there?\",\n \"status\": \"pending\",\n \"user\": \"venkata\",\n \"dataset\": \"acme-retail\",\n \"params\": {\n \"user\": \"venkata\",\n \"dataset\": \"acme-retail\",\n \"context\": \"\",\n \"namespace\": \"datagpt\",\n \"query\": \"how many rows are there?\",\n \"policy\": {\n \"schema\": \"v1\",\n \"policies\": [],\n \"runtime\": {\n \"clear_agent_memory\": false\n }\n },\n \"mode\": \"economy\"\n }\n}"}
{"message": "Updated Result: success", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-28 16:54:58,399", "funcName": "query_update_result", "lineno": 724, "pathname": "/home/pingali/Code/scribble-llmsdk/llmsdk/services/lib.py", "module": "lib", "created": 1690543498.3995056, "source": "service", "request_id": "7386fcfe-273e-4a1d-80e8-8b1848114362", "user": "venkata", "dataset": "acme-retail", "data": "{\n \"status\": \"success\",\n \"result\": {\n \"intermediate_steps\": [\n [\n \"AgentAction(tool='sql_db_list_tables', tool_input='', log='Action: sql_db_list_tables\\\\nAction Input: ')\",\n \"sales\"\n ],\n [\n \"AgentAction(tool='sql_db_schema', tool_input='sales', log='The only table in the database is \\\"sales\\\". I should query the schema of the \\\"sales\\\" table to see the structure of the data.\\\\nAction: sql_db_schema\\\\nAction Input: sales')\",\n \"\\nCREATE TABLE sales (\\n\\t\\\"index\\\" INTEGER, \\n\\tinvoice_no TEXT, \\n\\tstock_code TEXT, \\n\\tdescription TEXT, \\n\\tquantity INTEGER, \\n\\tunit_price REAL, \\n\\tcustomer_id INTEGER, \\n\\tcountry TEXT, \\n\\tsales REAL, \\n\\tinvoice_day INTEGER, \\n\\tinvoice_month INTEGER, \\n\\tinvoice_year INTEGER\\n)\\n\\n/*\\n3 rows from sales table:\\nindex\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\\n0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\\n1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\\n2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\\n*/\"\n ]\n ],\n \"cascade\": {\n \"id\": \"economy\",\n \"platform\": \"openai\",\n \"model\": \"gpt-3.5-turbo\"\n },\n \"success\": true,\n \"tries\": [\n {\n \"seq\": 0,\n \"cascade_id\": \"economy\",\n \"success\": true\n }\n ],\n \"query\": \"how many rows are there?\",\n \"answer\": 3,\n \"type\": \"json\",\n \"raw_thoughts\": [\n \"\",\n \"\",\n \"> Entering new chain...\",\n \"Action: sql_db_list_tables\",\n \"Action Input: \",\n \"Observation: sales\",\n \"Thought:The only table in the database is \\\"sales\\\". I should query the schema of the \\\"sales\\\" table to see the structure of the data.\",\n \"Action: sql_db_schema\",\n \"Action Input: sales\",\n \"Observation: \",\n \"CREATE TABLE sales (\",\n \"\\t\\\"index\\\" INTEGER, \",\n \"\\tinvoice_no TEXT, \",\n \"\\tstock_code TEXT, \",\n \"\\tdescription TEXT, \",\n \"\\tquantity INTEGER, \",\n \"\\tunit_price REAL, \",\n \"\\tcustomer_id INTEGER, \",\n \"\\tcountry TEXT, \",\n \"\\tsales REAL, \",\n \"\\tinvoice_day INTEGER, \",\n \"\\tinvoice_month INTEGER, \",\n \"\\tinvoice_year INTEGER\",\n \")\",\n \"\",\n \"/*\",\n \"3 rows from sales table:\",\n \"index\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\",\n \"0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\",\n \"1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\",\n \"2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\",\n \"*/\",\n \"Thought:There are 3 rows in the \\\"sales\\\" table. 
\",\n \"Final Answer: 3\",\n \"\",\n \"> Finished chain.\",\n \"\"\n ],\n \"chain_of_thought\": [\n {\n \"thought\": \"BEGIN\",\n \"tool\": \"sql_db_list_tables\",\n \"tool_input\": \"\",\n \"observation\": \"sales\"\n },\n {\n \"thought\": \"The only table in the database is \\\"sales\\\". I should query the schema of the \\\"sales\\\" table to see the structure of the data.\",\n \"tool\": \"sql_db_schema\",\n \"tool_input\": \"sales\",\n \"observation\": \"\\nCREATE TABLE sales (\\n\\t\\\"index\\\" INTEGER, \\n\\tinvoice_no TEXT, \\n\\tstock_code TEXT, \\n\\tdescription TEXT, \\n\\tquantity INTEGER, \\n\\tunit_price REAL, \\n\\tcustomer_id INTEGER, \\n\\tcountry TEXT, \\n\\tsales REAL, \\n\\tinvoice_day INTEGER, \\n\\tinvoice_month INTEGER, \\n\\tinvoice_year INTEGER\\n)\\n\\n/*\\n3 rows from sales table:\\nindex\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\\n0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\\n1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\\n2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\\n*/\"\n }\n ],\n \"code\": {\n \"dialect\": \"sql\",\n \"snippets\": []\n },\n \"metadata\": {\n \"name\": \"Acme Retail\",\n \"description\": \"E-Commerce transaction data\",\n \"url\": \"https://www.kaggle.com/datasets/carrie1/ecommerce-data\",\n \"files\": [\n {\n \"path\": \"sales.sqlite\",\n \"url\": \"sales.sqlite\",\n \"tables\": [\n {\n \"name\": \"sales\",\n \"desc\": \" Transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail\",\n \"context\": [\n \"each row in the table has sales for one product per invoice\",\n \"there are multiple rows for each invoice\",\n \"date column used month/day/year format\"\n ],\n \"cols\": [\n {\n \"invoice_no\": \"Number of the transaction\"\n },\n {\n \"stock_code\": \"SKU of the product\"\n },\n {\n \"description\": \"Description of the product\"\n },\n {\n \"quantity\": \"Number of units of product\"\n },\n {\n \"invoice_day\": \"Day of the transaction\"\n },\n {\n \"invoice_month\": \"Month of the transaction\"\n },\n {\n \"invoice_year\": \"Year of the transaction\"\n },\n {\n \"unit_price\": \"Price of one unit of product\"\n },\n {\n \"customer_id\": \"Customer who bought the product\"\n }\n ]\n }\n ]\n }\n ]\n }\n }\n}"}
{"message": "Query Status: success", "levelname": "DEBUG", "name": "app", "asctime": "2023-07-28 16:55:08,690", "funcName": "qna_status", "lineno": 301, "pathname": "/home/pingali/Code/scribble-llmsdk/llmsdk/services/datagpt.py", "module": "datagpt", "created": 1690543508.6908808, "source": "service", "request_id": "7386fcfe-273e-4a1d-80e8-8b1848114362", "dataset": "acme-retail", "user": "venkata", "data": "{\n \"query\": \"how many rows are there?\",\n \"status\": \"success\",\n \"user\": \"venkata\",\n \"dataset\": \"acme-retail\",\n \"params\": {\n \"user\": \"venkata\",\n \"dataset\": \"acme-retail\",\n \"context\": \"\",\n \"namespace\": \"datagpt\",\n \"query\": \"how many rows are there?\",\n \"policy\": {\n \"schema\": \"v1\",\n \"policies\": [],\n \"runtime\": {\n \"clear_agent_memory\": false\n }\n },\n \"mode\": \"economy\"\n },\n \"result\": {\n \"intermediate_steps\": [\n [\n \"AgentAction(tool='sql_db_list_tables', tool_input='', log='Action: sql_db_list_tables\\\\nAction Input: ')\",\n \"sales\"\n ],\n [\n \"AgentAction(tool='sql_db_schema', tool_input='sales', log='The only table in the database is \\\"sales\\\". I should query the schema of the \\\"sales\\\" table to see the structure of the data.\\\\nAction: sql_db_schema\\\\nAction Input: sales')\",\n \"\\nCREATE TABLE sales (\\n\\t\\\"index\\\" INTEGER, \\n\\tinvoice_no TEXT, \\n\\tstock_code TEXT, \\n\\tdescription TEXT, \\n\\tquantity INTEGER, \\n\\tunit_price REAL, \\n\\tcustomer_id INTEGER, \\n\\tcountry TEXT, \\n\\tsales REAL, \\n\\tinvoice_day INTEGER, \\n\\tinvoice_month INTEGER, \\n\\tinvoice_year INTEGER\\n)\\n\\n/*\\n3 rows from sales table:\\nindex\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\\n0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\\n1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\\n2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\\n*/\"\n ]\n ],\n \"cascade\": {\n \"id\": \"economy\",\n \"platform\": \"openai\",\n \"model\": \"gpt-3.5-turbo\"\n },\n \"success\": true,\n \"tries\": [\n {\n \"seq\": 0,\n \"cascade_id\": \"economy\",\n \"success\": true\n }\n ],\n \"query\": \"how many rows are there?\",\n \"answer\": 3,\n \"type\": \"json\",\n \"raw_thoughts\": [\n \"\",\n \"\",\n \"> Entering new chain...\",\n \"Action: sql_db_list_tables\",\n \"Action Input: \",\n \"Observation: sales\",\n \"Thought:The only table in the database is \\\"sales\\\". 
I should query the schema of the \\\"sales\\\" table to see the structure of the data.\",\n \"Action: sql_db_schema\",\n \"Action Input: sales\",\n \"Observation: \",\n \"CREATE TABLE sales (\",\n \"\\t\\\"index\\\" INTEGER, \",\n \"\\tinvoice_no TEXT, \",\n \"\\tstock_code TEXT, \",\n \"\\tdescription TEXT, \",\n \"\\tquantity INTEGER, \",\n \"\\tunit_price REAL, \",\n \"\\tcustomer_id INTEGER, \",\n \"\\tcountry TEXT, \",\n \"\\tsales REAL, \",\n \"\\tinvoice_day INTEGER, \",\n \"\\tinvoice_month INTEGER, \",\n \"\\tinvoice_year INTEGER\",\n \")\",\n \"\",\n \"/*\",\n \"3 rows from sales table:\",\n \"index\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\",\n \"0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\",\n \"1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\",\n \"2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\",\n \"*/\",\n \"Thought:There are 3 rows in the \\\"sales\\\" table. \",\n \"Final Answer: 3\",\n \"\",\n \"> Finished chain.\",\n \"\"\n ],\n \"chain_of_thought\": [\n {\n \"thought\": \"BEGIN\",\n \"tool\": \"sql_db_list_tables\",\n \"tool_input\": \"\",\n \"observation\": \"sales\"\n },\n {\n \"thought\": \"The only table in the database is \\\"sales\\\". I should query the schema of the \\\"sales\\\" table to see the structure of the data.\",\n \"tool\": \"sql_db_schema\",\n \"tool_input\": \"sales\",\n \"observation\": \"\\nCREATE TABLE sales (\\n\\t\\\"index\\\" INTEGER, \\n\\tinvoice_no TEXT, \\n\\tstock_code TEXT, \\n\\tdescription TEXT, \\n\\tquantity INTEGER, \\n\\tunit_price REAL, \\n\\tcustomer_id INTEGER, \\n\\tcountry TEXT, \\n\\tsales REAL, \\n\\tinvoice_day INTEGER, \\n\\tinvoice_month INTEGER, \\n\\tinvoice_year INTEGER\\n)\\n\\n/*\\n3 rows from sales table:\\nindex\\tinvoice_no\\tstock_code\\tdescription\\tquantity\\tunit_price\\tcustomer_id\\tcountry\\tsales\\tinvoice_day\\tinvoice_month\\tinvoice_year\\n0\\t536365\\t85123A\\tWHITE HANGING HEART T-LIGHT HOLDER\\t6\\t2.55\\t17850\\tUnited Kingdom\\t15.299999999999999\\t1\\t12\\t2010\\n1\\t536365\\t71053\\tWHITE METAL LANTERN\\t6\\t3.39\\t17850\\tUnited Kingdom\\t20.34\\t1\\t12\\t2010\\n2\\t536365\\t84406B\\tCREAM CUPID HEARTS COAT HANGER\\t8\\t2.75\\t17850\\tUnited Kingdom\\t22.0\\t1\\t12\\t2010\\n*/\"\n }\n ],\n \"code\": {\n \"dialect\": \"sql\",\n \"snippets\": []\n },\n \"metadata\": {\n \"name\": \"Acme Retail\",\n \"description\": \"E-Commerce transaction data\",\n \"url\": \"https://www.kaggle.com/datasets/carrie1/ecommerce-data\",\n \"files\": [\n {\n \"path\": \"sales.sqlite\",\n \"url\": \"sales.sqlite\",\n \"tables\": [\n {\n \"name\": \"sales\",\n \"desc\": \" Transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail\",\n \"context\": [\n \"each row in the table has sales for one product per invoice\",\n \"there are multiple rows for each invoice\",\n \"date column used month/day/year format\"\n ],\n \"cols\": [\n {\n \"invoice_no\": \"Number of the transaction\"\n },\n {\n \"stock_code\": \"SKU of the product\"\n },\n {\n \"description\": \"Description of the product\"\n },\n {\n \"quantity\": \"Number of units of product\"\n },\n {\n \"invoice_day\": \"Day of the transaction\"\n },\n {\n \"invoice_month\": \"Month of the transaction\"\n },\n {\n 
\"invoice_year\": \"Year of the transaction\"\n },\n {\n \"unit_price\": \"Price of one unit of product\"\n },\n {\n \"customer_id\": \"Customer who bought the product\"\n }\n ]\n }\n ]\n }\n ]\n }\n }\n}"}
Workflow→
The scheduler (cron) triggers a Prefect workflow. The Prefect workflow triggers batch jobs using a DAG embedded in the workflow. The log name is typically <workflow_name><timestamp>.log.
The typical format is
[Timestamp] Level - Method | Log-text
This is the default format from Prefect (a slightly older version).
[2022-12-08 08:12:53+0530] DEBUG - prefect.run_log_dumper | Error: b''
[2022-12-08 08:12:53+0530] DEBUG - prefect.run_log_dumper | Completed: Log resouce dumper
[2022-12-08 08:12:53+0530] DEBUG - prefect.TaskRunner | Task 'run_log_dumper': Handling state change from Running to Success
[2022-12-08 08:12:53+0530] INFO - prefect.TaskRunner | Task 'run_log_dumper': Finished task run for task with final state: 'Success'
[2022-12-08 08:12:53+0530] INFO - prefect.TaskRunner | Task 'run_routeopt': Starting task run...
[2022-12-08 08:12:53+0530] DEBUG - prefect.TaskRunner | Task 'run_routeopt': Handling state change from Pending to Running
[2022-12-08 08:12:53+0530] DEBUG - prefect.TaskRunner | Task 'run_routeopt': Calling task.run() method...
[2022-12-08 08:12:53+0530] DEBUG - prefect.run_routeopt | Starting: route optimizer
[2022-12-08 08:13:19+0530] DEBUG - prefect.run_routeopt | Output: b'Successfully read secure siteconf\nOverriding credentials from sitecred\nSuccessfully read secure siteconf\n....warnings /app/scribble/enrich/.virtualenvs/scribble3.8/lib/python3.8/site-packages/statsmodels/tsa/base/tsa_model.py:7: FutureWarning: pandas.Float64Index is deprecated and will be removed from pandas in a future version. Use pandas.Index with the appropriate dtype instead.\n from pandas import (to_datetime, Int64Index, DatetimeIndex, Period,\n\nEXIT CODE: 0\n'
[2022-12-08 08:13:19+0530] DEBUG - prefect.run_routeopt | Error: b''
[2022-12-08 08:13:19+0530] DEBUG - prefect.run_routeopt | Completed: route optimizer
[2022-12-08 08:13:19+0530] DEBUG - prefect.TaskRunner | Task 'run_routeopt': Handling state change from Running to Success
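The bracketed format above can be parsed with a single regular expression when workflow logs need to be scanned or aggregated. A minimal sketch; the log filename is a placeholder.

# Illustrative only: parse Prefect workflow log lines of the form
#   [Timestamp] LEVEL - Method | Log-text
import re

LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>\w+)\s+-\s+(?P<method>[^|]+?)\s*\|\s?(?P<msg>.*)$"
)

LOGFILE = "enrich/logs/workflows/daily-20221208.log"  # placeholder filename

with open(LOGFILE) as fh:
    for line in fh:
        m = LINE_RE.match(line)
        if m and m.group("level") in ("WARNING", "ERROR"):
            print(m.group("ts"), m.group("method"), m.group("msg"))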
Background Tasks→
We use Celery to run background tasks. The log has the format:
[Timestamp: Level/ProcessName] TaskName [TaskID]:Module:FunctionName: Message
This is used for heavier tasks, such as long-running queries triggered from the GUI. Most of that work is usually done by the pipelines, but sometimes this capability is needed.
[2023-07-06 11:21:29,357: INFO/MainProcess] Task search_timeline[9080c216-f7ac-4672-8678-fd3d8f598bcf] received
[2023-07-06 11:21:29,357: DEBUG/MainProcess] TaskPool: Apply <function fast_trace_task at 0x7f55e1ada820> (args:('search_timeline', '9080c216-f7ac-4672-8678-fd3d8f598bcf', {'lang': 'py', 'task': 'search_timeline', 'id': '9080c216-f7ac-4672-8678-fd3d8f598bcf', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '9080c216-f7ac-4672-8678-fd3d8f598bcf', 'parent_id': None, 'argsrepr': "({'txnids': ['TPWU000058115004'], 'nontxnids': [], 'start_date': '2023-06-30', 'end_date': '2023-07-03', 'limit': 10000, 'goal': 'Select', 'importance': 'Select'},)", 'kwargsrepr': '{}', 'origin': 'gen18892@aip.acmeinc.com', 'ignore_result': False, 'sentry-trace': '81feea2325324df384a04b81339c208c-bf79307d4d2d2c39-0', 'baggage': 'sentry-trace_id=81feea2325324df384a04b81339c208c,sentry-environment=production,sentry-public_key=e085c5e165b6441c92f29beacd0be47e,sentry-transaction=/dashboard/usecases/Operations/applications/timeline/%5B/%5D,sentry-sample_rate=0.0', 'headers': {'sentry-trace': '81feea2325324df384a04b81339c208c-bf79307d4d2d2c39-0', 'baggage':... kwargs:{})
[2023-07-06 11:29:32,723: INFO/ForkPoolWorker-4] Task search_timeline[9080c216-f7ac-4672-8678-fd3d8f598bcf] succeeded in 483.36513194441795s: {'log': ['[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_in_classifier/v1/2023/06/30/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_out_classifier/v1/2023/06/30/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/terra_core_classifier/v1/2023/06/30/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_in_classifier/v1/2023/07/01/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_out_classifier/v1/2023/07/01/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/terra_core_classifier/v1/2023/07/01/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_in_classifier/v1/2023/07/02/transactions_TP_4.csv.gz', '[OK] enrich-acme/preprocessed/aip.acmeinc.com/logs/eig_out_classifier/v1/2023/07/02/transactions_TP_4.csv.gz', '[OK]...', '[OK]...', '[OK]...', '[OK]...'], 'files': 12, 'processed': 21917429, 'matched': , ...}
[2023-07-06 12:07:36,774: INFO/MainProcess] Task search_transactions[320bbda6-e758-49fc-8a26-e8dc29f88f75] received
[2023-07-06 12:07:36,774: DEBUG/MainProcess] TaskPool: Apply <function fast_trace_task at 0x7f55e1ada820> (args:('search_transactions', '320bbda6-e758-49fc-8a26-e8dc29f88f75', {'lang': 'py', 'task': 'search_transactions', 'id': '320bbda6-e758-49fc-8a26-e8dc29f88f75', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '320bbda6-e758-49fc-8a26-e8dc29f88f75', 'parent_id': None, 'argsrepr': "({'start_date': '2023-04-06', 'end_date': '2023-07-06', 'referrer': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+ACCs&table=Search&query=GAMMA1234', 'source': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+ACCs&table=Search&query=GAMMA1234', 'acc_txn_ids': ['GAMMA1234'], 'name': 'txnsearch-Customer ACCs-Search-GAMMA1234-2023-07-06'},)", 'kwargsrepr': '{}', 'origin': 'gen18892@aip.acmeinc.com', 'ignore_result': False, 'sentry-trace': 'ad1f51cac0c84c9ab2a00f5e0bced5b6-82c797b46c157901-0', 'baggage':... kwargs:{})
[2023-07-06 12:07:36,776: INFO/ForkPoolWorker-1] search_transactions[320bbda6-e758-49fc-8a26-e8dc29f88f75]: [0] Chunk running SQL: SELECT *
FROM transactions
where ((DATE(modified_on) >= '2023-04-06') AND
(DATE(modified_on) <= '2023-07-06') AND
((sender_id_no IN ('GAMMA1234')) OR
(receiver_id_no IN ('GAMMA1234'))))
[2023-07-06 12:07:36,826: INFO/ForkPoolWorker-1] search_transactions[320bbda6-e758-49fc-8a26-e8dc29f88f75]: [0] CHunk Received 3 records
[2023-07-06 12:07:36,834: INFO/ForkPoolWorker-1] Task search_transactions[320bbda6-e758-49fc-8a26-e8dc29f88f75] succeeded in 0.05865555256605148s: {'records': [['hub_transaction_id', 'qrn', 'mod_id', 'response_code', 'response_message', 'status', 'transaction_date_time_local', 'transaction_date_time_global', 'execution_time', 'source_country_code', 'destination_country_code', 'source_currency_code', 'destination_currency_code', 'quote_time', 'source_sink_type', 'destination_sink_type', 'sender_service_charge', 'receiver_service_charge', 'hub_sender_srvc_chrg', 'hub_receiver_srvc_chrg', 'total_service_charge', 'sender_tax_amount', 'hub_tax_amount', 'reciever_tax_amount', 'exchange_rate', 'total_amount_source', 'total_amount_destination', 'qoute_expire_on', 'transaction_time', 'source_partner_id', 'dest_partner_id', 'quote_response_code', 'quote_response_message', 'sending_party_mob', 'sender_acc_id_type', 'receiving_party_mob', 'receiver_name', 'sender_acc_id_no', 'receiver_message', 'custom_field1', 'custom_field2', 'custom_field3', 'receiver_acc_id_type', 'receiver_acc_id_no', 'credit_transaction_id', 'scnumber', 'quote_status', 'sender_name', 'src_txn_...', ...]]}
[2023-07-06 12:12:32,617: INFO/MainProcess] Task search_transactions[4cc44896-3698-4699-913b-06267909ec9f] received
[2023-07-06 12:12:32,618: DEBUG/MainProcess] TaskPool: Apply <function fast_trace_task at 0x7f55e1ada820> (args:('search_transactions', '4cc44896-3698-4699-913b-06267909ec9f', {'lang': 'py', 'task': 'search_transactions', 'id': '4cc44896-3698-4699-913b-06267909ec9f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '4cc44896-3698-4699-913b-06267909ec9f', 'parent_id': None, 'argsrepr': "({'start_date': '2023-04-06', 'end_date': '2023-07-06', 'referrer': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+ACCs&table=Search&query=GAMMA1234', 'source': 'https://aip.acmeinc.com/dashboard/usecases/Compliance/applications/persona/details?persona=Customer+ACCs&table=Search&query=GAMMA1234', 'acc_txn_ids': ['GAMMA1234'], 'name': 'txnsearch-Customer ACCs-Search-GAMMA1234-2023-07-06'},)", 'kwargsrepr': '{}', 'origin': 'gen18892@aip.acmeinc.com', 'ignore_result': False, 'sentry-trace': '8d34fb1669954bd7ae2aece3f3e718ee-82f89c3a73728c8b-0', 'baggage':... kwargs:{})
[2023-07-06 12:12:32,620: INFO/ForkPoolWorker-7] search_transactions[4cc44896-3698-4699-913b-06267909ec9f]: [0] Chunk running SQL: SELECT *
FROM iox_hub_transaction
where ((DATE(modified_on) >= '2023-04-06') AND
(DATE(modified_on) <= '2023-07-06') AND
Doodle→
The metadata server is called Doodle. It is a typical Django server with a REST API. It has multiple functions, including tracking the catalog, pipeline performance, and programmatic uses of computed data.
The log has the format
[Timestamp] [ProcessID] [Level] Method URL
The UUID in the URL is usually a metadata entity id (e.g., a table)
[2023-04-06 02:25:01 +0000] [31917] [DEBUG] POST /metadata/api/v1/features/1cae096e-1b0d-4cac-bc5c-dd2e811908ad
[2023-04-06 02:25:01 +0000] [31917] [DEBUG] POST /metadata/api/v1/features/6d7f21ec-86ef-4349-b261-3b05f244cd02
[2023-04-06 02:25:01 +0000] [31917] [DEBUG] POST /metadata/api/v1/sources/dcae5e91-3208-42d8-9778-d95ad9a5c02c
[2023-04-06 02:25:01 +0000] [31917] [DEBUG] GET /metadata/api/v1/features
[2023-04-06 02:25:01 +0000] [31917] [DEBUG] POST /metadata/api/v1/features/f4cb8c99-67a9-441e-93a5-deb1b2521c06
Dashboard Server→
Most of the application activity is captured in the system log. The dashboard (gunicorn) log captures what the system log misses, such as errors at load time, unhandled exceptions, and basic accesses.
[2023-07-20 10:27:01 +0530] [2501] [DEBUG] GET /accounts/login/
[2023-07-20 10:27:20 +0530] [2501] [DEBUG] GET /accounts/login/
[2023-07-20 10:27:31 +0530] [2501] [DEBUG] GET /accounts/login/
Other Services→
These services are optional. When deployed, they use their standard, out-of-the-box log formats, which are service-specific.
[I 2021-04-09 11:01:43.064 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-04-09 11:01:43.065 ServerApp] jupyterlab_templates | extension was successfully linked.
[W 2021-04-09 11:01:43.068 NotebookApp] Collisions detected in /home/scribble/.jupyter/jupyter_notebook_config.py and /home/scribble/.jupyter/jupyter_notebook_config.json config files. /home/scribble/.jupyter/jupyter_notebook_config.json has higher priority: {
"NotebookApp": {
"nbserver_extensions": "{'jupyterlab_git': True, 'jupyterlab_templates': True} ignored, using {'jupyterlab_templates.extension': True}"
}
}
[W 2021-04-09 11:01:43.069 NotebookApp] 'allow_remote_access' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'base_url' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'password_required' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'port_retries' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.069 NotebookApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2021-04-09 11:01:43.074 ServerApp] notebook_dir is deprecated, use root_dir
[W 2021-04-09 11:01:43.074 ServerApp] No such directory: ''/home/ubuntu/enrich/opt/notebooks''
[I 2021-04-09 11:01:43.081 ServerApp] Writing notebook server cookie secret to /home/scribble/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2021-04-09 11:01:43.096 LabApp] JupyterLab extension loaded from /app/scribble/enrich/.virtualenvs/jupyterenv/lib/python3.6/site-packages/jupyterlab
OS Upgrade (20.04 to 22.04)→
Prerequisites→
- Open Port 1022
- Backup of the root disk to restore if needed (CommVault/Azure/Other)
Prepare System→
https://jumpcloud.com/blog/how-to-upgrade-ubuntu-20-04-to-ubuntu-22-04
# Update the current deployment
sudo apt update && sudo apt upgrade -y
# If upgrade from 20.04 to 22.04 is required
sudo apt install update-manager-core
sudo do-release-upgrade
# For python versions
sudo add-apt-repository ppa:deadsnakes/ppa -y
sudo apt update
# Show which python version has been configured
sudo update-alternatives --config python3
sudo update-alternatives --config python
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.10 1
python3 --version # Should be present
python3.10 --version # Should be present
# Upgrade virtualenvwrapper for the new Python version
sudo pip3.10 install --upgrade virtualenvwrapper
# Check whether you are able to source it correctly
source /usr/local/bin/virtualenvwrapper.sh
virtualenvwrapper.user_scripts creating /home/pingali/.virtualenvs/premkproject
virtualenvwrapper.user_scripts creating /home/pingali/.virtualenvs/postmkproject
# Check updated sudo access
sudo -l
Errors→
# If update-manager-core or do-release-upgrade throws apt_pkg error
# during upgrade, check the python3 default version
sudo update-alternatives --remove python3 /usr/bin/python3
sudo update-alternatives --remove python /usr/bin/python
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.8 1
sudo apt-get install python3-apt --reinstall
# Install python3.10 (current default). Includes venv, pip
sudo apt install python3.10-full
# Make python3.10 the default
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
# Check for pip under python3.10 and install if missing; make sure there is only one pip
sudo pip3.10 --version # may or may not be present
curl -sS https://bootstrap.pypa.io/get-pip.py | sudo python3.10
sudo pip --version
sudo pip3.10 --version