Assets
enrichsdk.contrib.lib.assets
Reusable library of modules (e.g., base classes and algorithms) to be incorporated into transforms
anonymizer
AnonymizerMixin
Bases: object
Embed the core anonymization functions in transforms
anonymize_init(anonargs)
Initialize the anonymizer
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
BaseAnonymizer(textgen_cred, *args, **kwargs)
Bases: object
default
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
anon_categorical(df, col_name, column)
Method to anonymize categorical data. Various anonymization methods can be defined here. Input is the full dataframe; output is the relevant column after anonymization.
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
anon_email(df, col_name, column)
Method to anonymize email data. Can generate emails that match, or deliberately do not match, the data in a name field. Also preserves the original email domain distribution if required. Input is the full dataframe; output is the relevant column after anonymization.
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
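To illustrate what preserving the domain distribution can mean, here is a minimal sketch. It is not the enrichsdk implementation: the function `anonymize_emails`, its `seed` parameter, and the `userNNNN` naming scheme are all hypothetical, and it works on a plain list of addresses rather than a dataframe column. It assumes every input contains an `@`.

```python
import random
from collections import Counter


def anonymize_emails(emails, seed=None):
    """Replace local parts with synthetic names while keeping domain counts.

    Hypothetical helper for illustration only; not part of enrichsdk.
    """
    rng = random.Random(seed)
    # Keep only the domain of each address (everything after the last '@').
    domains = [email.rsplit("@", 1)[1].lower() for email in emails]
    # Shuffling breaks the link between a row and its original domain,
    # but leaves the number of addresses per domain unchanged.
    rng.shuffle(domains)
    return [f"user{i:04d}@{domain}" for i, domain in enumerate(domains)]


original = ["alice@example.com", "bob@example.com", "carol@example.org"]
anonymized = anonymize_emails(original, seed=7)

# Domain counts are identical before and after anonymization.
original_domains = Counter(e.rsplit("@", 1)[1] for e in original)
anonymized_domains = Counter(e.rsplit("@", 1)[1] for e in anonymized)
```

Shuffling keeps the domain counts exactly equal. Sampling domains with weights proportional to their frequency would preserve the distribution only in expectation.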
anon_numeric(df, col_name, column)
Method to fuzz numeric data. Various fuzzing methods can be defined here. Input is the full dataframe; output is the relevant column after fuzzing.
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
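The description leaves the choice of fuzzing method open. As one possibility, the sketch below perturbs each value by a bounded random percentage. The function `fuzz_numeric` and its `pct` and `seed` parameters are hypothetical and not part of enrichsdk, and the code works on a plain list rather than a dataframe column.

```python
import random


def fuzz_numeric(values, pct=0.05, seed=None):
    """Scale each value by a random factor in [1 - pct, 1 + pct].

    Hypothetical helper for illustration only; not part of enrichsdk.
    """
    rng = random.Random(seed)
    return [v * (1 + rng.uniform(-pct, pct)) for v in values]


salaries = [52000.0, 61000.0, 0.0, 75500.0]
fuzzed = fuzz_numeric(salaries, pct=0.05, seed=42)
```

Multiplicative noise keeps zeros at zero and keeps every value within `pct` of its original magnitude. Additive noise would be a reasonable alternative when values cluster near zero.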
anonymize_dataset(df, spec={})
Anonymize a dataset given a spec. The spec defines how the dataset should be handled and what kinds of anonymization need to be performed. If no spec is given, one is inferred from the dataset.
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
anonymize_single_column(col_name, col_obj, df, params={})
Takes a dataset and anonymizes the specified column
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
CachingAnonymizer(cachepath, *args, **kwargs)
Bases: BaseAnonymizer
Cache results of the classification
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
anonymize_dataset(df, spec={})
Filter out columns that need not be computed
Source code in enrichsdk/contrib/lib/assets/anonymizer.py
changepoints
BaseChangePointDetectorModel(df, *args, **kwargs)
Bases: object
Base class for change point detection
default
Source code in enrichsdk/contrib/lib/assets/changepoints.py
detect_changepoints(method='hybrid')
Detect change points using the specified method
Source code in enrichsdk/contrib/lib/assets/changepoints.py
detect_changepoints_hybrid(strategy=None, model=None, penalty=None, jump=None, min_size=None)
Run the change point detector using the hybrid method with one of the following strategies:
strict: all models must agree on the changepoints
maxvote: a majority of the models evaluated must agree on the changepoints (default strategy)
anyone: all changepoints detected by any model are included
select: only the specified model is run
Source code in enrichsdk/contrib/lib/assets/changepoints.py
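The four strategies can be read as different vote thresholds over the per-model results. The following is a minimal sketch of that idea, not the enrichsdk implementation: `combine_changepoints` and the shape of its `per_model` argument are hypothetical, and the model names in the example are placeholders. It also assumes models report identical indices for the same change, whereas real detectors often differ by a few positions and would need a tolerance.

```python
from collections import Counter


def combine_changepoints(per_model, strategy="maxvote", model=None):
    """Combine change points found by several models.

    per_model maps a model name to the list of indices it detected.
    Hypothetical helper for illustration only; not part of enrichsdk.
    """
    if strategy == "select":
        if model is None:
            raise ValueError("strategy 'select' requires a model name")
        return sorted(set(per_model[model]))

    # Count how many models reported each index (one vote per model).
    votes = Counter()
    for points in per_model.values():
        votes.update(set(points))

    n_models = len(per_model)
    if strategy == "strict":
        needed = n_models            # every model must agree
    elif strategy == "maxvote":
        needed = n_models // 2 + 1   # strict majority
    elif strategy == "anyone":
        needed = 1                   # a single model is enough
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return sorted(p for p, count in votes.items() if count >= needed)


detections = {
    "l1": [10, 25, 40],
    "l2": [10, 25],
    "rbf": [10, 40, 55],
}
strict = combine_changepoints(detections, "strict")
maxvote = combine_changepoints(detections, "maxvote")
anyone = combine_changepoints(detections, "anyone")
selected = combine_changepoints(detections, "select", model="rbf")
```

With three models, `strict` keeps only index 10, `maxvote` keeps every index reported by at least two models, and `anyone` returns the union.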
detect_changepoints_pelt(model=None, penalty=None, jump=None, min_size=None)
Run the change point detector using the Pelt method
Source code in enrichsdk/contrib/lib/assets/changepoints.py
get_changepoints()
datascorer
BaseDataScorer()
Bases: object
Class to process data and produce usable output. This class is generic enough to handle arbitrary pandas dataframes.
init the class
Source code in enrichsdk/contrib/lib/assets/datascorer.py
process(df, spec)
Take a form5500 features dataframe and a spec and apply the spec to the dataframe
Source code in enrichsdk/contrib/lib/assets/datascorer.py
llmtextgen
LLMTextGenerator(cred, *args, **kwargs)
Bases: object
default
Source code in enrichsdk/contrib/lib/assets/llmtextgen.py
call_completion_api(prompt, model)
Make a call to the OpenAI API to get the text completion
Source code in enrichsdk/contrib/lib/assets/llmtextgen.py
call_embedding_api(prompt, model)
Make a call to the OpenAI API to get the embedding
Source code in enrichsdk/contrib/lib/assets/llmtextgen.py
generate_code(**kwargs)
Generate a code completion given a prompt
generate_common(task, **kwargs)
Generate a completion given a prompt
Source code in enrichsdk/contrib/lib/assets/llmtextgen.py
generate_embedding(**kwargs)
Generate an embedding vector given some text
generate_text(**kwargs)
Generate a text completion given a prompt
get_api_key(cred)
Get the API key from the cred
get_model()
set_model(task, model)
Set the model to use for text completion
Source code in enrichsdk/contrib/lib/assets/llmtextgen.py
profilespec
get_profile_from_api(clsobj, spec_category)
Read the profile JSON from the API
Source code in enrichsdk/contrib/lib/assets/profilespec.py
get_profile_from_file(clsobj)
Read the profile JSON from profilespec
Source code in enrichsdk/contrib/lib/assets/profilespec.py
timeseries_forecasting
This file contains classes for time series forecasting.
Classes: BaseProphetForecasterModel - Class that uses the Prophet library to forecast
BaseProphetForecasterModel(df, *args, **kwargs)
Bases: object
Base class for time series forecasting
defaults
Source code in enrichsdk/contrib/lib/assets/timeseries_forecasting.py
visualize_forecasting(forecast, chart_params)
visualize the changepoints