Utilities
Collection of utility functions.
quote_identifier
quote_identifier(
identifier: str, quote_character: str = '"'
) -> str
Quotes the given identifier by surrounding it with the specified quote character.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
identifier | str | The identifier to be quoted. | required |
quote_character | str | The character to use for quoting. Defaults to '"'. | '"' |
Returns:
Type | Description |
---|---|
str | The quoted identifier. |
Example:
from blueno.utils import quote_identifier
quote_identifier("my_object")
'"my_object"'
quote_identifier("my_object", "'")
"'my_object'"
remove_none
remove_none(obj: Union[Dict, List]) -> Union[Dict, List]
Recursively remove None values from dictionaries and lists.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | Union[Dict, List] | The data structure to clean. | required |
Returns:
Type | Description |
---|---|
Union[Dict, List] | A new data structure with None values removed. |
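No example is given for this function, so here is a minimal sketch of the recursive behavior described above. It is not blueno's implementation: `remove_none_sketch` and the sample `config` are hypothetical names used only for illustration, and the sketch assumes that None entries are dropped from lists as well as from dictionaries.

```python
from typing import Any


def remove_none_sketch(obj: Any) -> Any:
    """Return a copy of obj with None values dropped at every nesting level."""
    if isinstance(obj, dict):
        # Drop keys whose value is None, then clean the remaining values.
        return {k: remove_none_sketch(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        # Drop None items, then clean the remaining items.
        return [remove_none_sketch(v) for v in obj if v is not None]
    # Scalars and any other types are returned unchanged.
    return obj


config = {
    "name": "orders",
    "owner": None,
    "tags": [None, "daily"],
    "options": {"retries": None, "timeout": 30},
}
print(remove_none_sketch(config))
# {'name': 'orders', 'tags': ['daily'], 'options': {'timeout': 30}}
```

Because the sketch builds new containers rather than deleting in place, the input structure is left untouched.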
separator_indices
separator_indices(string: str, separator: str) -> list[int]
Find indices of a separator character in a string, ignoring separators inside quotes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
string | str | The input string to search through. | required |
separator | str | The separator character to find. | required |
Returns:
Type | Description |
---|---|
list[int] | A list of indices where the separator character appears outside of quotes. |
Example:
from blueno.utils import separator_indices
separator_indices('a,b,"c,d",e', ",")
[1, 3, 9]
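To make the quote handling concrete, here is a minimal sketch of a quote-aware scan. It is an illustration, not blueno's code: `separator_indices_sketch` is a hypothetical name, and the sketch assumes that both single and double quotes open a quoted region that ends at the next matching quote character.

```python
def separator_indices_sketch(string: str, separator: str) -> list[int]:
    """Return positions of separator that are not inside a quoted region."""
    indices = []
    open_quote = None  # the quote character we are currently inside, if any
    for index, char in enumerate(string):
        if open_quote is not None:
            # Inside quotes: only the matching quote character ends the region.
            if char == open_quote:
                open_quote = None
        elif char in ('"', "'"):
            open_quote = char
        elif char == separator:
            indices.append(index)
    return indices


text = 'a,b,"c,d",e'
cuts = separator_indices_sketch(text, ",")
print(cuts)  # [1, 3, 9]

# The indices can then be used to split the string without breaking quoted parts.
bounds = [-1, *cuts, len(text)]
parts = [text[start + 1:end] for start, end in zip(bounds, bounds[1:])]
print(parts)  # ['a', 'b', '"c,d"', 'e']
```

The comma at index 6 sits between the two double quotes, so it is skipped.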
shorten_dict_values
shorten_dict_values(
obj: Union[List, Dict], max_length: int = 20
) -> Union[List, Dict]
Recursively shorten string values in dictionaries and lists. Useful for printing out data structures in a readable format.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | Union[List, Dict] | The data structure to shorten. | required |
max_length | int | The maximum length allowed for a string value; longer strings are shortened. | 20 |
Returns:
Type | Description |
---|---|
Union[List, Dict] | A new data structure with string values shortened. |
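The reference does not say how a long string is shortened, so the following is only a sketch under a stated assumption: strings longer than `max_length` are cut to their first `max_length` characters and marked with a trailing `...`. The name `shorten_values_sketch` and the sample `payload` are hypothetical, and blueno's actual output format may differ.

```python
from typing import Any


def shorten_values_sketch(obj: Any, max_length: int = 20) -> Any:
    """Return a copy of obj in which long string values are truncated."""
    if isinstance(obj, dict):
        return {k: shorten_values_sketch(v, max_length) for k, v in obj.items()}
    if isinstance(obj, list):
        return [shorten_values_sketch(v, max_length) for v in obj]
    if isinstance(obj, str) and len(obj) > max_length:
        # Assumption: keep the first max_length characters and mark the cut.
        return obj[:max_length] + "..."
    # Short strings and non-string values are returned unchanged.
    return obj


payload = {
    "id": 7,
    "query": "SELECT * FROM a_very_long_table_name",
    "tags": ["short", "x" * 30],
}
print(shorten_values_sketch(payload, max_length=10))
# {'id': 7, 'query': 'SELECT * F...', 'tags': ['short', 'xxxxxxxxxx...']}
```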
get_max_column_value
get_max_column_value(
table_or_uri: str | DeltaTable, column_name: str
) -> Any
Retrieves the maximum value of the specified column from a Delta table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table_or_uri | str \| DeltaTable | A string URI to a Delta table or a DeltaTable instance. | required |
column_name | str | The name of the column. | required |
Returns:
Type | Description |
---|---|
Any | The maximum value of the column, or 0 if the table does not exist. |
Example:
```python
from blueno.utils import get_max_column_value

max_value = get_max_column_value("path/to/delta_table", "incremental_id")
```
get_min_column_value
get_min_column_value(
table_or_uri: str | DeltaTable, column_name: str
) -> Any
Retrieves the minimum value of the specified column from a Delta table.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table_or_uri | str \| DeltaTable | A string URI to a Delta table or a DeltaTable instance. | required |
column_name | str | The name of the column. | required |
Returns:
Type | Description |
---|---|
Any | The minimum value of the column, or 0 if the table does not exist. |
Example:
```python
from blueno.utils import get_min_column_value

min_value = get_min_column_value("path/to/delta_table", "incremental_id")
```
get_or_create_delta_table
get_or_create_delta_table(
table_uri: str, schema: Schema | Schema
) -> DeltaTable
Retrieves a Delta table or creates a new one if it does not exist.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
table_uri | str | The URI of the Delta table. | required |
schema | Schema \| Schema | The Polars or PyArrow schema to create the Delta table with. | required |
Returns:
Type | Description |
---|---|
DeltaTable | The Delta table. |
build_merge_predicate
build_merge_predicate(
columns: list[str],
source_alias: str = "source",
target_alias: str = "target",
) -> str
Constructs a SQL merge predicate based on the provided column names.
This function generates a string that represents the condition for merging records based on equality of the specified columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
columns | list[str] | A list of column names to be used in the merge predicate. | required |
source_alias | str | An alias for the source. | 'source' |
target_alias | str | An alias for the target. | 'target' |
Returns:
Type | Description |
---|---|
str | A SQL string representing the merge predicate. |
Example:
from blueno.utils import build_merge_predicate
predicate = build_merge_predicate(['id', 'name'])
print(predicate)
"""
(target."id" = source."id") AND (target."name" = source."name")
"""
build_when_matched_update_columns
build_when_matched_update_columns(
columns: list[str],
source_alias: str = "source",
target_alias: str = "target",
) -> dict[str, str]
Constructs a mapping of columns to be updated when a match is found.
This function generates a dictionary where the keys are the target column names and the values are the corresponding source column names.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
columns | list[str] | A list of column names to be used in the update mapping. | required |
source_alias | str | An alias for the source. | 'source' |
target_alias | str | An alias for the target. | 'target' |
Returns:
Type | Description |
---|---|
dict[str, str] | A dictionary mapping target columns to source columns. |
Example:
from blueno.utils import build_when_matched_update_columns
update_columns = build_when_matched_update_columns(["id", "name"])
print(update_columns)
{'target."id"': 'source."id"', 'target."name"': 'source."name"'}
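The mapping shown above can be reproduced with a single dictionary comprehension. The sketch below is illustrative only: `update_columns_sketch` is a hypothetical name, not the library's implementation, and it assumes every identifier is wrapped in double quotes.

```python
def update_columns_sketch(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> dict[str, str]:
    """Map each qualified target column to the matching qualified source column."""
    return {
        f'{target_alias}."{column}"': f'{source_alias}."{column}"'
        for column in columns
    }


mapping = update_columns_sketch(["id", "name"])
for target_column, source_column in mapping.items():
    # Each pair reads as: set target column to the value of the source column.
    print(f"{target_column} <- {source_column}")
```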
build_when_matched_update_predicate
build_when_matched_update_predicate(
columns: list[str],
source_alias: str = "source",
target_alias: str = "target",
) -> str
Constructs a SQL predicate for when matched update conditions.
This function generates a string that represents the conditions for updating records when a match is found based on the specified columns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
columns | list[str] | A list of column names to be used in the update predicate. | required |
source_alias | str | An alias for the source. | 'source' |
target_alias | str | An alias for the target. | 'target' |
Returns:
Type | Description |
---|---|
str | A SQL string representing the when matched update predicate. |
Example:
from blueno.utils import build_when_matched_update_predicate
update_predicate = build_when_matched_update_predicate(['id', 'status'])
print(update_predicate)
"""
(
(target."id" != source."id")
OR (target."id" IS NULL AND source."id" IS NOT NULL)
OR (target."id" IS NOT NULL AND source."id" IS NULL)
) OR ...
"""
character_translation
character_translation(
text: str, translation_map: Dict[str, str]
) -> str
Translate characters in a string using a translation map.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The string to translate. | required |
translation_map | Dict[str, str] | A dictionary mapping characters to their replacements. | required |
Returns:
Type | Description |
---|---|
str | The translated string. |
Example:
from blueno.utils import character_translation
character_translation("Profit&Loss", {"&": "_and_"})
"Profit_and_Loss"
to_snake_case
to_snake_case(text: str) -> str
Convert a string to snake case.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The string to convert to snake case. Can be converted from PascalCase, camelCase, kebab-case, or mixed case. Non-alphanumeric characters are converted to underscores. | required |
Returns:
Type | Description |
---|---|
str | The string in snake case. |
Example:
from blueno.utils import to_snake_case
to_snake_case("CustomerID")
"customer_id"