Utilities

Collection of utility functions.

quote_identifier

quote_identifier(
    identifier: str, quote_character: str = '"'
) -> str

Quotes the given identifier by surrounding it with the specified quote character.

Parameters:

Name Type Description Default
identifier str

The identifier to be quoted.

required
quote_character str

The character to use for quoting. Defaults to '"'.

'"'

Returns:

Type Description
str

The quoted identifier.

Example:

from blueno.utils import quote_identifier

quote_identifier("my_object")
'"my_object"'

quote_identifier("my_object", "'")
"'my_object'"

remove_none

remove_none(obj: Union[Dict, List]) -> Union[Dict, List]

Recursively remove None values from dictionaries and lists.

Parameters:

Name Type Description Default
obj Union[Dict, List]

The data structure to clean.

required

Returns:

Type Description
Union[Dict, List]

A new data structure with None values removed.
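
No example is given for this function, so the following is a minimal sketch of the described behavior rather than blueno's actual implementation. The helper name `remove_none_sketch` and the sample `config` are hypothetical. The source does not say whether containers left empty by the cleaning are dropped; this sketch keeps them.

```python
from typing import Any


def remove_none_sketch(obj: Any) -> Any:
    """Illustrative stand-in (not blueno's code): drop None values recursively."""
    if isinstance(obj, dict):
        # Skip keys whose value is None, and clean the remaining values.
        return {k: remove_none_sketch(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        # Skip None items, and clean the remaining items.
        return [remove_none_sketch(v) for v in obj if v is not None]
    return obj


config = {
    "name": "orders",
    "owner": None,
    "tags": ["daily", None],
    "opts": {"ttl": None, "retries": 3},
}
cleaned = remove_none_sketch(config)
print(cleaned)
# {'name': 'orders', 'tags': ['daily'], 'opts': {'retries': 3}}
```

Because the sketch builds new dictionaries and lists, the input is left unchanged, which matches the "new data structure" wording above.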

separator_indices

separator_indices(string: str, separator: str) -> list[int]

Find indices of a separator character in a string, ignoring separators inside quotes.

Parameters:

Name Type Description Default
string str

The input string to search through

required
separator str

The separator character to find

required

Returns:

Type Description
list[int]

A list of indices where the separator character appears outside of quotes

Example:

from blueno.utils import separator_indices

separator_indices('a,b,"c,d",e', ",")
[1, 3, 9]
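
To make the quote handling concrete, here is a minimal sketch of one way such a scan could work. It is not blueno's implementation: the name `separator_indices_sketch` is hypothetical, and it assumes that only double quotes delimit quoted regions and that quotes are not escaped.

```python
def separator_indices_sketch(string: str, separator: str) -> list[int]:
    """Illustrative stand-in: positions of `separator` outside double quotes."""
    indices = []
    in_quotes = False
    for i, ch in enumerate(string):
        if ch == '"':
            # Toggle the quoted state on every double quote.
            in_quotes = not in_quotes
        elif ch == separator and not in_quotes:
            indices.append(i)
    return indices


text = 'a,b,"c,d",e'
cuts = separator_indices_sketch(text, ",")
print(cuts)  # [1, 3, 9]

# The indices can then be used to split the string into fields.
bounds = [-1, *cuts, len(text)]
fields = [text[start + 1:end] for start, end in zip(bounds, bounds[1:])]
print(fields)  # ['a', 'b', '"c,d"', 'e']
```

The comma at index 6 is skipped because it sits between the two double quotes at indices 4 and 8.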

shorten_dict_values

shorten_dict_values(
    obj: Union[List, Dict], max_length: int = 20
) -> Union[List, Dict]

Recursively shorten string values in dictionaries and lists. Useful for printing out data structures in a readable format.

Parameters:

Name Type Description Default
obj Union[List, Dict]

The data structure to shorten.

required
max_length int

The maximum length allowed for string values; longer strings are shortened.

20

Returns:

Type Description
Union[List, Dict]

A new data structure with string values shortened.
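
The source does not show how shortened values look, so the sketch below is only an illustration under stated assumptions: strings longer than `max_length` are cut to `max_length` characters and marked with a trailing `...`. The name `shorten_sketch` and the sample `payload` are hypothetical, and blueno's actual output format may differ.

```python
from typing import Any


def shorten_sketch(obj: Any, max_length: int = 20) -> Any:
    """Illustrative stand-in: truncate long strings inside nested dicts/lists."""
    if isinstance(obj, dict):
        return {k: shorten_sketch(v, max_length) for k, v in obj.items()}
    if isinstance(obj, list):
        return [shorten_sketch(v, max_length) for v in obj]
    if isinstance(obj, str) and len(obj) > max_length:
        # Assumed marker: the real function may truncate differently.
        return obj[:max_length] + "..."
    return obj


payload = {
    "id": 7,
    "query": "SELECT * FROM a_very_long_table_name",
    "tags": ["x" * 30, "ok"],
}
short = shorten_sketch(payload, max_length=10)
print(short)
# {'id': 7, 'query': 'SELECT * F...', 'tags': ['xxxxxxxxxx...', 'ok']}
```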

get_max_column_value

get_max_column_value(
    table_or_uri: str | DeltaTable, column_name: str
) -> Any

Retrieves the maximum value of the specified column from a Delta table.

Parameters:

Name Type Description Default
table_or_uri str | DeltaTable

A string URI to a Delta table or a DeltaTable instance.

required
column_name str

The name of the column.

required

Returns:

Type Description
Any

The maximum value of the column, or 0 if the table does not exist.

Example:

from blueno.utils import get_max_column_value

max_value = get_max_column_value("path/to/delta_table", "incremental_id")

get_min_column_value

get_min_column_value(
    table_or_uri: str | DeltaTable, column_name: str
) -> Any

Retrieves the minimum value of the specified column from a Delta table.

Parameters:

Name Type Description Default
table_or_uri str | DeltaTable

A string URI to a Delta table or a DeltaTable instance.

required
column_name str

The name of the column.

required

Returns:

Type Description
Any

The minimum value of the column, or 0 if the table does not exist.

Example:

from blueno.utils import get_min_column_value

min_value = get_min_column_value("path/to/delta_table", "incremental_id")

get_or_create_delta_table

get_or_create_delta_table(
    table_uri: str, schema: Schema | Schema
) -> DeltaTable

Retrieves a Delta table or creates a new one if it does not exist.

Parameters:

Name Type Description Default
table_uri str

The URI of the Delta table.

required
schema Schema | Schema

The schema to create the Delta table with, given as either a Polars schema or a PyArrow schema.

required

Returns:

Type Description
DeltaTable

The Delta table.

build_merge_predicate

build_merge_predicate(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> str

Constructs a SQL merge predicate based on the provided column names.

This function generates a string that represents the condition for merging records based on equality of the specified columns.

Parameters:

Name Type Description Default
columns list[str]

A list of column names to be used in the merge predicate.

required
source_alias str

An alias for the source

'source'
target_alias str

An alias for the target

'target'

Returns:

Type Description
str

A SQL string representing the merge predicate.
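
As a rough illustration of the string this function describes, the sketch below joins one quoted equality per column with `AND`. It is an assumption-based stand-in, not blueno's code: the name `merge_predicate_sketch` is hypothetical, and the exact spacing of the real output may differ.

```python
def merge_predicate_sketch(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> str:
    """Illustrative stand-in: AND together one equality clause per key column."""
    return " AND ".join(
        f'({target_alias}."{col}" = {source_alias}."{col}")' for col in columns
    )


predicate = merge_predicate_sketch(["id", "name"])
print(predicate)
# (target."id" = source."id") AND (target."name" = source."name")

# Custom aliases change only the prefixes.
aliased = merge_predicate_sketch(["id"], source_alias="s", target_alias="t")
print(aliased)
# (t."id" = s."id")
```

Quoting each column name keeps the predicate valid for names that contain spaces or mixed case.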

Example:

from blueno.utils import build_merge_predicate

predicate = build_merge_predicate(['id', 'name'])
print(predicate)
"""
    (target."id" = source."id") AND (target."name" = source."name")
"""

build_when_matched_update_columns

build_when_matched_update_columns(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> dict[str, str]

Constructs a mapping of columns to be updated when a match is found.

This function generates a dictionary where the keys are the target column names and the values are the corresponding source column names.

Parameters:

Name Type Description Default
columns list[str]

A list of column names to be used in the update mapping.

required
source_alias str

An alias for the source

'source'
target_alias str

An alias for the target

'target'

Returns:

Type Description
dict[str, str]

A dictionary mapping target columns to source columns.

Example:

from blueno.utils import build_when_matched_update_columns

update_columns = build_when_matched_update_columns(["id", "name"])
print(update_columns)

{'target."id"': 'source."id"', 'target."name"': 'source."name"'}
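
A mapping of this shape could be produced by a single dictionary comprehension. The sketch below is illustrative only (the name `update_columns_sketch` is hypothetical and this is not blueno's implementation); it mainly shows how non-default aliases would affect the keys and values.

```python
def update_columns_sketch(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> dict[str, str]:
    """Illustrative stand-in: map each quoted target column to its source column."""
    return {
        f'{target_alias}."{col}"': f'{source_alias}."{col}"' for col in columns
    }


mapping = update_columns_sketch(["id", "name"], source_alias="s", target_alias="t")
print(mapping)
# {'t."id"': 's."id"', 't."name"': 's."name"'}
```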

build_when_matched_update_predicate

build_when_matched_update_predicate(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> str

Constructs a SQL predicate for when matched update conditions.

This function generates a string that represents the conditions for updating records when a match is found based on the specified columns.

Parameters:

Name Type Description Default
columns list[str]

A list of column names to be used in the update predicate.

required
source_alias str

An alias for the source

'source'
target_alias str

An alias for the target

'target'

Returns:

Type Description
str

A SQL string representing the when matched update predicate.
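
A plain inequality is not enough to detect changes, because in SQL a comparison with NULL evaluates to NULL rather than true. The sketch below shows one way to build a NULL-aware "has this column changed" clause per column and join the clauses with `OR`. It is an illustration under assumptions, not blueno's implementation: the name `update_predicate_sketch` is hypothetical, and the real function's whitespace and layout may differ.

```python
def update_predicate_sketch(
    columns: list[str],
    source_alias: str = "source",
    target_alias: str = "target",
) -> str:
    """Illustrative stand-in: true when any listed column differs, NULLs included."""
    clauses = []
    for col in columns:
        t = f'{target_alias}."{col}"'
        s = f'{source_alias}."{col}"'
        clauses.append(
            f"(({t} != {s})"
            f" OR ({t} IS NULL AND {s} IS NOT NULL)"
            f" OR ({t} IS NOT NULL AND {s} IS NULL))"
        )
    return " OR ".join(clauses)


single = update_predicate_sketch(["id"])
print(single)
```

For several columns, the per-column groups are joined with `OR`, so the update fires when at least one column has changed.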

Example:

from blueno.utils import build_when_matched_update_predicate

update_predicate = build_when_matched_update_predicate(['id', 'status'])
print(update_predicate)
"""
    (
        (target."id" != source."id")
        OR (target."id" IS NULL AND source."id" IS NOT NULL)
        OR (target."id" IS NOT NULL AND source."id" IS NULL)
    ) OR ...
"""

character_translation

character_translation(
    text: str, translation_map: Dict[str, str]
) -> str

Translate characters in a string using a translation map.

Parameters:

Name Type Description Default
text str

The string to translate.

required
translation_map Dict[str, str]

A dictionary mapping characters to their replacements.

required

Returns:

Type Description
str

The translated string.
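
For single-character keys, the same effect can be obtained with Python's built-in `str.maketrans` and `str.translate`. The sketch below is an illustrative stand-in, not blueno's implementation; the name `character_translation_sketch` is hypothetical, and it assumes every key in the map is exactly one character long (a requirement of `str.maketrans`).

```python
def character_translation_sketch(text: str, translation_map: dict[str, str]) -> str:
    """Illustrative stand-in: replace each mapped character with its replacement."""
    # str.maketrans accepts single-character keys and replacement strings of any length.
    return text.translate(str.maketrans(translation_map))


print(character_translation_sketch("Profit&Loss", {"&": "_and_"}))
# Profit_and_Loss

print(character_translation_sketch("Smørrebrød", {"ø": "oe", "å": "aa"}))
# Smoerrebroed
```

Characters that do not appear in the map are passed through unchanged.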

Example:

from blueno.utils import character_translation

character_translation("Profit&Loss", {"&": "_and_"})
"Profit_and_Loss"

to_snake_case

to_snake_case(text: str) -> str

Convert a string to snake case.

Parameters:

Name Type Description Default
text str

The string to convert to snake case. Can be converted from PascalCase, camelCase, kebab-case, or mixed case. Non-alphanumeric characters are converted to underscores.

required

Returns:

Type Description
str

The string in snake case.
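
The conversion rules described above can be approximated with a few regular expressions. This is a minimal sketch under assumptions, not blueno's implementation: the name `to_snake_case_sketch` is hypothetical, and edge cases such as digits or consecutive acronyms may be handled differently by the real function.

```python
import re


def to_snake_case_sketch(text: str) -> str:
    """Illustrative stand-in: convert common naming styles to snake_case."""
    # Runs of non-alphanumeric characters become a single underscore.
    text = re.sub(r"[^0-9A-Za-z]+", "_", text)
    # Split an acronym from a following capitalized word: "HTTPResponse" -> "HTTP_Response".
    text = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", text)
    # Split a lowercase letter or digit from a following capital: "customerID" -> "customer_ID".
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", text)
    return text.strip("_").lower()


for name in ["CustomerID", "camelCase", "kebab-case", "HTTPResponse", "Profit & Loss"]:
    print(name, "->", to_snake_case_sketch(name))
# CustomerID -> customer_id
# camelCase -> camel_case
# kebab-case -> kebab_case
# HTTPResponse -> http_response
# Profit & Loss -> profit_loss
```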

Example:

from blueno.utils import to_snake_case

to_snake_case("CustomerID")
"customer_id"