abaco.utils module

abaco.utils.assert_nonempty_keys(dictionary: dict)

Check that the keys in a dictionary are not empty strings.

Parameters:

dictionary (dict) – A dictionary (e.g., config file).

Raises:

AssertionError – If dictionary is not a dict or if any key is empty or blank.
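The check described above can be sketched roughly as follows. This is a minimal stand-in, not the library's implementation: the name assert_nonempty_keys_sketch is hypothetical, and treating whitespace-only strings as "blank" and letting non-string keys pass are assumptions.

```python
def assert_nonempty_keys_sketch(dictionary: dict) -> None:
    """Hypothetical stand-in: reject non-dict input and empty or blank keys."""
    assert isinstance(dictionary, dict), "expected a dict"
    for key in dictionary:
        # Assumption: only string keys can be "empty or blank";
        # keys of other types are accepted unchanged.
        if isinstance(key, str):
            assert key.strip() != "", "found an empty or blank key"


# A well-formed config passes silently.
assert_nonempty_keys_sketch({"host": "localhost", "port": 7474})

# A whitespace-only key is rejected.
try:
    assert_nonempty_keys_sketch({"  ": "value"})
except AssertionError as err:
    print(f"rejected: {err}")
```

The same pattern, applied to the values instead of the keys, would cover the value check.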

abaco.utils.assert_nonempty_vals(dictionary: dict)

Check that the values in a dictionary are not empty strings.

Parameters:

dictionary (dict) – A dictionary (e.g., config file).

Raises:

AssertionError – If dictionary is not a dict or if any value is empty or blank.

abaco.utils.assert_path(filepath: str)

Check that the given filepath is a string and that it exists.

Parameters:

filepath (str) – The filepath or folder path to check.

Raises:

Example

>>> assert_path("..")
>>> assert_path("./tests")

abaco.utils.create_folder(directory_path: str, is_nested: bool = False) → bool

Create a folder if it doesn’t exist.

Parameters:
  • directory_path (str) – The path of the directory to create.

  • is_nested (bool, optional) – Whether to create nested directories (True uses os.makedirs, False uses os.mkdir), by default False.

Returns:

True if the folder was created, False if it already existed.

Return type:

bool

Raises:
  • TypeError – If directory_path is not a string.

  • ValueError – If directory_path is an existing file.

  • OSError – If there is an error creating the directory.
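The documented behaviour can be approximated with the standard library alone. The sketch below is an assumption-laden illustration: create_folder_sketch is a hypothetical name, and the order of the checks is a guess based on the Raises list above.

```python
import os
import tempfile


def create_folder_sketch(directory_path: str, is_nested: bool = False) -> bool:
    """Hypothetical stand-in for the documented create_folder behaviour."""
    if not isinstance(directory_path, str):
        raise TypeError("directory_path must be a string")
    if os.path.isfile(directory_path):
        raise ValueError(f"{directory_path} is an existing file")
    if os.path.isdir(directory_path):
        return False  # nothing to do, the folder already exists
    if is_nested:
        os.makedirs(directory_path)  # creates missing parents as well
    else:
        os.mkdir(directory_path)  # raises OSError if the parent is missing
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "results", "run1")
    print(create_folder_sketch(target, is_nested=True))  # True: created
    print(create_folder_sketch(target, is_nested=True))  # False: already existed
```

With is_nested=False, a path such as "results/run1" fails with an OSError when "results" does not exist yet, which is why the nested mode is the safer choice for multi-level paths.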

abaco.utils.df_joiner(df_dict: dict[DataFrame], on: str, how: str = 'outer') → DataFrame

Join multiple dataframes on a common column.

Parameters:
  • df_dict (dict of pandas.DataFrame) – Dictionary of dataframes to join.

  • on (str) – Column to join on. Required; the signature defines no default value.

  • how (str, optional) – Type of join. Defaults to “outer”.

Returns:

Joined dataframe.

Return type:

pandas.DataFrame

abaco.utils.generate_log_filename(folder: str = 'logs', suffix: str = '') → str

Create a log file name and path.

Parameters:
  • folder (str, optional) – Name of the folder to put the log file in (default is “logs”).

  • suffix (str, optional) – Additional string to add to the log file name (default is “”).

Returns:

The file path to the log file.

Return type:

str

abaco.utils.get_args(prog_name: str, others: dict = None)

Initiate argparse.ArgumentParser() and add common arguments.

Parameters:
  • prog_name (str) – The name of the program.

  • others (dict, optional) – Additional keyword arguments for ArgumentParser, by default None.

Returns:

Parsed command-line arguments.

Return type:

argparse.Namespace

Raises:

TypeError – If prog_name is not a string or others is not a dict.

abaco.utils.get_basename(fname: None | str = None) → str

Get the basename of a given filename, without file extension.

If no filename is given, returns the basename of the current script.

Parameters:

fname (str or None, optional) – The filename to get basename of, or None (default is None).

Returns:

Basename of the given filepath, or of the script in which the function is called.

Return type:

str
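As a rough illustration of the behaviour described above, the following sketch strips the directory and the final extension with os.path. The name get_basename_sketch is hypothetical, and using sys.argv[0] to locate the current script is an assumption; the library may resolve it differently.

```python
import os
import sys


def get_basename_sketch(fname=None) -> str:
    """Hypothetical stand-in: file name without directory or final extension."""
    if fname is None:
        # Assumption: the "current script" is the one named on the command line.
        fname = sys.argv[0]
    return os.path.splitext(os.path.basename(fname))[0]


print(get_basename_sketch("/data/results/counts.tsv"))  # counts
print(get_basename_sketch("archive.tar.gz"))  # archive.tar
```

Note that os.path.splitext removes only the last extension, so multi-part extensions such as ".tar.gz" are only partly stripped in this sketch.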

abaco.utils.get_logger()

Initialize and return a logger with a log file named after the current script.

Returns:

Configured logger object.

Return type:

logging.Logger

abaco.utils.get_time(incl_time: bool = True, incl_timezone: bool = True) → str

Get current date, time (optional), and timezone (optional) for file naming.

Parameters:
  • incl_time (bool, optional) – Whether to include timestamp in the string (default is True).

  • incl_timezone (bool, optional) – Whether to include the timezone in the string (default is True).

Returns:

String including date, timestamp and/or timezone, e.g. ‘yyyyMMdd_hhmm_timezone’.

Return type:

str

Raises:
  • TypeError – If incl_time or incl_timezone is not a bool.

  • AssertionError – If the output format is not as expected.
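A string in the 'yyyyMMdd_hhmm_timezone' shape can be assembled with datetime, as in the sketch below. get_time_sketch is a hypothetical name, the use of the local timezone abbreviation from strftime("%Z") is an assumption, and the output-format assertion listed under Raises is left out.

```python
from datetime import datetime


def get_time_sketch(incl_time: bool = True, incl_timezone: bool = True) -> str:
    """Hypothetical stand-in: date, optional time, optional timezone."""
    if not isinstance(incl_time, bool) or not isinstance(incl_timezone, bool):
        raise TypeError("incl_time and incl_timezone must be bool")
    # astimezone() with no argument attaches the local timezone.
    now = datetime.now().astimezone()
    parts = [now.strftime("%Y%m%d")]
    if incl_time:
        parts.append(now.strftime("%H%M"))
    if incl_timezone:
        # Assumption: the platform's timezone name is acceptable in a file name.
        parts.append(now.strftime("%Z"))
    return "_".join(parts)


print(get_time_sketch())  # date, time and timezone
print(get_time_sketch(incl_time=False, incl_timezone=False))  # date only
```

The timezone name returned by "%Z" is platform dependent, so a real implementation may prefer a numeric offset or a fixed label.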

abaco.utils.init_log(filename: str, display: bool = False, logger_id: str | None = None)

Configure a custom Python logger with file and optional stdout handlers.

Parameters:
  • filename (str) – Filepath to log record file.

  • display (bool, optional) – Whether to print the logs to standard output (default is False).

  • logger_id (str or None, optional) – An optional identifier for the logger. If None, defaults to ‘root’.

Returns:

Configured logger object.

Return type:

logging.Logger

Raises:

TypeError – If filename is not a string or logger_id is not a string or None.
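A logger with a file handler and an optional stdout handler can be wired up with the standard logging module along these lines. The sketch is illustrative only: init_log_sketch is a hypothetical name, and the log level and message format are assumptions not stated in the reference above.

```python
import logging
import os
import sys
import tempfile


def init_log_sketch(filename, display=False, logger_id=None):
    """Hypothetical stand-in: logger with a file handler and optional stdout."""
    if not isinstance(filename, str):
        raise TypeError("filename must be a string")
    if logger_id is not None and not isinstance(logger_id, str):
        raise TypeError("logger_id must be a string or None")
    logger = logging.getLogger(logger_id)  # None selects the root logger
    logger.setLevel(logging.DEBUG)  # assumed level
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    file_handler = logging.FileHandler(filename)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)
    if display:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "run.log")
    logger = init_log_sketch(log_path, logger_id="demo")
    logger.info("pipeline started")
    # Close the handlers so the file is flushed and the folder can be removed.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    with open(log_path) as fh:
        print(fh.read().strip())
```

Calling such a function twice with the same logger_id would attach duplicate handlers, so a fuller version would probably check logger.handlers first.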

abaco.utils.normalize_url(host: str, port: int, scheme: str = 'http') → str

Normalize the given URL, ensuring it starts with the specified scheme.

Parameters:
  • host (str) – The host to be normalized.

  • port (int) – The port number.

  • scheme (str, optional) – The URL scheme (default is “http”).

Returns:

The normalized URL.

Return type:

str

Raises:

TypeError – If host, port, or scheme is not of the correct type, or if the URL cannot be normalized.

Examples

>>> normalize_url("localhost", 7474)
'http://localhost:7474'
>>> normalize_url("example.com", 80, "bolt")
'bolt://example.com:80'
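One way to obtain the results shown in the examples is sketched below. normalize_url_sketch is a hypothetical name, and the handling of a host that already carries the scheme prefix is an assumption; how the library treats a host with a different scheme is not documented here, so the sketch leaves that case unhandled.

```python
def normalize_url_sketch(host: str, port: int, scheme: str = "http") -> str:
    """Hypothetical stand-in: build 'scheme://host:port' from its parts."""
    if not isinstance(host, str) or not isinstance(scheme, str):
        raise TypeError("host and scheme must be strings")
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise TypeError("port must be an int")
    prefix = f"{scheme}://"
    # Assumption: a host that already starts with the scheme is left as it is.
    if not host.startswith(prefix):
        host = prefix + host
    return f"{host}:{port}"


print(normalize_url_sketch("localhost", 7474))  # http://localhost:7474
print(normalize_url_sketch("bolt://example.com", 80, "bolt"))  # bolt://example.com:80
```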