Utils

mobius.utils.constrained_sum_sample_pos(n, total)

Return a randomly chosen list of n positive integers summing to total.

Parameters:
n : int

The number of positive integers to generate.

total : int

The total sum of the generated integers.

Returns:
ndarray

A 1D numpy array of n positive integers summing to total.

Notes

https://stackoverflow.com/questions/3589214
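
The approach discussed in the linked Stack Overflow thread can be sketched as follows: pick n - 1 distinct cut points between 0 and total, then take the gaps between consecutive cut points as the parts. The name `sample_positive_parts` is hypothetical, and this standard-library version returns a plain list rather than an ndarray; it is an illustration, not necessarily the library's implementation.

```python
import random


def sample_positive_parts(n, total):
    """Sketch: n positive integers summing to total (requires 1 <= n <= total)."""
    # Pick n - 1 distinct cut points strictly between 0 and total.
    dividers = sorted(random.sample(range(1, total), n - 1))
    # The gaps between consecutive cut points are the parts; each gap is >= 1.
    return [b - a for a, b in zip([0] + dividers, dividers + [total])]


parts = sample_positive_parts(4, 20)
print(parts, sum(parts))
```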

mobius.utils.constrained_sum_sample_nonneg(n, total)

Return a randomly chosen list of n nonnegative integers summing to total.

Parameters:
n : int

Number of nonnegative integers in the list.

total : int

The sum of the nonnegative integers in the list.

Returns:
ndarray

A 1D numpy array of n nonnegative integers summing to total.

Notes

https://stackoverflow.com/questions/3589214
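
One way to obtain nonnegative parts, sketched below, is to sample n positive parts that sum to total + n and subtract 1 from each. The name `sample_nonnegative_parts` is hypothetical and the function returns a plain list; the library may proceed differently.

```python
import random


def sample_nonnegative_parts(n, total):
    """Sketch: n nonnegative integers summing to total (requires n >= 1, total >= 0)."""
    # Shift the problem: n positive parts summing to total + n.
    shifted_total = total + n
    dividers = sorted(random.sample(range(1, shifted_total), n - 1))
    positive = [b - a for a, b in zip([0] + dividers, dividers + [shifted_total])]
    # Subtracting 1 from each part removes the shift and allows zeros.
    return [p - 1 for p in positive]


parts = sample_nonnegative_parts(5, 3)
print(parts, sum(parts))
```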

mobius.utils.path_module(module_name)

Given a module name, return the path of the directory where the module is located. Returns None if the module does not exist.

Parameters:
module_name : str

Name of the module.

Returns:
path : str or None

Path of the directory where the module is located, or None if the module does not exist.
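
A lookup of this kind can be sketched with the standard library's `importlib.util.find_spec`, which locates a module without importing it. The helper name `module_directory` is hypothetical, and returning None for built-in, frozen and namespace modules is an assumption of this sketch, not documented behaviour of `path_module`.

```python
import importlib.util
import os


def module_directory(module_name):
    """Sketch: directory containing a module, or None if it cannot be located."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised for e.g. a dotted name whose parent package is missing.
        return None
    # Assumption: modules without a file location are reported as None.
    if spec is None or not spec.has_location:
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


print(module_directory("json"))
print(module_directory("surely_missing_module_xyz"))
```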

mobius.utils.function_equal(func_1, func_2)

Compare two functions to see if they are identical.

Parameters:
func_1 : function

The first function to compare.

func_2 : function

The second function to compare.

Returns:
bool

True if the two functions are identical, False otherwise.

mobius.utils.opposite_signs(x, y)

Return True if x and y have opposite signs, otherwise False.

Parameters:
x : float or int

First number to compare.

y : float or int

Second number to compare.

Returns:
bool

True if x and y have opposite signs, False otherwise.
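
The description does not say how zero is handled. The sketch below makes one possible choice explicit: zero is grouped with the non-negative numbers. This is an assumption for illustration, and the name `have_opposite_signs` is hypothetical.

```python
def have_opposite_signs(x, y):
    """Sketch: True when exactly one of x and y is negative."""
    # Assumption: zero counts as non-negative, so (0, -1) gives True.
    return (x < 0) != (y < 0)


print(have_opposite_signs(-1.5, 2))
print(have_opposite_signs(3, 4))
```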

mobius.utils.affinity_binding_to_energy(value, unit='nM', temperature=300.0)

Convert affinity binding to energy.

Parameters:
value : float

Value of the affinity binding.

unit : str, default: ‘nM’

Unit of the affinity binding.

temperature : float, default: 300.0

Temperature at which to calculate the energy.

Returns:
float

Value of the energy corresponding to the given affinity binding.

mobius.utils.energy_to_affinity_binding(value, unit='nM', temperature=300.0)

Convert energy to affinity binding.

Parameters:
value : float

Value of the energy.

unit : str, default: ‘nM’

Unit of the affinity binding.

temperature : float, default: 300.0

Temperature at which to calculate the affinity binding.

Returns:
float

Value of the affinity binding corresponding to the given energy.
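
The two conversions above are commonly based on the relation ΔG = RT ln(K), with K expressed in molar. The sketch below assumes that relation, an energy in kcal/mol, and the unit labels in `UNIT_TO_MOLAR`; none of these are confirmed by the source, and the function names are hypothetical.

```python
import math

# Assumptions: energy in kcal/mol and these concentration unit labels.
GAS_CONSTANT = 1.987204e-3  # kcal / (mol * K)
UNIT_TO_MOLAR = {"pM": 1e-12, "nM": 1e-9, "uM": 1e-6, "mM": 1e-3, "M": 1.0}


def affinity_to_energy(value, unit="nM", temperature=300.0):
    """Sketch: dG = RT * ln(K), with K converted to molar."""
    rt = GAS_CONSTANT * temperature
    return rt * math.log(value * UNIT_TO_MOLAR[unit])


def energy_to_affinity(value, unit="nM", temperature=300.0):
    """Sketch: K = exp(dG / RT), converted back to the requested unit."""
    rt = GAS_CONSTANT * temperature
    return math.exp(value / rt) / UNIT_TO_MOLAR[unit]


energy = affinity_to_energy(1.0, unit="nM")
print(energy, energy_to_affinity(energy, unit="nM"))
```

Under these assumptions a tighter binder (smaller K) maps to a more negative energy, and the two functions are inverses of each other at a fixed temperature.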

mobius.utils.ic50_to_pic50(value, unit=None)

Convert IC50 to pIC50.

Parameters:
value : float

Value of the IC50.

unit : str or None, default: None

Unit of the IC50.

Returns:
float

Value of the pIC50 corresponding to the given IC50.

mobius.utils.pic50_to_ic50(value, unit=None)

Convert pIC50 to IC50.

Parameters:
value : float

The pIC50 value to be converted.

unit : str, default: None

The unit of the IC50 value.

Returns:
float

The IC50 value after the conversion.
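
By definition, pIC50 is the negative base-10 logarithm of the IC50 expressed in molar. The sketch below applies that definition in both directions. Treating `unit=None` as "the value is already in molar" and the unit labels in `UNIT_TO_MOLAR` are assumptions, and the function names are hypothetical.

```python
import math

# Assumption: these unit labels; unit=None means the value is already in molar.
UNIT_TO_MOLAR = {"pM": 1e-12, "nM": 1e-9, "uM": 1e-6, "mM": 1e-3, "M": 1.0}


def ic50_to_pic50_sketch(value, unit=None):
    """Sketch: pIC50 = -log10(IC50 in molar)."""
    factor = 1.0 if unit is None else UNIT_TO_MOLAR[unit]
    return -math.log10(value * factor)


def pic50_to_ic50_sketch(value, unit=None):
    """Sketch: IC50 = 10 ** (-pIC50), converted to the requested unit."""
    factor = 1.0 if unit is None else UNIT_TO_MOLAR[unit]
    return 10 ** (-value) / factor


print(ic50_to_pic50_sketch(1.0, unit="nM"))
print(pic50_to_ic50_sketch(9.0, unit="nM"))
```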

mobius.utils.split(n, k)

Splits a number into k parts with each part as close to the same size as possible.

Parameters:
n : int

The number to be split.

k : int

The number of parts to split the number into.

Returns:
ndarray

ndarray containing the parts of the split.
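
A minimal sketch of such a split: every part gets the quotient n // k, and the remainder is distributed one unit at a time. Giving the extra units to the first parts is an assumption, the name `split_evenly` is hypothetical, and a plain list is returned instead of an ndarray.

```python
def split_evenly(n, k):
    """Sketch: split n into k integer parts whose sizes differ by at most 1."""
    base, remainder = divmod(n, k)
    # Assumption: the first `remainder` parts receive the extra unit.
    return [base + 1 if i < remainder else base for i in range(k)]


print(split_evenly(10, 3))
```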

mobius.utils.guess_input_formats(sequences)

Guess the format for each input sequence. This function recognizes either the FASTA or the HELM format. If the format is not recognized, it will be labeled as ‘unknown’.

The regexes used to recognize each format are the following:

  • FASTA: ^[^{}[].:,;=$-()+*\n]+$
  • HELM: ^[^>\n]*{[^>\n]+}[^>\n]*$*$$$(V2.0)?$

Parameters:
sequences : str, List of str, or ndarray of str

Input data to be checked.

Returns:
List of str

The format of each input sequence. Can be ‘FASTA’, ‘HELM’ or ‘unknown’.

Notes

The regexes were obtained using ChatGPT-3.5 after some trials and many errors. You have been warned.
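
The classification logic can be sketched as: try a FASTA pattern, then a HELM pattern, and fall back to ‘unknown’. The two patterns below are deliberately simplified and hypothetical; they are not the regexes used by the library, and the name `guess_formats` is also an assumption.

```python
import re

# Hypothetical, simplified patterns for illustration only.
FASTA_RE = re.compile(r"^[A-Za-z]+$")
HELM_RE = re.compile(
    r"^[^{}$]+\{[^{}$]+\}(\|[^{}$]+\{[^{}$]+\})*"  # one or more simple polymers
    r"(\$[^$]*){3}\$(V2\.0)?$"                      # four '$'-separated sections
)


def guess_formats(sequences):
    """Sketch: label each input as 'FASTA', 'HELM' or 'unknown'."""
    if isinstance(sequences, str):
        sequences = [sequences]
    formats = []
    for sequence in sequences:
        if FASTA_RE.match(sequence):
            formats.append("FASTA")
        elif HELM_RE.match(sequence):
            formats.append("HELM")
        else:
            formats.append("unknown")
    return formats


print(guess_formats(["ACDEFGH", "PEPTIDE1{A.A.R}$$$$V2.0", "not a sequence!"]))
```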

mobius.utils.generate_random_linear_polymers(n_polymers, polymers_lengths, monomers=None, output_format='helm')

Generates random linear polymers.

Parameters:
n_polymers : int

Number of random polymers to generate.

polymers_lengths : List, tuple or numpy.ndarray

List of polymers lengths to sample from.

monomers : List of str, default: None

A list of monomers to substitute at each allowed position. If not provided, defaults to the 20 natural amino acids.

output_format : str, default: ‘helm’

Output format. Can be ‘fasta’ or ‘helm’.

Returns:
ndarray

Randomly generated linear polymers.

Raises:
AssertionError

If output format is not ‘fasta’ or ‘helm’.

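
A minimal sketch of this kind of generator, using only the standard library: draw a length, draw that many monomers, and format the result. The `seed` argument and the name `random_linear_polymers` are hypothetical additions, the HELM layout follows the `PEPTIDE1{...}$$$$V2.0` strings shown elsewhere in this page, and a list is returned instead of an ndarray.

```python
import random

NATURAL_AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")


def random_linear_polymers(n_polymers, polymers_lengths, monomers=None,
                           output_format="helm", seed=None):
    """Sketch: random linear polymers as FASTA or HELM strings."""
    assert output_format in ("fasta", "helm"), "output_format must be 'fasta' or 'helm'"
    monomers = NATURAL_AMINO_ACIDS if monomers is None else list(monomers)
    lengths = list(polymers_lengths)
    rng = random.Random(seed)  # hypothetical seed argument, for reproducibility

    polymers = []
    for _ in range(n_polymers):
        length = rng.choice(lengths)
        sequence = [rng.choice(monomers) for _ in range(length)]
        if output_format == "helm":
            polymers.append("PEPTIDE1{" + ".".join(sequence) + "}$$$$V2.0")
        else:
            polymers.append("".join(sequence))
    return polymers


print(random_linear_polymers(3, [8, 9, 10], output_format="fasta", seed=0))
```
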
mobius.utils.generate_random_polymers_from_designs(n_polymers, scaffold_designs)

Generates random polymers using scaffold designs.

Parameters:
n_polymers : int or list of int

Number of random polymers to generate, or list of numbers of polymers to generate per scaffold.

scaffold_designs : dictionary

Dictionary with scaffold polymers and defined set of monomers to use for each position.

Returns:
ndarray

Randomly generated polymers.

mobius.utils.adjust_polymers_to_design(polymers, design)

Modify polymers to fit a given design.

Parameters:
polymers : List

List of polymers in HELM format.

design : dictionary

Dictionary of all the positions allowed to be optimized.

Returns:
ndarray

Adjusted polymers in HELM format based on designs.

ndarray

ndarray of boolean values indicating whether each polymer was modified or not.

mobius.utils.group_polymers_by_scaffold(polymers, return_index=False)

Groups a list of polymers in HELM format by their scaffolds.

Parameters:
polymers : List of str

List of input polymers in HELM format to group.

return_index : bool, default: False

Whether to also return the original indices of the grouped polymers.

Returns:
groups : Dict[str, List of str]

A dictionary with scaffold polymers as keys and lists of grouped polymers as values.

group_indices : Dict[str, List of int]

If return_index is True, a dictionary with scaffold polymers as keys and lists of indices of the original polymers.

Examples

>>> polymers = ['PEPTIDE1{A.A.R}$$$$V2.0', 'PEPTIDE1{A.A}$$$$V2.0', 'PEPTIDE1{R.G}$$$$V2.0']
>>> groups = group_polymers_by_scaffold(polymers)
>>> print(groups)
{'X$PEPTIDE1{$X.X.X$}$V2.0': ['PEPTIDE1{A.A.R}$$$$V2.0'], 
 'X$PEPTIDE1{$X.X$}$V2.0': ['PEPTIDE1{A.A}$$$$V2.0', 'PEPTIDE1{R.G}$$$$V2.0']}

mobius.utils.group_biopolymers_by_design(biopolymers, designs, return_index=False)

Groups a list of biopolymers in FASTA format by design using the sequence lengths.

Parameters:
biopolymers : List of str

List of input biopolymers in FASTA format to group.

designs : dictionary

Dictionary containing the design protocol.

return_index : bool, default: False

Whether to also return the original indices of the grouped biopolymers.

Returns:
groups : Dict[str, List of str]

A dictionary with the biopolymer names as keys and lists of grouped biopolymers as values.

group_indices : Dict[str, List of int]

If return_index is True, a dictionary with biopolymer names as keys and lists of indices of the original biopolymers.

mobius.utils.convert_FASTA_to_HELM(sequences)

Converts one or more FASTA sequences to HELM format.

Parameters:
sequences : str, List of str, or ndarray of str

A FASTA sequence or list/ndarray of FASTA sequences.

Returns:
List of str

A list of sequences in HELM format.

mobius.utils.convert_HELM_to_FASTA(polymers, ignore_connections=False)

Converts one or more HELM sequences to FASTA format.

Parameters:
polymers : str, List of str, or numpy.ndarray of str

A polymer or list/array of polymers in HELM format.

ignore_connections : bool, default: False

Whether to ignore connections in polymers.

Returns:
List of str

A list of sequences in FASTA format.

Raises:
ValueError

If a polymer contains connections or more than one simple polymer.
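
For a single linear peptide made of one-letter monomers, the two conversions above can be sketched as plain string manipulation. The function names are hypothetical, the `PEPTIDE1{...}$$$$V2.0` layout follows the HELM strings shown elsewhere in this page, and the `ignore_connections` option is not covered by this sketch.

```python
def fasta_to_helm(sequence):
    """Sketch: wrap a one-letter sequence into a single-polymer HELM string."""
    return "PEPTIDE1{" + ".".join(sequence) + "}$$$$V2.0"


def helm_to_fasta(polymer):
    """Sketch: recover the one-letter sequence of a simple, unconnected polymer."""
    simple_polymers, connections = polymer.split("$")[:2]
    if "|" in simple_polymers or connections:
        raise ValueError("polymer has connections or more than one simple polymer")
    start = simple_polymers.index("{") + 1
    end = simple_polymers.rindex("}")
    return "".join(simple_polymers[start:end].split("."))


helm = fasta_to_helm("AAR")
print(helm)
print(helm_to_fasta(helm))
```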

mobius.utils.build_helm_string(complex_polymer, connections=None)

Build a HELM string from a dictionary of polymers and a list of connections.

Parameters:
complex_polymer : dict

A dictionary of simple polymers, where keys are the simple polymer types and values are lists of monomer symbols.

connections : List, default: None

A list of connections, where each connection is represented as a tuple with six elements: (start_polymer, start_monomer, start_attachment, end_polymer, end_monomer, end_attachment).

Returns:
str

The generated polymer in HELM format.
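
A sketch of how such a string could be assembled from the two inputs described above, using the six-element connection tuples in the documented order. The name `build_helm` is hypothetical, and the trailing `$$$V2.0` (empty hydrogen-bond and attribute sections) is an assumption based on the HELM strings shown elsewhere in this page.

```python
def build_helm(complex_polymer, connections=None):
    """Sketch: assemble a HELM string from simple polymers and connections."""
    # Simple polymers are written as ID{m1.m2...} and joined with '|'.
    polymers_part = "|".join(
        f"{pid}{{{'.'.join(monomers)}}}" for pid, monomers in complex_polymer.items()
    )
    # Each connection becomes "source,target,position:attachment-position:attachment".
    connections_part = "|".join(
        f"{start_pid},{end_pid},{start_pos}:{start_att}-{end_pos}:{end_att}"
        for (start_pid, start_pos, start_att, end_pid, end_pos, end_att) in (connections or [])
    )
    # Assumption: hydrogen bonds and attributes are left empty.
    return f"{polymers_part}${connections_part}$$$V2.0"


polymer = build_helm(
    {"PEPTIDE1": list("ACAAA"), "PEPTIDE2": list("AAAA")},
    connections=[("PEPTIDE1", 1, "R3", "PEPTIDE2", 1, "R3")],
)
print(polymer)
```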

mobius.utils.parse_helm(polymer)

Parses a HELM string and returns the relevant information.

Parameters:
polymer : str

A polymer in HELM format.

Returns:
complex_polymer : dict

A dictionary containing the simple polymer IDs (pid) as keys and simple polymer as values.

connections : numpy.ndarray

An array with dtype [(‘SourcePolymerID’, ‘U20’), (‘TargetPolymerID’, ‘U20’), (‘SourceMonomerPosition’, ‘i4’), (‘SourceAttachment’, ‘U2’), (‘TargetMonomerPosition’, ‘i4’), (‘TargetAttachment’, ‘U2’)]. Each row represents a connection between two monomers in the complex polymer.

hydrogen_bonds : str

A string containing information about any hydrogen bonds in the complex polymer.

attributes : str

A string containing any additional attributes related to the complex polymer.
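
The four return values map onto the `$`-separated sections of a HELM string. The sketch below splits those sections and stores the connections in a NumPy structured array with the dtype listed above. The name `parse_helm_sketch` is hypothetical, and the parsing is simplified (no validation, no monomers containing `.` or `$`).

```python
import numpy as np

CONNECTION_DTYPE = [
    ("SourcePolymerID", "U20"), ("TargetPolymerID", "U20"),
    ("SourceMonomerPosition", "i4"), ("SourceAttachment", "U2"),
    ("TargetMonomerPosition", "i4"), ("TargetAttachment", "U2"),
]


def parse_helm_sketch(polymer):
    """Sketch: split a HELM string into polymers, connections and extras."""
    polymers_part, connections_part, hydrogen_bonds, attributes, _version = polymer.split("$")

    complex_polymer = {}
    for simple_polymer in polymers_part.split("|"):
        pid, monomers = simple_polymer.rstrip("}").split("{", 1)
        complex_polymer[pid] = monomers.split(".")

    rows = []
    for connection in filter(None, connections_part.split("|")):
        source_id, target_id, link = connection.split(",")
        source, target = link.split("-")
        source_pos, source_att = source.split(":")
        target_pos, target_att = target.split(":")
        rows.append((source_id, target_id, int(source_pos), source_att,
                     int(target_pos), target_att))
    connections = np.array(rows, dtype=CONNECTION_DTYPE)

    return complex_polymer, connections, hydrogen_bonds, attributes


parsed = parse_helm_sketch(
    "PEPTIDE1{A.C.A.A.A}|PEPTIDE2{A.A.A.A}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0"
)
print(parsed[0])
print(parsed[1])
```

A structured array keeps each connection as one row while still allowing column access by field name, for example `connections["SourcePolymerID"]`.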

mobius.utils.get_scaffold_from_helm_string(polymer)

Get the scaffold of the input polymer in HELM format.

Parameters:
polymer : str

A polymer in HELM format.

Returns:
str

The scaffold version of the input polymer in HELM format.

Examples

polymer : PEPTIDE1{A.C.A.A.A}|PEPTIDE2{A.A.A.A}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0
scaffold : PEPTIDE1{X.C.X.X.X}|PEPTIDE2{X.A.X.X}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0
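
For the simplest case, a single linear polymer without connections, the scaffold reduces to replacing every monomer with X. The sketch below covers only that case; it does not reproduce the handling of connected monomers shown in the example above, and the name `linear_scaffold` is hypothetical.

```python
def linear_scaffold(polymer):
    """Sketch: scaffold of one linear, unconnected polymer in HELM format."""
    # Separate the simple polymer from the remaining '$'-delimited sections.
    simple_polymer, rest = polymer.split("$", 1)
    pid, monomers = simple_polymer.rstrip("}").split("{", 1)
    # Assumption: with no connections, every position becomes the placeholder X.
    placeholders = ".".join("X" for _ in monomers.split("."))
    return f"{pid}{{{placeholders}}}${rest}"


print(linear_scaffold("PEPTIDE1{A.A.R}$$$$V2.0"))
```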

mobius.utils.generate_design_protocol_from_polymers(polymers)

Generate the bare minimum design protocol yaml config from a list of polymers in HELM format.

Parameters:
polymers : List of str

List of polymers in HELM format.

Returns:
dict

The design protocol yaml config.

mobius.utils.write_design_protocol_from_polymers(polymers, filename='design.yaml')

Write the bare minimum design protocol yaml file from a list of polymers in HELM format.

Parameters:
polymers : List of str

List of polymers in HELM format.

filename : str, default: ‘design.yaml’

Name of the design protocol yaml file to write.

mobius.utils.MolFromHELM(polymers, HELM_extra_library_filename=None)

Generate a list of RDKit molecules from HELM strings.

Parameters:
polymers : str or List or tuple or numpy.ndarray

The polymers in HELM format to convert to RDKit molecules.

HELM_extra_library_filename : str, default: None

The path to a HELM Library file containing extra monomers. Extra monomers will be added to the internal monomers library. Internal monomers can be overridden by providing a monomer with the same MonomerID.

Returns:
List

A list of RDKit molecules.

mobius.utils.read_pssm_file(pssm_file)

Reads a PSSM (position-specific scoring matrix) file and returns a pandas DataFrame containing the data and the intercept value.

Parameters:
pssm_file : str

The path to the PSSM file to be read.

Returns:
pssm : pandas.DataFrame

A DataFrame containing the data from the PSSM file.

intercept : float

The intercept value from the PSSM file.

mobius.utils.global_min_pssm_score(pssm_pd, intercept)

Reads a PSSM DataFrame and returns the residue sequence with the globally minimum PSSM score, together with that score.

Parameters:
pssm_pd : pandas.DataFrame

DataFrame of the individual PSSM scores for an allele.

intercept : float

The intercept value from the PSSM file.

Returns:
global_min_peptide : str

Residue sequence corresponding to the lowest possible PSSM score for that matrix.

min_score : float

Associated globally minimum PSSM score for that matrix.
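
If the score of a peptide is the intercept plus one matrix entry per position (an assumption, not stated in the source), the positions are independent and the global minimum is obtained by picking the lowest-scoring residue at each position. The sketch below uses a hypothetical plain-Python layout, one dict per position mapping residue to score, instead of a pandas DataFrame.

```python
def global_min_score(pssm, intercept):
    """Sketch: lowest-scoring sequence under an additive per-position model."""
    # Hypothetical layout: pssm[i] maps each residue to its score at position i.
    best_residues = [min(column, key=column.get) for column in pssm]
    # Assumption: total score = intercept + sum of the chosen per-position scores.
    min_score = intercept + sum(
        column[residue] for column, residue in zip(pssm, best_residues)
    )
    return "".join(best_residues), min_score


toy_pssm = [
    {"A": 0.5, "C": -1.0},
    {"A": -0.25, "C": 0.75},
]
print(global_min_score(toy_pssm, intercept=2.0))
```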

mobius.utils.optimisation_tracker(n, df_old, polymers, scores)

Record optimisation progression of suggested polymers.

Parameters:
n : int

Optimisation round number. If 0, a new dataframe is created for progression tracking.

df_old : pandas.DataFrame

Dataframe from the previous optimisation round, to be updated.

polymers : list

List of newly suggested polymers to add to the progression tracking.

scores : ndarray

Scores of the polymers for each objective.

Returns:
df : pandas.DataFrame

Updated dataframe tracking optimisation progression.