Utils
- mobius.utils.constrained_sum_sample_pos(n, total)
Return a randomly chosen list of n positive integers summing to total.
- Parameters:
- n : int
The number of positive integers to generate.
- total : int
The total sum of the generated integers.
- Returns:
- ndarray
A 1D numpy array of n positive integers summing to total.
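One common way to draw such a sample is a "stars and bars" construction: choose n - 1 distinct cut points strictly between 0 and total, then take the gaps between consecutive cut points. The sketch below assumes that construction. The helper name `sample_positive_parts` is hypothetical, it uses only the standard library, and it returns a list rather than an ndarray. It is not necessarily how mobius implements the function.

```python
import random


def sample_positive_parts(n, total, rng=random):
    """Return n positive integers that sum to total (illustrative sketch)."""
    if n < 1 or total < n:
        raise ValueError("need 1 <= n <= total")
    # Pick n - 1 distinct cut points strictly between 0 and total.
    cuts = sorted(rng.sample(range(1, total), n - 1))
    # The differences between consecutive boundaries are the parts.
    bounds = [0] + cuts + [total]
    return [b - a for a, b in zip(bounds, bounds[1:])]


parts = sample_positive_parts(4, 20)
print(parts, sum(parts))
```

Because the cut points are distinct, every gap is at least 1, so each part is positive.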
- mobius.utils.constrained_sum_sample_nonneg(n, total)
Return a randomly chosen list of n nonnegative integers summing to total.
- Parameters:
- n : int
Number of nonnegative integers in the list.
- total : int
The sum of the nonnegative integers in the list.
- Returns:
- ndarray
A 1D numpy array of n nonnegative integers summing to total.
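The nonnegative case can be reduced to a positive one: draw n positive integers summing to total + n, then subtract 1 from each. The sketch below assumes that reduction. The helper name is hypothetical, only the standard library is used, and a list is returned instead of an ndarray. The library's actual implementation may differ.

```python
import random


def sample_nonnegative_parts(n, total, rng=random):
    """Return n nonnegative integers that sum to total (illustrative sketch)."""
    if n < 1 or total < 0:
        raise ValueError("need n >= 1 and total >= 0")
    shifted_total = total + n
    # n - 1 distinct cut points split shifted_total into n positive parts.
    cuts = sorted(rng.sample(range(1, shifted_total), n - 1))
    bounds = [0] + cuts + [shifted_total]
    # Subtracting 1 from each positive part gives nonnegative parts
    # that sum to the requested total.
    return [b - a - 1 for a, b in zip(bounds, bounds[1:])]


parts = sample_nonnegative_parts(5, 3)
print(parts, sum(parts))
```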
- mobius.utils.path_module(module_name)
Given a module name, return the path of the directory where the module is located. Returns None if the module does not exist.
- Parameters:
- module_name : str
Name of the module.
- Returns:
- path : str or None
Path of the directory where the module is located, or None if the module does not exist.
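A module's directory can be looked up with the standard `importlib.util.find_spec` without importing the module itself. The sketch below shows that approach. The helper name `module_directory` is hypothetical, and treating built-in or frozen modules as "not found" is an assumption, not documented mobius behaviour.

```python
import importlib.util
import os


def module_directory(module_name):
    """Return the directory holding module_name, or None if it cannot be located."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised for dotted names with a missing parent, or malformed names.
        return None
    if spec is None or not spec.has_location or spec.origin is None:
        # Not found, or a module with no file on disk (built-in, frozen, namespace).
        return None
    return os.path.dirname(os.path.abspath(spec.origin))


print(module_directory("json"))
print(module_directory("no_such_module_xyz"))
```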
- mobius.utils.function_equal(func_1, func_2)
Compare two functions to see if they are identical.
- Parameters:
- func_1 : function
The first function to compare.
- func_2 : function
The second function to compare.
- Returns:
- bool
True if the two functions are identical, False otherwise.
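The documentation does not say what "identical" means here. One possible reading, shown purely as an assumption, is that two functions are identical when their compiled bytecode, constants, and referenced names match. The helper `same_code` below is hypothetical and only illustrates that reading.

```python
def same_code(func_1, func_2):
    """Compare two Python functions by their compiled code (illustrative sketch)."""
    code_1, code_2 = func_1.__code__, func_2.__code__
    return (
        code_1.co_code == code_2.co_code        # raw bytecode
        and code_1.co_consts == code_2.co_consts  # literal constants
        and code_1.co_names == code_2.co_names    # global/attribute names used
    )


def add_one(x):
    return x + 1


def increment(x):
    return x + 1


def add_two(x):
    return x + 2


print(same_code(add_one, increment))
print(same_code(add_one, add_two))
```

Under this reading, functions with different names but the same body compare equal, while a change to a constant or an operation makes them differ.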
- mobius.utils.opposite_signs(x, y)
Return True if x and y have opposite signs, otherwise False.
- Parameters:
- x : float or int
First number to compare.
- y : float or int
Second number to compare.
- Returns:
- bool
True if x and y have opposite signs, False otherwise.
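A minimal sketch of the sign test follows. How the library treats zero is not stated, so the sketch assumes zero has no sign and never counts as "opposite". The helper name is hypothetical.

```python
def have_opposite_signs(x, y):
    """True when one value is strictly negative and the other strictly positive."""
    return (x < 0 < y) or (y < 0 < x)


print(have_opposite_signs(-1.5, 2))
print(have_opposite_signs(3, 4))
```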
- mobius.utils.affinity_binding_to_energy(value, unit='nM', temperature=300.0)
Convert affinity binding to energy.
- Parameters:
- value : float
Value of the affinity binding.
- unit : str, default: ‘nM’
Unit of the affinity binding.
- temperature : float, default: 300.
Temperature at which to calculate the energy.
- Returns:
- float
Value of the energy corresponding to the given affinity binding.
- mobius.utils.energy_to_affinity_binding(value, unit='nM', temperature=300.0)
Convert energy to affinity binding.
- Parameters:
- value : float
Value of the energy.
- unit : str, default: ‘nM’
Unit of the affinity binding.
- temperature : float, default: 300.
Temperature at which to calculate the affinity binding.
- Returns:
- float
Value of the affinity binding corresponding to the given energy.
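The documentation gives neither the formula nor the energy unit for these two conversions. The sketch below assumes the standard thermodynamic relation ΔG = RT ln(Kd), with Kd in mol/L, the temperature in kelvin, and the energy in kcal/mol. The helper names and the unit table are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal / (mol K); the energy unit is an assumption

# Assumed multipliers converting a concentration unit to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def affinity_to_energy(value, unit="nM", temperature=300.0):
    """Binding energy from a dissociation constant: dG = R * T * ln(Kd)."""
    kd_molar = value * UNIT_TO_MOLAR[unit]
    return GAS_CONSTANT * temperature * math.log(kd_molar)


def energy_to_affinity(value, unit="nM", temperature=300.0):
    """Inverse relation: Kd = exp(dG / (R * T)), expressed in the requested unit."""
    kd_molar = math.exp(value / (GAS_CONSTANT * temperature))
    return kd_molar / UNIT_TO_MOLAR[unit]


energy = affinity_to_energy(1.0, unit="nM")  # roughly -12.35 kcal/mol at 300 K
print(energy, energy_to_affinity(energy, unit="nM"))
```

With these assumptions, tighter binders (smaller Kd) map to more negative energies, and the two functions are inverses of each other at a fixed temperature.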
- mobius.utils.ic50_to_pic50(value, unit=None)
Convert IC50 to pIC50.
- Parameters:
- value : float
Value of the IC50.
- unit : str or None, default: None
Unit of the IC50.
- Returns:
- float
Value of the pIC50 corresponding to the given IC50.
- mobius.utils.pic50_to_ic50(value, unit=None)
Converts a pIC50 value to an IC50 value.
- Parameters:
- value : float
The pIC50 value to be converted.
- unit : str, default: None
The unit of the IC50 value.
- Returns:
- float
The IC50 value after the conversion.
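By the usual definition, pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L. The sketch below follows that definition. It assumes that `unit=None` means the value is already in mol/L, which the documentation does not confirm. The helper names and the unit table are hypothetical.

```python
import math

# Assumed multipliers converting a concentration unit to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50_sketch(value, unit=None):
    """pIC50 = -log10(IC50 in mol/L)."""
    scale = 1.0 if unit is None else UNIT_TO_MOLAR[unit]
    return -math.log10(value * scale)


def pic50_to_ic50_sketch(value, unit=None):
    """IC50 = 10 ** (-pIC50), converted from mol/L to the requested unit."""
    scale = 1.0 if unit is None else UNIT_TO_MOLAR[unit]
    return 10.0 ** (-value) / scale


print(ic50_to_pic50_sketch(10.0, unit="nM"))   # close to 8
print(pic50_to_ic50_sketch(8.0, unit="nM"))    # close to 10
```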
- mobius.utils.split(n, k)
Splits a number into k parts with each part as close to the same size as possible.
- Parameters:
- n : int
The number to be split.
- k : int
The number of parts to split the number into.
- Returns:
- ndarray
ndarray containing the parts of the split.
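An even split can be sketched with `divmod`: every part gets the quotient, and the remainder is spread one unit at a time over the first parts. Which parts receive the extra unit is an assumption here, and the sketch returns a list rather than an ndarray. The helper name is hypothetical.

```python
def split_evenly(n, k):
    """Split n into k integer parts whose sizes differ by at most 1."""
    quotient, remainder = divmod(n, k)
    # The first `remainder` parts are one larger than the rest.
    return [quotient + 1] * remainder + [quotient] * (k - remainder)


print(split_evenly(10, 3))  # [4, 3, 3]
```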
- mobius.utils.guess_input_formats(sequences)
Guess the format for each input sequence. This function recognizes either the FASTA or the HELM format. If the format is not recognized, it will be labeled as ‘unknown’.
The regular expressions used to recognize each format are the following:
- FASTA: ^[^{}[].:,;=$-()+* ]+$
- HELM: ^[^> ]*{[^> ]+}[^> ]*$*$$$(V2.0)?$
- Parameters:
- sequences : str, List of str, or ndarray of str
Input data to be checked.
- Returns:
- List of str
The format of each input sequence. Can be ‘FASTA’, ‘HELM’ or ‘unknown’.
Notes
The regular expressions were obtained using ChatGPT-3.5 after some trials and many errors. You have been warned.
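The classification logic can be sketched as follows. The two patterns are simplified, hypothetical stand-ins written for this example and are not the ones used by mobius. The FASTA pattern accepts only letters, and the HELM pattern expects one or more `ID{monomers}` blocks followed by four `$` separators and an optional `V2.0` suffix.

```python
import re

# Simplified, hypothetical patterns; NOT the ones used by the library.
FASTA_PATTERN = re.compile(r"^[A-Za-z]+$")
HELM_PATTERN = re.compile(
    r"^\w+\{[^{}\s]+\}(\|\w+\{[^{}\s]+\})*"  # one or more simple polymers
    r"(\$[^$\s]*){3}\$(V2\.0)?$"             # three sections, final '$', optional version
)


def guess_formats(sequences):
    """Label each input as 'FASTA', 'HELM' or 'unknown'."""
    if isinstance(sequences, str):
        sequences = [sequences]
    formats = []
    for sequence in sequences:
        if HELM_PATTERN.match(sequence):
            formats.append("HELM")
        elif FASTA_PATTERN.match(sequence):
            formats.append("FASTA")
        else:
            formats.append("unknown")
    return formats


print(guess_formats(["ACDEFG", "PEPTIDE1{A.A.R}$$$$V2.0", "not a sequence!"]))
```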
- mobius.utils.generate_random_linear_polymers(n_polymers, polymers_lengths, monomers=None, output_format='helm')
Generates random linear polymers.
- Parameters:
- n_polymers : int
Number of random polymers to generate.
- polymers_lengths : List, tuple or numpy.ndarray
List of polymer lengths to sample from.
- monomers : List of str, default: None
A list of monomers to substitute at each allowed position. If not provided, defaults to the 20 natural amino acids.
- output_format : str, default: ‘helm’
Output format. Can be ‘fasta’ or ‘helm’.
- Returns:
- ndarray
Randomly generated linear polymers.
- Raises:
- AssertionError: If output format is not ‘fasta’ or ‘helm’.
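A minimal sketch of the behaviour described above follows. It assumes that lengths and monomers are drawn uniformly at random and that each HELM polymer is a single `PEPTIDE1` chain. The helper name is hypothetical, only the standard library is used, and a list is returned instead of an ndarray.

```python
import random

NATURAL_AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")


def random_linear_polymers(n_polymers, polymers_lengths, monomers=None,
                           output_format="helm", rng=random):
    """Generate random linear polymers as FASTA or HELM strings (sketch)."""
    assert output_format in ("fasta", "helm"), "output_format must be 'fasta' or 'helm'"
    monomers = NATURAL_AMINO_ACIDS if monomers is None else list(monomers)
    lengths = list(polymers_lengths)
    polymers = []
    for _ in range(n_polymers):
        # Draw a length, then draw that many monomers with replacement.
        length = rng.choice(lengths)
        residues = [rng.choice(monomers) for _ in range(length)]
        if output_format == "helm":
            polymers.append("PEPTIDE1{" + ".".join(residues) + "}$$$$V2.0")
        else:
            polymers.append("".join(residues))
    return polymers


print(random_linear_polymers(3, [8, 9, 10], output_format="fasta"))
print(random_linear_polymers(2, [5], monomers=["A", "G"]))
```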
- mobius.utils.generate_random_polymers_from_designs(n_polymers, scaffold_designs)
Generates random polymers using scaffold designs.
- Parameters:
- n_polymers : int or list of int
Number of random polymers to generate, or list of numbers of polymers to generate per scaffold.
- scaffold_designs : dictionary
Dictionary with scaffold polymers and the defined set of monomers to use at each position.
- Returns:
- ndarray
Randomly generated polymers.
- mobius.utils.adjust_polymers_to_design(polymers, design)
Modify polymers to fit a given design.
- Parameters:
- polymers : List
List of polymers in HELM format.
- design : dictionary
Dictionary of all the positions allowed to be optimized.
- Returns:
- ndarray
Adjusted polymers in HELM format based on designs.
- ndarray
ndarray of boolean values indicating whether each polymer was modified.
- mobius.utils.group_polymers_by_scaffold(polymers, return_index=False)
Groups a list of polymers in HELM format by their scaffolds.
- Parameters:
- polymers : List of str
List of input polymers in HELM format to group.
- return_index : bool, default: False
Whether to also return the original indices of the grouped polymers.
- Returns:
- groups : Dict[str, List of str]
A dictionary with scaffold polymers as keys and lists of grouped polymers as values.
- group_indices : Dict[str, List of int]
If return_index is True, a dictionary with scaffold polymers as keys and lists of indices of the original polymers.
Examples
>>> polymers = ['PEPTIDE1{A.A.R}$$$$V2.0', 'PEPTIDE1{A.A}$$$$V2.0', 'PEPTIDE1{R.G}$$$$V2.0']
>>> groups = group_polymers_by_scaffold(polymers)
>>> print(groups)
{'X$PEPTIDE1{$X.X.X$}$V2.0': ['PEPTIDE1{A.A.R}$$$$V2.0'], 'X$PEPTIDE1{$X.X$}$V2.0': ['PEPTIDE1{A.A}$$$$V2.0', 'PEPTIDE1{R.G}$$$$V2.0']}
- mobius.utils.group_biopolymers_by_design(biopolymers, designs, return_index=False)
Groups a list of biopolymers in FASTA format by design using the sequence lengths.
- Parameters:
- biopolymers : List of str
List of input biopolymers in FASTA format to group.
- designs : dictionary
Dictionary containing the design protocol.
- return_index : bool, default: False
Whether to also return the original indices of the grouped biopolymers.
- Returns:
- groups : Dict[str, List of str]
A dictionary with the biopolymer names as keys and lists of grouped biopolymers as values.
- group_indices : Dict[str, List of int]
If return_index is True, a dictionary with biopolymer names as keys and lists of indices of the original biopolymers.
- mobius.utils.convert_FASTA_to_HELM(sequences)
Converts one or more FASTA sequences to HELM format.
- Parameters:
- sequences : str, List of str, or ndarray of str
A FASTA sequence or list/ndarray of FASTA sequences.
- Returns:
- List of str
A list of sequences in HELM format.
- mobius.utils.convert_HELM_to_FASTA(polymers, ignore_connections=False)
Converts one or more HELM sequences to FASTA format.
- Parameters:
- polymers : str, List of str, or numpy.ndarray of str
A polymer or list/array of polymers in HELM format.
- ignore_connections : bool, default: False
Whether to ignore connections in polymers.
- Returns:
- List of str
A list of sequences in FASTA format.
- Raises:
- ValueError
If a polymer contains connections or more than one simple polymer.
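For simple linear peptides made of single-letter monomers, the two conversions can be sketched as below. The sketch assumes every FASTA sequence maps to a single `PEPTIDE1` chain, and it leaves out the `ignore_connections` option. The helper names are hypothetical and this is not the library's implementation.

```python
def fasta_to_helm(sequences):
    """Wrap each FASTA sequence in a single-chain HELM string (sketch)."""
    if isinstance(sequences, str):
        sequences = [sequences]
    return ["PEPTIDE1{" + ".".join(sequence) + "}$$$$V2.0" for sequence in sequences]


def helm_to_fasta(polymers):
    """Convert single-chain, connection-free HELM strings to FASTA (sketch)."""
    if isinstance(polymers, str):
        polymers = [polymers]
    sequences = []
    for polymer in polymers:
        # The first two '$'-separated sections hold the polymers and the connections.
        simple_polymers, connections = polymer.split("$")[:2]
        if "|" in simple_polymers or connections:
            raise ValueError(f"cannot convert to FASTA: {polymer}")
        start = simple_polymers.index("{") + 1
        end = simple_polymers.rindex("}")
        sequences.append("".join(simple_polymers[start:end].split(".")))
    return sequences


print(fasta_to_helm("AAR"))
print(helm_to_fasta(["PEPTIDE1{A.A.R}$$$$V2.0"]))
```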
- mobius.utils.build_helm_string(complex_polymer, connections=None)
Build a HELM string from a dictionary of polymers and a list of connections.
- Parameters:
- complex_polymer : dict
A dictionary of simple polymers, where keys are the simple polymer types and values are lists of monomer symbols.
- connections : List, default: None
A list of connections, where each connection is represented as a tuple with six elements: (start_polymer, start_monomer, start_attachment, end_polymer, end_monomer, end_attachment).
- Returns:
- str
The generated polymer in HELM format.
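The assembly can be sketched as below, using the six-element connection tuples described above. The sketch assumes the dictionary keys are polymer IDs such as `PEPTIDE1`, that connections are serialized as `source,target,position:attachment-position:attachment`, and that the remaining sections are left empty. The helper name is hypothetical.

```python
def build_helm(complex_polymer, connections=None):
    """Assemble a HELM string from simple polymers and connections (sketch)."""
    simple_polymers = "|".join(
        polymer_id + "{" + ".".join(monomers) + "}"
        for polymer_id, monomers in complex_polymer.items()
    )
    connection_strings = []
    for (start_polymer, start_monomer, start_attachment,
         end_polymer, end_monomer, end_attachment) in (connections or []):
        connection_strings.append(
            f"{start_polymer},{end_polymer},"
            f"{start_monomer}:{start_attachment}-{end_monomer}:{end_attachment}"
        )
    # Sections: polymers $ connections $ (empty) $ (empty) $ version.
    return simple_polymers + "$" + "|".join(connection_strings) + "$$$V2.0"


polymer = build_helm(
    {"PEPTIDE1": ["A", "C", "A", "A", "A"], "PEPTIDE2": ["A", "A", "A", "A"]},
    connections=[("PEPTIDE1", 1, "R3", "PEPTIDE2", 1, "R3")],
)
print(polymer)
```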
- mobius.utils.parse_helm(polymer)
Parses a HELM string and returns the relevant information.
- Parameters:
- polymer : str
A polymer in HELM format.
- Returns:
- complex_polymer : dict
A dictionary containing the simple polymer IDs (pid) as keys and simple polymers as values.
- connections : numpy.ndarray
An array with dtype [(‘SourcePolymerID’, ‘U20’), (‘TargetPolymerID’, ‘U20’), (‘SourceMonomerPosition’, ‘i4’), (‘SourceAttachment’, ‘U2’), (‘TargetMonomerPosition’, ‘i4’), (‘TargetAttachment’, ‘U2’)]. Each row represents a connection between two monomers in the complex polymer.
- hydrogen_bonds : str
A string containing information about any hydrogen bonds in the complex polymer.
- attributes : str
A string containing any additional attributes related to the complex polymer.
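A HELM string can be taken apart by splitting on `$`, as sketched below. The sketch assumes exactly five `$`-separated sections, stores each simple polymer as a list of monomers, and returns connections as a plain list of tuples in the field order of the dtype above, not as a structured ndarray. The helper name is hypothetical.

```python
def parse_helm_sketch(polymer):
    """Split a HELM string into polymers, connections and extra sections (sketch)."""
    fields = polymer.split("$")
    if len(fields) != 5:
        raise ValueError(f"expected 5 '$'-separated sections, got {len(fields)}")
    simple_polymers, connections, hydrogen_bonds, attributes, _version = fields

    complex_polymer = {}
    for simple_polymer in simple_polymers.split("|"):
        polymer_id, _, monomers = simple_polymer.partition("{")
        complex_polymer[polymer_id] = monomers.rstrip("}").split(".")

    parsed_connections = []
    for connection in filter(None, connections.split("|")):
        source_id, target_id, link = connection.split(",")
        source, target = link.split("-")
        source_position, source_attachment = source.split(":")
        target_position, target_attachment = target.split(":")
        parsed_connections.append(
            (source_id, target_id, int(source_position), source_attachment,
             int(target_position), target_attachment)
        )
    return complex_polymer, parsed_connections, hydrogen_bonds, attributes


parsed = parse_helm_sketch(
    "PEPTIDE1{A.C.A.A.A}|PEPTIDE2{A.A.A.A}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0"
)
print(parsed)
```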
- mobius.utils.get_scaffold_from_helm_string(polymer)
Get the scaffold of the input polymer in HELM format.
- Parameters:
- polymer : str
A polymer in HELM format.
- Returns:
- str
The scaffold version of the input polymer in HELM format.
Examples
polymer : PEPTIDE1{A.C.A.A.A}|PEPTIDE2{A.A.A.A}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0
scaffold : PEPTIDE1{X.C.X.X.X}|PEPTIDE2{X.A.X.X}$PEPTIDE1,PEPTIDE2,1:R3-1:R3$$$V2.0
- mobius.utils.generate_design_protocol_from_polymers(polymers)
Generate the bare minimum design protocol yaml config from a list of polymers in HELM format.
- Parameters:
- polymers : List of str
List of polymers in HELM format.
- Returns:
- dict
The design protocol yaml config.
- mobius.utils.write_design_protocol_from_polymers(polymers, filename='design.yaml')
Write the bare minimum design protocol yaml file from a list of polymers in HELM format.
- Parameters:
- polymers : List of str
List of polymers in HELM format.
- filename : str, default: ‘design.yaml’
Name of the design protocol yaml file to write.
- mobius.utils.MolFromHELM(polymers, HELM_extra_library_filename=None)
Generate a list of RDKit molecules from HELM strings.
- Parameters:
- polymers : str or List or tuple or numpy.ndarray
The polymer(s) in HELM format to convert to RDKit molecules.
- HELM_extra_library_filename : str, default: None
The path to a HELM Library file containing extra monomers. Extra monomers will be added to the internal monomers library. Internal monomers can be overridden by providing a monomer with the same MonomerID.
- Returns:
- List
A list of RDKit molecules.
- mobius.utils.read_pssm_file(pssm_file)
Reads a PSSM (position-specific scoring matrix) file and returns a pandas DataFrame containing the data and the intercept value.
- Parameters:
- pssm_file : str
The path to the PSSM file to be read.
- Returns:
- pssm : pandas.DataFrame
A DataFrame containing the data from the PSSM file.
- intercept : float
The intercept value from the PSSM file.
- mobius.utils.global_min_pssm_score(pssm_pd, intercept)
Reads a PSSM data frame and returns the residue sequence with the global minimum PSSM score, together with that score.
- Parameters:
- pssm_pd : pandas.DataFrame
The data frame of individual PSSM scores for an allele.
- intercept : float
The intercept value from the PSSM file.
- Returns:
- global min peptide : str
Residue sequence corresponding to the lowest possible PSSM score for that matrix.
- min_score : float
The global minimum PSSM score for that matrix.
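If the total score is the intercept plus one independent contribution per position (an assumption, since the scoring formula is not given here), the global minimum is found by picking the lowest-scoring residue at each position. The sketch below uses a plain list of per-position dictionaries in place of a pandas DataFrame. The helper name and the toy matrix are hypothetical.

```python
def global_min_score(pssm, intercept):
    """Return the lowest-scoring peptide and its score for an additive PSSM.

    pssm is a list with one dict per position, mapping residue -> score.
    """
    peptide = []
    min_score = intercept
    for position_scores in pssm:
        # Positions are independent, so each one can be minimized separately.
        residue = min(position_scores, key=position_scores.get)
        peptide.append(residue)
        min_score += position_scores[residue]
    return "".join(peptide), min_score


toy_pssm = [
    {"A": 0.5, "C": -1.0, "D": 0.25},
    {"A": -0.5, "C": 0.75, "D": 0.0},
]
print(global_min_score(toy_pssm, intercept=2.0))  # ('CA', 0.5)
```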
- mobius.utils.optimisation_tracker(n, df_old, polymers, scores)
Record optimisation progression of suggested polymers.
- Parameters:
- n : int
Optimisation number. If 0, creates the dataframe for progression tracking.
- df_old : pandas.DataFrame
Dataframe from the previous optimisation round, to be updated.
- polymers : list
List of newly suggested polymers to add to the progression tracking.
- scores : ndarray
Scores of the polymers for each objective.
- Returns:
- df : pandas.DataFrame
Updated dataframe tracking optimisation progression.