prosstt.tree module

This module contains the definition of the Tree class. The Tree class describes a lineage tree. Each object contains information about the topology of the lineage tree and the gene expression for each gene at each point of the tree.

class prosstt.tree.Tree(topology=[['A', 'B'], ['A', 'C']], time={'A': 40, 'B': 40, 'C': 40}, num_branches=3, branch_points=1, modules=None, G=500, density=None, root=None)

Bases: object

Formalization of a lineage tree.

topology

list of lists – Each nested list contains a connection from one branch to another

time

dict – The length of each branch in pseudotime units

num_branches

int – Total number of branches

branch_points

int – Total number of branch points

modules

int – Total number of expression programs for the lineage tree

G

int – Total number of genes

means

Series – Average gene expression per gene per branch

branches

list – List of the branch names

root

str – Name of the branch that contains the tree root

density

Series – Density of cells at each part of the lineage tree

add_genes(*args)

Sets the average gene expression trajectories of genes for all branches after performing a sanity check. Calls either _add_genes_from_relative or _add_genes_from_average.

as_dictionary()

Converts the tree topology to a dictionary where the ID of every branch points to the branches that bifurcate from it.

Returns: The topology of the tree in dictionary form.
Return type: dict

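
As a rough illustration of the conversion described above, the following standalone sketch builds a parent-to-children dictionary from a list of [parent, child] connections. The function name topology_to_dict is hypothetical, and whether the real method also lists leaf branches as keys is not stated here, so the sketch leaves them out.

```python
from collections import defaultdict


def topology_to_dict(topology):
    """Map each parent branch to the branches that bifurcate from it.

    Hypothetical helper, not part of prosstt.
    """
    tree = defaultdict(list)
    for parent, child in topology:
        tree[parent].append(child)
    return dict(tree)


# The default topology from the class signature
result = topology_to_dict([["A", "B"], ["A", "C"]])
print(result)
```
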
branch_times()

Calculates the pseudotimes at which branches start and end.

Returns: branch_time – Dictionary that contains the start and end time for every branch.
Return type: dict

Examples

>>> from prosstt.tree import Tree
>>> t = Tree.from_topology([[0,1], [0,2]])
>>> t.branch_times()
defaultdict(<class 'list'>, {0: [0, 39], 1: [40, 79], 2: [40, 79]})
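
The output above can be reproduced by hand. The sketch below is a standalone approximation, not the library code: it assumes that every child branch starts one pseudotime step after its parent ends and that the reported end points are inclusive, which is what the example output suggests. The name branch_times_sketch and the explicit root argument are hypothetical.

```python
def branch_times_sketch(topology, time, root):
    """Return {branch: [start, end]} with inclusive pseudotime indices."""
    children = {}
    for parent, child in topology:
        children.setdefault(parent, []).append(child)

    times = {root: [0, time[root] - 1]}
    to_visit = [root]
    while to_visit:
        branch = to_visit.pop()
        for child in children.get(branch, []):
            # a child begins right after its parent has ended
            start = times[branch][1] + 1
            times[child] = [start, start + time[child] - 1]
            to_visit.append(child)
    return times


# Three branches of the default length (40 pseudotime units each)
sketch_times = branch_times_sketch([[0, 1], [0, 2]], {0: 40, 1: 40, 2: 40}, root=0)
print(sketch_times)
```
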

def_genes = 500

def_time = 40

default_density()

Initializes the density with a uniform distribution (every cell has the same probability of being picked). This is provided in case users want to use the density sampling function.
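
A uniform density of this kind could look like the sketch below. It assumes the density is stored per branch with one value per pseudotime point (consistent with the length requirement in set_density) and that it is normalised over the whole tree; both the normalisation and the name uniform_density are assumptions made for illustration.

```python
def uniform_density(time):
    """Give every pseudotime point on every branch the same probability."""
    total = sum(time.values())
    return {branch: [1.0 / total] * length for branch, length in time.items()}


density = uniform_density({"A": 40, "B": 40, "C": 40})
print(len(density["A"]), density["A"][0])
```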

default_gene_expression()

Wrapper that simulates average gene expression values along the lineage tree by calling appropriate functions with default parameters.

classmethod from_newick(newick_tree, modules=None, genes=500, density=None)

Generate a lineage tree from a Newick-formatted string.

classmethod from_random_topology(branch_points, time, modules, genes)

Generate a random binary tree topology given a number of branch points.

static gen_random_topology(branch_points)

Generates a random topology for a lineage tree. At every branch point a bifurcation is taking place.

Parameters: branch_points (int) – The number of branch points in the topology

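
One simple way to obtain a random bifurcating topology is to start from a single branch and repeatedly split a randomly chosen leaf in two. The sketch below illustrates that idea only; it is not claimed to be the procedure prosstt uses, and the name random_binary_topology and the seed parameter are hypothetical.

```python
import random


def random_binary_topology(branch_points, seed=None):
    """Build a list of [parent, child] pairs with one bifurcation per branch point."""
    rng = random.Random(seed)
    topology = []
    leaves = [0]   # branch 0 holds the root
    next_id = 1
    for _point in range(branch_points):
        # pick a random leaf and let it bifurcate into two new branches
        parent = leaves.pop(rng.randrange(len(leaves)))
        for _side in range(2):
            topology.append([parent, next_id])
            leaves.append(next_id)
            next_id += 1
    return topology


topo = random_binary_topology(3, seed=0)
print(topo)
```

With b branch points this always yields 2b + 1 branches, which matches the default of 1 branch point and 3 branches in the class signature.
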
get_max_time()

Calculate the maximum pseudotime duration possible for the tree.

Returns: max_time – The maximum pseudotime duration of the tree.
Return type: int

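
The maximum pseudotime duration can be read as the length of the longest path from the root to any leaf. The sketch below computes it under that assumption; max_time_sketch is a hypothetical standalone function, not the library method.

```python
def max_time_sketch(topology, time, root):
    """Length of the longest root-to-leaf path, in pseudotime units."""
    children = {}
    for parent, child in topology:
        children.setdefault(parent, []).append(child)

    def longest_from(branch):
        # own length plus the longest continuation (0 for a leaf)
        tails = [longest_from(child) for child in children.get(branch, [])]
        return time[branch] + max(tails, default=0)

    return longest_from(root)


longest = max_time_sketch([["A", "B"], ["A", "C"]], {"A": 40, "B": 40, "C": 60}, root="A")
print(longest)
```
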
get_parallel_branches()

Find the branches that run in parallel (i.e. share a parent branch).
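
Since parallel branches are defined here as branches that share a parent, they can be found by grouping children by parent. The sketch below does this; the function name and the returned dictionary layout are assumptions, as the return format of the actual method is not documented here.

```python
def parallel_branches_sketch(topology):
    """Group branches that share a parent branch."""
    siblings = {}
    for parent, child in topology:
        siblings.setdefault(parent, []).append(child)
    # only parents with more than one child have branches running in parallel
    return {parent: kids for parent, kids in siblings.items() if len(kids) > 1}


parallel = parallel_branches_sketch([[0, 1], [0, 2], [1, 3], [1, 4]])
print(parallel)
```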

morph_stack(stack)

Computes the pseudotime start and end of every branch in a path. Very similar to branch_times().

Parameters: stack (int array) – The pseudotime length of all branches that make up a path in the tree (from the origin to a leaf).
Returns: stack – The pseudotime start and end of every branch in the path.
Return type: list of 2D arrays

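
Turning a list of branch lengths into start and end points is essentially a cumulative sum. The sketch below assumes inclusive end points, mirroring the output shown for branch_times(); the name morph_stack_sketch and the plain-list return type are illustrative choices rather than the library's.

```python
from itertools import accumulate


def morph_stack_sketch(stack):
    """Convert branch lengths along one path into [start, end] pairs."""
    ends = list(accumulate(stack))   # exclusive end of every branch
    starts = [0] + ends[:-1]         # each branch starts where the previous one stopped
    return [[start, end - 1] for start, end in zip(starts, ends)]


intervals = morph_stack_sketch([40, 40, 20])
print(intervals)
```
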
paths(start)

Finds all paths from a given start point to the leaves.

Parameters: start (str) – The starting point.
Returns: rooted_paths – An array that contains all paths from the starting point to all tree leaves.
Return type: int array

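
Enumerating every path from a starting branch to the leaves is a depth-first traversal. The following standalone sketch shows one way to do it on a list of [parent, child] connections; paths_sketch is a hypothetical name and it returns nested lists rather than an array.

```python
def paths_sketch(topology, start):
    """List every path from `start` down to a leaf branch."""
    children = {}
    for parent, child in topology:
        children.setdefault(parent, []).append(child)

    def walk(branch):
        kids = children.get(branch, [])
        if not kids:
            return [[branch]]   # a leaf ends the path
        return [[branch] + tail for kid in kids for tail in walk(kid)]

    return walk(start)


all_paths = paths_sketch([[0, 1], [0, 2], [1, 3], [1, 4]], start=0)
print(all_paths)
```
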
populate_timezone()

Returns an array that assigns pseudotime to time zones.

This function first determines the timezones by considering the length of the branches and then assigns a timezone to each pseudotime range. E.g. for Ts = [25, 25, 30] we would have timezone[0:24] = 0, timezone[25:49] = 1, timezone[50:54] = 2.

Returns:
  • timezone (int array) – Array of length total_time, contains the timezone information for each pseudotime point.
  • updated_Ts (int array) – Converts from relative time to absolute time: given Ts=[25,25,25,25,25], branch 0 starts at pseudotime 0, branches 1 and 2 start at pseudotime 25, and branches 3 and 4 start at pseudotime 50.

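
The Ts = [25, 25, 30] example can be reproduced with the sketch below. It assumes a single bifurcation (branch 0 splitting into branches 1 and 2) and treats every pseudotime at which some branch starts or ends as a time zone boundary. This is an interpretation of the description, not the library code; timezone_sketch is a hypothetical name, and its second return value holds absolute (start, end) spans rather than the updated_Ts array.

```python
def timezone_sketch(topology, time, root):
    """Assign a time zone index to every pseudotime point of the tree."""
    children = {}
    for parent, child in topology:
        children.setdefault(parent, []).append(child)

    # absolute [start, end) span of every branch
    spans = {root: (0, time[root])}
    to_visit = [root]
    while to_visit:
        branch = to_visit.pop()
        for child in children.get(branch, []):
            start = spans[branch][1]
            spans[child] = (start, start + time[child])
            to_visit.append(child)

    # every branch start or end opens a new time zone
    cuts = sorted({t for span in spans.values() for t in span})
    timezone = [0] * cuts[-1]
    for zone, (lo, hi) in enumerate(zip(cuts, cuts[1:])):
        timezone[lo:hi] = [zone] * (hi - lo)
    return timezone, spans


tz, spans = timezone_sketch([[0, 1], [0, 2]], {0: 25, 1: 25, 2: 30}, root=0)
print(len(tz), tz[24], tz[25], tz[49], tz[50], tz[54])
```
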
set_density(density)

Sets the density as a function of the pseudotime and the branching. If N points from the tree were picked randomly, then the density is the probability of a pseudotime point in a certain branch being picked.

Parameters: density (dict) – The density of each branch. For each branch b, len(density[b]) must equal tree.time[b].
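
The length requirement above can be checked before a density is passed in. The sketch below is a hypothetical standalone validator (check_density is not part of prosstt) that takes the density and the branch lengths as plain dictionaries.

```python
def check_density(density, time):
    """Verify that density holds one value per pseudotime point of every branch."""
    for branch, length in time.items():
        if branch not in density:
            raise ValueError(f"no density given for branch {branch!r}")
        if len(density[branch]) != length:
            raise ValueError(
                f"branch {branch!r}: expected {length} values, "
                f"got {len(density[branch])}"
            )
    return True


time = {"A": 2, "B": 3}
ok = check_density({"A": [0.2, 0.2], "B": [0.2, 0.2, 0.2]}, time)
print(ok)
```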