Formal Molecular Biology: Rule-based modelling

Teacher: Jean Krivine (IRIF)

Distributed Systems:

y-axis: Evolved vs. Specified
x-axis: Natural vs. Synthetic

Sequential ≃ totally ordered
Distributed ≃ partially ordered

	Natural	Synthetic
Specified	Synthetic biology (e.g. bacteria engineering)	Multicore program
Evolved	Cells	Web

Computer Science (CS) is usually about Specified/Synthetic

Usually in CS: you want to prove that an implementation $P$ is correct with respect to its specification $S$ ⟹ Bisimulation: $P ≃ S$

But in evolved systems, we don’t have access to a specification

⟶ First question: is there something to understand (i.e. some structure/modules)?

Answer: we hope so, but it could very well be that a cell is a bunch of “spaghetti-like” unorganized data

Experiments done:

Uri Alon: take some boolean formula $ψ$, e.g. one that you know decomposes as $ψ = φ_0 ∧ φ_1$

Genetic programming ⟶ consider gates (OR, AND, NOT, …) and wire them together at random. Goal: you want to measure how close the input/output behaviour of the circuit is to the specification given by $ψ$. Fitness measure: $0$ = horribly bad, $1$ = perfect

In the first generation, you generate $n$ circuits $C_1, …, C_n$, then you grade them: the higher the grade, the more likely a circuit is to pass to the next generation (once a circuit reaches fitness $1$, it is sure to be passed to future generations).

Do that for a bunch of generations, until a circuit reaches a fitness close to $1$, then examine the structure of this circuit. There’s no reason whatsoever for a subformula of $ψ$ to appear in the circuit.

Now, suppose the goal oscillates between satisfying $ψ$ and satisfying $ψ' ≝ φ_0 ∨ φ_1$. When you switch the formula to satisfy back and forth, the maximum fitness is reached faster, and you eventually see $φ_0$ and $φ_1$ appear as subcircuits. Why? You’re teaching the circuit to be “plastic”, on top of your expected goal.
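The grading and selection steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: circuits are modelled as plain Python functions rather than gate networks, the subformulas `phi0` and `phi1` are hypothetical choices (XORs of two inputs), and mutation/recombination of circuits is left out.

```python
import itertools
import random

# Hypothetical subformulas (not from the lecture): two XORs over four inputs.
def phi0(a, b):
    return a != b

def phi1(c, d):
    return c != d

def psi(a, b, c, d):       # psi  = phi0 AND phi1
    return phi0(a, b) and phi1(c, d)

def psi_alt(a, b, c, d):   # psi' = phi0 OR phi1
    return phi0(a, b) or phi1(c, d)

# All 16 rows of the truth table.
INPUTS = list(itertools.product([False, True], repeat=4))

def fitness(circuit, spec):
    """Fraction of input rows on which the circuit agrees with the spec."""
    agree = sum(circuit(*row) == spec(*row) for row in INPUTS)
    return agree / len(INPUTS)

def select(population, spec, rng):
    """Fitness-proportional selection; fitness-1 circuits always survive."""
    scores = [fitness(c, spec) for c in population]
    elite = [c for c, s in zip(population, scores) if s == 1.0]
    n_rest = len(population) - len(elite)
    # A tiny offset keeps the weights valid even if every score is 0.
    weights = [s + 1e-9 for s in scores]
    rest = rng.choices(population, weights=weights, k=n_rest)
    return elite + rest

def goal_for(generation, period):
    """Alternate the target between psi and psi' every `period` generations."""
    return psi if (generation // period) % 2 == 0 else psi_alt
```

With these choices, a circuit that always outputs `False` already agrees with `psi` on 12 of the 16 rows, which shows that a high fitness says little about the internal structure of a circuit.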

Takeaway message: it’s a dangerous route to try to understand the cell without keeping evolution in mind.

Systems Biology vs Molecular Biology

Systems Biology:: the biology of distributed systems: biology of cellular functions (understanding functions of the cell)

In Specified/Synthetic: there’s also hardware.

Molecular Biology:: the “hardware” part of biology: biology of mechanisms/facts/interactions (understanding what happens inside the cell)

Biology reminders: cf pictures

Amino-acid residues make up domains (they’re the basic blocks of domains, and they can be modified by post-translational modification).

Naming conventions in molecular biology are reminiscent of early alchemy: they’re based on the assumed “function” of the protein, which is very bad from a systems biology point of view (function is derived afterwards).

Language matters

Biology papers are written in English + data & curves. But it’s a very “mechanistic” English

For instance: « EGF ligand binds to EGFR receptors which in turn are able to homodimerize »

$≃ 10^6$ biology papers/year
$3000$ papers/year about EGF only

cf. DARPA “Big Mechanism”

Biology lacks an executable language. Some workarounds:

ODE (Ordinary Differential Equations), brought about by physicists

Example:
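\[A+B ⟶ AB\\ AB ⟶ A+B\\ AB ⟶ AB^\ast\\ AB^\ast ⟶ A+B^\ast\]
With mass-action kinetics, writing $k_1, …, k_4$ for the rate constants of the four reactions (in order), the concentration $x_A$ of $A$ evolves as:
\[\frac{dx_A}{dt} = k_2\, x_{AB} + k_4\, x_{AB^\ast} - k_1\, x_A\, x_B\]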
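But: combinatorial explosion! The complex $AB$ is made of the components $A$ and $B$, yet it is treated as an extra variable of its own, so every new complex or modified form adds one more variable and one more equation.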
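To make the ODE reading concrete, here is a minimal sketch that turns the four reactions above into a mass-action system and integrates it with an explicit Euler scheme. The rate constants `k1` to `k4`, the initial concentrations and the step size are all illustrative assumptions, not values from the lecture.

```python
def derivatives(x, k):
    """Mass-action right-hand side for the four reactions.

    x = (A, B, AB, AB*, B*) concentrations; k = (k1, k2, k3, k4) are
    assumed rate constants, one per reaction, in order.
    """
    a, b, ab, ab_s, b_s = x
    k1, k2, k3, k4 = k
    v1 = k1 * a * b   # A + B -> AB
    v2 = k2 * ab      # AB -> A + B
    v3 = k3 * ab      # AB -> AB*
    v4 = k4 * ab_s    # AB* -> A + B*
    return (
        -v1 + v2 + v4,  # dA/dt
        -v1 + v2,       # dB/dt
        v1 - v2 - v3,   # dAB/dt
        v3 - v4,        # dAB*/dt
        v4,             # dB*/dt
    )

def euler(x0, k, dt, steps):
    """Explicit Euler integration (crude, but enough for a sketch)."""
    x = tuple(x0)
    for _ in range(steps):
        dx = derivatives(x, k)
        x = tuple(xi + dt * di for xi, di in zip(x, dx))
    return x

# Illustrative run: start with free A and B only.
final = euler((1.0, 2.0, 0.0, 0.0, 0.0), (1.0, 0.5, 0.3, 0.2), 0.01, 1000)
total_a = final[0] + final[2] + final[3]             # A is conserved
total_b = final[1] + final[2] + final[3] + final[4]  # so is B (in any form)
```

Two proteins already need five variables here, one per species, and the fact that $A$ is conserved across $A$, $AB$ and $AB^\ast$ has to be rediscovered from the equations instead of being visible in the model.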

Kappa

A graph rewriting formalism

Terminology:

Labelled graph
Simple graph: at most one edge between two nodes
Labelled site graph
Simple/conflict-free labelled site graph: at most one edge between two sites

Sites are thought of as resources

Nodes: correspond to proteins
Sites: correspond to interaction capacities
Edges: correspond to contacts (non-covalent bonds)

Sites can have labels to represent modifications
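The terminology above can be mirrored by a small data structure. This is a sketch, not Kappa's actual implementation: the class `SiteGraph`, its method names and the site names (`l`, `d`, `r`) are hypothetical, and the conflict check reads "sites as resources" as "each site takes part in at most one edge", which is an assumption about the intended definition.

```python
from dataclasses import dataclass, field

@dataclass
class SiteGraph:
    # node id -> protein name
    names: dict = field(default_factory=dict)
    # (node id, site name) -> label (a modification) or None
    labels: dict = field(default_factory=dict)
    # edges are unordered pairs of sites: frozenset({(node, site), (node, site)})
    edges: set = field(default_factory=set)

    def add_node(self, node, name, sites):
        """Add a protein with its interaction sites, all unlabelled."""
        self.names[node] = name
        for site in sites:
            self.labels[(node, site)] = None

    def bind(self, p, q):
        """Add an edge between two existing sites p and q."""
        if p not in self.labels or q not in self.labels:
            raise KeyError("unknown site")
        self.edges.add(frozenset((p, q)))

    def is_conflict_free(self):
        """True when every site is an endpoint of at most one edge."""
        seen = set()
        for edge in self.edges:
            for port in edge:
                if port in seen:
                    return False
                seen.add(port)
        return True

# EGF bound to one EGFR, itself dimerized with a second EGFR.
g = SiteGraph()
g.add_node(1, "EGFR", ["l", "d"])
g.add_node(2, "EGFR", ["l", "d"])
g.add_node(3, "EGF", ["r"])
g.bind((3, "r"), (1, "l"))
g.bind((1, "d"), (2, "d"))
```

Storing edges as a set of unordered site pairs makes "at most one edge between two sites" hold by construction; the resource reading then only needs the per-site check in `is_conflict_free`.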

Graph embedding $g \hookrightarrow h$:

injective on nodes
name preserving
edge preserving

A pattern $P$ has a match $f$ in $G$ if

\[f: P \hookrightarrow G\]

Names are equipped with a signature $Σ: 𝒩 ⟶ ℕ$ that gives, for each name, the number of sites a node with that name has. In particular: $Σ(⊥) = 1$

We write $[P]_G$ for the set of matches of $P$ in $G$:

\[[P]_G ≝ \lbrace f: P \hookrightarrow G \rbrace\]
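The definitions above suggest a direct, brute-force way to compute $[P]_G$: try every injective assignment of pattern nodes to graph nodes and keep those that preserve names and edges. The sketch below assumes a hypothetical encoding (a dict from node ids to names, plus a set of edges given as unordered pairs of `(node, site)` ports) and ignores site labels; a real Kappa simulator would not enumerate matches this way.

```python
from itertools import permutations

def edge_image(edge, f):
    """Image of a pattern edge under the node map f (site names unchanged)."""
    return frozenset((f[node], site) for node, site in edge)

def matches(p_names, p_edges, g_names, g_edges):
    """Return all embeddings of the pattern P into the graph G, as dicts."""
    p_nodes = list(p_names)
    found = []
    # permutations() yields tuples of distinct nodes: injectivity for free.
    for image in permutations(g_names, len(p_nodes)):
        f = dict(zip(p_nodes, image))
        if any(p_names[u] != g_names[f[u]] for u in p_nodes):
            continue  # not name preserving
        if all(edge_image(e, f) in g_edges for e in p_edges):
            found.append(f)  # edge preserving as well
    return found

# G: EGF (node 3) bound to EGFR 1, which is dimerized with EGFR 2.
g_names = {1: "EGFR", 2: "EGFR", 3: "EGF"}
g_edges = {
    frozenset({(3, "r"), (1, "l")}),
    frozenset({(1, "d"), (2, "d")}),
}

# Pattern 1: an EGFR dimer.
dimer_names = {"x": "EGFR", "y": "EGFR"}
dimer_edges = {frozenset({("x", "d"), ("y", "d")})}

# Pattern 2: EGF bound to an EGFR.
bound_names = {"e": "EGF", "x": "EGFR"}
bound_edges = {frozenset({("e", "r"), ("x", "l")})}

dimer_matches = matches(dimer_names, dimer_edges, g_names, g_edges)
bound_matches = matches(bound_names, bound_edges, g_names, g_edges)
```

On this example the dimer pattern has two matches, because the pattern is symmetric and can be mapped onto the two receptors in either order, while the ligand-bound pattern has exactly one.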
