The Modular Genome

Written 2016-10-10

The following was a page I published in 2016 on my "Mesoplasma florum wiki". It outlined my idea of how to build a modular genome. It's not perfect, but I was in high school at the time, so give me a break.

Motivation

Over the last few years, falling gene synthesis costs have enabled companies and universities to synthesize whole genomes. As costs continue to fall, whole-genome synthesis will soon be achievable by motivated individuals. However, without a completely understood base genome, progress will be slow for labs and nearly unachievable for individuals. Some projects have attempted to circumvent this problem by ‘franchising’ their genome synthesis and by distributing testing. Given the history of the free software movement, and of Linux kernel development in particular, building a minimal genome through wide-scale distributed development may be a better way to create a genome that benefits the wider scientific and global community.

Design and synthesis of a minimal bacterial genome (2016)

Total Synthesis of a Functional Designer Eukaryotic Chromosome (2014)

Design, synthesis, and testing toward a 57-codon genome (2016)

The Cathedral and the Bazaar (200, on Linux development)

Modularity Definition

A figure of the levels of abstraction in a modular genome.

A modular genome is a genome that allows for reorganization and differential expression of modules in a parallelizable way. Key points:

Allows for reorganization of modules, where 'modules' are defined as genes or operons
Can be constructed without large amounts of DNA synthesis or PCR

Modularity Advantages

Quicker testing

Modularity enables quicker testing by allowing routine cloning reactions to run in parallel. For example, Golden Gate assembly would allow genome construction without PCR by taking modules directly from plasmids. Assembling directly from plasmid-borne parts avoids the complexities of PCR optimization and of efficient multipart assembly (as with Gibson assembly), and makes large-scale redistribution of tested modules simpler. Once a genome is deconstructed into several tens of 'independent' modules, each module can be tested quickly and independently, then redistributed after the desired modifications are made.
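To make the Golden Gate idea concrete, here is a minimal sketch of two checks a module repository might run before accepting a part. It assumes BsaI as the Type IIS enzyme (recognition site GGTCTC, cutting outside the site to leave 4-nt overhangs). The module names, overhang sequences, and function names are hypothetical illustrations, not part of any existing toolkit. The first check flags internal recognition sites, which would cut a module apart during assembly. The second chains modules by matching each right overhang to the next left overhang.

```python
BSAI_SITE = "GGTCTC"  # BsaI recognition sequence (assumed enzyme choice)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def internal_sites(seq, site=BSAI_SITE):
    """Return sorted 0-based positions where the site occurs on either strand."""
    seq = seq.upper()
    hits = []
    # Search the forward site and its reverse complement (the bottom strand).
    for pattern in {site, reverse_complement(site)}:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def assembly_order(modules, start_overhang):
    """Chain modules by overhang: each right overhang selects the next module.

    modules maps a module name to (left_overhang, right_overhang).
    """
    by_left = {}
    for name, (left, _right) in modules.items():
        if left in by_left:
            raise ValueError(f"overhang {left} is used by two modules")
        by_left[left] = name

    order = []
    current = start_overhang
    # Stop when no module accepts the current overhang, or when the chain loops.
    while current in by_left and by_left[current] not in order:
        name = by_left[current]
        order.append(name)
        current = modules[name][1]
    return order


# Hypothetical modules with made-up overhangs, for illustration only.
example_modules = {
    "ribosome": ("AATG", "GCTT"),
    "trna": ("GCTT", "CGCT"),
    "glycolysis": ("CGCT", "TACA"),
}
example_order = assembly_order(example_modules, "AATG")
```

Because every module carries its own overhangs, swapping or reordering modules only means changing which plasmids go into the reaction, with no new primers or PCR needed.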

Simple distribution / contribution

So far, no genome engineering lab has set up a method to widely distribute its resulting genome. This gap in distribution, and consequently in outside contribution, has slowed and will continue to slow development of their work, as the open source software world has shown. Phagemid-based distribution methods promise to give a single repository the means to distribute a massive number of modules cheaply. Through sheer ease of adoption, similar to distributing binaries over the internet, the activation energy to begin working with a modular genome will be lowered, resulting in a cascade of more labs using it. The prime objective of simple, mass distribution is summed up as “Given enough eyeballs, all bugs are shallow”.

Large rearrangements

Ultimately, scientists who wish to simplify and modify small genomes want to make large-scale changes to them. To facilitate large-scale rearrangements, such as replacing a biosynthetic pathway or heavily modifying the translation system, the DNA fragments must either be modular or be recloned or resynthesized. The former is preferable because it saves money and time, ultimately allowing more experiments.

A Modular Organism

For this modular genome, I propose Mesoplasma florum, a small organism with a roughly 800 kb genome that is currently used by only a few labs. Unlike other proposed organisms, Mesoplasma florum is BSL1, fast growing, and not the intellectual property of a few large labs, making it the perfect organism for genome modularization.

The original Mesoplasma florum paper (1984)