Using permutations in a biological example

Rea Kalampaliki
Published in Geek Culture
3 min read · May 30, 2021

What I learned on my 3rd day of studying Statistics

Photo by Bud Helisson on Unsplash

Introduction

Peptides and proteins are macromolecules, meaning that they are built from smaller molecules: amino acids bonded together in chains. In short, peptides and proteins can be described as sequences of amino acids.

Picture by drsusanmarra.com

During my previous experience in the classroom and in the lab, I used to examine peptides (or proteins) in terms of their biological role, their 3D structure, their interactions with other molecules and their biochemical profile in general.

My brief exposure to Combinatorics gave me an additional perspective from which to think about proteins and peptides.

Case study

Immunoglobulin heavy diversity 1–1 (also known as IGHD1–1) is a peptide that researchers have found to be part of the antibody heavy chain, which participates in antigen recognition and the building of immunity.

IGHD1–1 has the following amino acid sequence: GTTGT

The amino acid sequence of IGHD1–1 is a group of 5 ordered elements!

Note that, in this 5-element group, the amino acid Glycine (G) appears 2 times and the amino acid Threonine (T) appears 3 times.

Altogether, the amino acid sequence of the peptide IGHD1–1 is a group of 5 ordered elements (n=5) from the set X = {G, T}, where the element G is used twice (n1=2) and the element T is used three times (n2=3).

According to the rules of Combinatorics, the number of permutations of 5 ordered elements (n=5) from the set X = {G, T}, where the element G is used twice (n1=2) and the element T is used three times (n2=3), is equal to:

$$\frac{n!}{n_1!\,n_2!} = \frac{5!}{2!\,3!} = \frac{120}{2 \cdot 6} = 10$$
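The same count can be checked in a few lines of Python. This is only a minimal sketch: the function name `count_arrangements` is an illustrative choice, and it assumes the peptide is given as a plain string of one-letter amino acid codes.

```python
from collections import Counter
from math import factorial


def count_arrangements(sequence):
    """Count the distinct orderings of the letters in `sequence`.

    Uses n! / (n1! * n2! * ...), where n is the sequence length and
    n1, n2, ... are the number of times each letter appears.
    """
    total = factorial(len(sequence))
    for repeats in Counter(sequence).values():
        # Dividing step by step is exact, because the running value
        # is always a whole number of arrangements.
        total //= factorial(repeats)
    return total


# IGHD1-1: n = 5, with G used twice and T used three times.
print(count_arrangements("GTTGT"))  # 5! / (2! * 3!) = 10
```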

The number of permutations is small enough for us to try visualising them using a tree diagram.
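Because the count is so small, the arrangements can also be listed directly. The sketch below is one possible way to do that in Python; `distinct_arrangements` is a hypothetical helper name, and the approach (generate every ordering, then drop duplicates) is only practical for short sequences like this one.

```python
from itertools import permutations


def distinct_arrangements(sequence):
    """Return every distinct ordering of the letters in `sequence`, sorted."""
    # permutations() treats the two G's (and the three T's) as different
    # items, so the same string is produced several times. Collecting the
    # results in a set keeps each arrangement only once.
    return sorted({"".join(p) for p in permutations(sequence)})


arrangements = distinct_arrangements("GTTGT")
print(len(arrangements))  # 10
for s in arrangements:
    print(s)
```

The printed list contains GTTGT itself along with the 9 other orderings of two G's and three T's.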

IGHD1–1's amino acid sequence (GTTGT) is 1 of the 10 permutations of the elements {G, T}, using G twice and T three times. Are the remaining 9 amino acid arrangements (GGTTT, GTTTG, TGTGT, …) completely unused by nature?

With some further research in the UniProt protein database, I found that none of the remaining 9 amino acid arrangements is registered in the database. That means that either they haven’t been discovered yet or they don’t exist at all!

Concept: Nature is making choices out of many, many options

When we look at a peptide from a Combinatorics perspective, nature might appear to be picky. In the case of IGHD1–1, we could say that nature might have picked 1 out of 10 possible amino acid arrangements to form a functional peptide. In the case of longer peptides and proteins, nature could appear even pickier, since the number of possible arrangements grows with length.

We just have to consider that, for a peptide 10 amino acids long, the numerator of the fraction we used to compute the number of permutations is 10!, which is equal to 3,628,800.
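To get a feel for how quickly that numerator grows, here is a small Python sketch. The lengths 5, 10, 15 and 20 are arbitrary illustrative choices, and the values are only the numerator n!; the actual number of arrangements for a given peptide is smaller whenever amino acids repeat, because the numerator is divided by the factorials of the repeat counts.

```python
from math import factorial

# Numerator n! of the permutation formula for a few peptide lengths.
numerators = {length: factorial(length) for length in (5, 10, 15, 20)}

for length, value in numerators.items():
    # The thousands separator makes the growth easier to read.
    print(f"{length} amino acids: {value:,}")
```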

About my learning journey

My goal is to learn a bit of Statistics every day for the next 21 days. I am going to study the basics to solidify my knowledge of Statistics and build a strong background for more advanced Data Science concepts.

This challenge is part of a bigger one, the #66daysofdata challenge! To learn more about the #66daysofdata challenge click here and here.

Resource

Introduction to Probability and Statistics, George Papadopoulos, Gutenberg (in Greek)

Thank you for reading

To see the code for my UniProt analysis click here

Rea Kalampaliki
Biotechnology Graduate | Data Science Enthusiast | Europe | Greece