When You Can’t Choose Between Biology and Mathematics, Then Choose Bioinformatics
22:19 - 27 April, 2023

It all started with biology lessons at school. Siras Hakobyan had an excellent biology teacher and liked that subject very much. At the same time, he was interested in mathematics and computer science.

When he was finishing school and needed to choose a profession, his biology teacher advised him to select bioinformatics, which combines biology, mathematics, and computer science. Following this advice, Siras enrolled in the bioinformatics department of the Faculty of Biology at Yerevan State University. After he completed his bachelor's degree, a fellow student told him about the Institute of Molecular Biology, which had a scientific Group of Bioinformatics.


Siras Hakobyan

Siras started as a volunteer in the Group of Bioinformatics and then became a full member of the group. He continues his scientific work in the field of genome bioinformatics to this day.

 

Data in Genome Bioinformatics

In classical biology, scientists conduct experiments and obtain data from them.

"Before the emergence of new technologies, biologists worked with data without programming, without mathematics: they noticed something with a microscope or by observing, then made some assumptions," says Siras Hakobyan.

But now the amount of data has grown so much that it can no longer be analyzed by observation alone, and scientists have to use computational methods and algorithms.

That is why a new field of science has emerged: bioinformatics, which includes biology, mathematics, and computer science. Genome bioinformatics is a field of bioinformatics that studies data related to the genome. Before we try to understand what kind of research scientists in this field are engaged in, let's first explain what the genome is and what kind of data can become an object of study.

Molecules called proteins are responsible for almost all processes in our organism. Let's imagine that the organism is a big company with a lot of tasks, where the relevant specialists are responsible for specific tasks. In this case, the "specialists" are the proteins. One group of proteins enables us to see, another enables us to speak, another handles metabolism in our organism, and so on. That is why different types of proteins are produced in our organism, each with its own responsibilities.

Bioinformaticians spend a lot of time in front of a computer

You have probably heard about DNA. DNA is another important molecule in our cells, without which proteins would not be produced. DNA is a long chain made up of separate segments called genes. Each of these genes is like a passport for a particular protein and contains information about the structure of that protein. In other words, for our organism to know how to create a particular protein, it must first “read” what is written in the corresponding gene about the structure of that protein. Together, genes and other segments of the DNA strand make up our genome, hence the name genome bioinformatics.

And how is a protein created based on the information in DNA? Since DNA is in the nucleus of the cell, and proteins are produced in another part of the cell, it is necessary to "copy" or "transcribe" the information about proteins in the genes and bring it to the appropriate part of the cell, where the proteins will be produced.

This is where molecules called messenger RNA or mRNA come to the rescue. Through these molecules, the information about the protein structure in the gene is "transcribed" and transferred to the appropriate part of the cell, where the synthesis of a particular protein begins.

It is already clear that proteins are responsible for almost everything in our organism. It is also important to note that changes in the amount or structure of these proteins can have consequences. For example, if a protein is produced in greater amounts than needed, it may cause disease. This is precisely why the study of the genome is important. But we are talking about an enormous amount of data. According to Siras Hakobyan, every person has about 20,000 protein-coding genes. That is why bioinformaticians turn to computer programs to process and analyze all these data and draw conclusions from them.

 

What Can We Find Out With Genome Bioinformatics?

Now let's understand how the study of genomic data can be helpful for scientists. For example, if scientists are trying to find the causes of some type of cancer, they will compare healthy cells and cancerous cells to understand what changes in the genome of the cancerous cells might have caused the disease.

According to Siras, when scientists want to compare healthy cells and cancerous cells, they extract mRNAs from both and then measure their amounts.

“If there are 100 mRNAs encoding a certain protein in a cell, it can be roughly assumed that about 100 of those proteins will be synthesized performing some function,” Siras says.

When scientists compare the estimated amounts of these proteins between cancerous cells and healthy cells, they can understand which proteins are likely to be increased or decreased in cancerous cells.
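The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the method used by Siras and his colleagues: the protein names and mRNA counts are invented, and the log2 ratio is simply one common way to express "increased" or "decreased" as a single number.

```python
import math

# Hypothetical mRNA counts per cell; names and numbers are invented for illustration.
healthy = {"protein_X": 100, "protein_Y": 50, "protein_Z": 40}
cancer = {"protein_X": 400, "protein_Y": 50, "protein_Z": 10}


def log2_ratio(healthy_count, cancer_count):
    """Log2 of cancer/healthy: positive means more mRNA in cancerous cells."""
    return math.log2(cancer_count / healthy_count)


def classify(healthy_counts, cancer_counts):
    """Label each protein as increased, decreased, or unchanged in cancerous cells."""
    labels = {}
    for name, healthy_count in healthy_counts.items():
        change = log2_ratio(healthy_count, cancer_counts[name])
        if change > 0:
            labels[name] = "increased"
        elif change < 0:
            labels[name] = "decreased"
        else:
            labels[name] = "unchanged"
    return labels


changes = classify(healthy, cancer)
print(changes)
```

In this toy data, protein_X has four times more mRNA in the cancerous cells, protein_Z has four times less, and protein_Y is unchanged. Real measurements are noisy, so an actual analysis would also need statistics to decide which differences are meaningful.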

“[The change in the number of proteins] could, for example, lead to rapid cell division, which causes cancer,” says Siras (rapid and abnormal cell division is one of the causes of cancer).

However, Siras approached this question from a slightly different angle during a study he conducted with supervisors from Armenia and Germany.

As stated earlier, the information about the structure of proteins is found in genes. Sometimes, based on the information in one gene, it is possible to synthesize proteins with different functions. The reason is that the gene itself is divided into sections. Some of these sections contain information about the protein, and the others do not.

When a certain protein-coding gene is transcribed, the parts that do not contain information about the protein are removed. As a result, only the segments containing information are joined together in mRNA. In some cases, certain sections containing information about a protein are also left out of mRNA. Consequently, different mRNA structures, known as isoforms, can be derived from a single gene. To get a better idea of the process, you can look at the picture.

The parts containing information about the protein are marked in red; the parts that do not are marked in blue
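For readers who prefer code to pictures, here is a minimal Python sketch of the same idea. Everything in it is an assumption made for illustration: the sequences are invented, and the function name `splice` is hypothetical. It only shows how removing the sections without protein information, and sometimes also skipping a section with protein information, yields different mRNAs from one gene.

```python
# A toy transcribed gene as a list of (sequence, carries_protein_information).
# All sequences are invented for illustration.
TRANSCRIBED_GENE = [
    ("AUGGCU", True),   # section with protein information (red in the picture)
    ("GUAAGU", False),  # section without protein information (blue), always removed
    ("GAUCCA", True),   # section with protein information
    ("GUUAGC", False),  # section without protein information, always removed
    ("UUGUAA", True),   # section with protein information
]


def splice(gene, skipped=frozenset()):
    """Join the informative sections, optionally leaving some of them out.

    `skipped` holds positions counted among the informative sections only
    (0 is the first informative section, 1 the second, and so on).
    """
    informative = [seq for seq, has_info in gene if has_info]
    return "".join(seq for i, seq in enumerate(informative) if i not in skipped)


full_isoform = splice(TRANSCRIBED_GENE)                # all informative sections kept
short_isoform = splice(TRANSCRIBED_GENE, skipped={1})  # second informative section left out

print(full_isoform)   # AUGGCUGAUCCAUUGUAA
print(short_isoform)  # AUGGCUUUGUAA
```

Both results come from the same gene, yet they are different mRNAs, which is exactly what the word isoform describes.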

Siras says that an average of 7 mRNAs can be synthesized from one gene. Such structural changes can also affect the properties and functions of proteins because proteins are synthesized on the basis of mRNAs. That is why Siras and his colleagues focused on the study of isoforms.

Let's imagine a situation where a gene is expressed in the same amount in cancerous and healthy cells.

"But if we look more closely, we can see that in some cases, different isoforms of the same gene are produced, which can code for proteins with potentially different functions," Siras says.

In this study, Siras and his colleagues compared the isoforms of healthy cells and cancerous cells. They discovered cases where a certain gene had the same amount of mRNA in healthy and cancerous cells, but the isoforms of that gene were different.
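A small Python sketch shows how such a case can hide in the data. The counts and isoform names below are invented, and the code is an illustration of the idea rather than the analysis the researchers actually ran.

```python
# Hypothetical mRNA counts for one gene, split by isoform (invented numbers).
healthy = {"isoform_A": 80, "isoform_B": 20}
cancer = {"isoform_A": 30, "isoform_B": 70}


def gene_total(isoform_counts):
    """Total mRNA of the gene, summed over all of its isoforms."""
    return sum(isoform_counts.values())


def isoform_shares(isoform_counts):
    """Share of the gene's mRNA that each isoform accounts for."""
    total = gene_total(isoform_counts)
    return {name: count / total for name, count in isoform_counts.items()}


same_gene_level = gene_total(healthy) == gene_total(cancer)
healthy_shares = isoform_shares(healthy)
cancer_shares = isoform_shares(cancer)

print(same_gene_level)  # True: the gene looks unchanged when isoforms are ignored
print(healthy_shares)
print(cancer_shares)
```

At the gene level the two cell types look identical, with 100 mRNAs each. Only when the isoforms are counted separately does the difference appear: isoform_A dominates in the healthy cells, isoform_B in the cancerous ones.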

According to Siras, by analyzing such data, genome bioinformatics makes the work of biologists easier: they can then focus on specific genes and proteins, carry out experiments, and find cause-and-effect relationships.

 

Not Only Quantitative Data is Important, But Also Relationships

This spring, Siras Hakobyan participated in an international conference held at Cold Spring Harbor Laboratory. The institution's research focuses on cancer, genomics, and related fields. During the conference, Siras presented his research, in which he studied genomic data using an interesting method.

Do you remember the proteins? Let's get to know them a little better. These fascinating molecules do not work individually but in cooperation, forming functional groups. The result looks like a big network in which proteins are divided according to their roles. Some proteins are more important, and the work of other proteins depends on them; others are less important.

That is why Siras focused not only on quantitative data in his research but also tried to understand where the proteins sit in their network and how large amounts of them would affect the overall work of the cooperating proteins.

A distinctive feature of his research is that he performed single-cell analysis. Usually, when making comparisons, scientists take healthy tissues and cancer-affected tissues and extract genomic data averaged over their cells. The reason is that extracting DNA or RNA from single cells is a complex process. In his research, Siras studied single-cell genomic data and calculated the activity of protein-protein interaction networks in single cells of different tissues.

According to Siras, these and other data obtained by bioinformatics methods usually can be verified both by biological experiments and by the results obtained by other researchers.

 

National Dances Parallel to Scientific Activities

In addition to his scientific work, Siras has been interested in national dances for more than ten years. He is a member of the Armenian "Lernapar" dance group and participates in concerts. Siras also has a passion for folk songs, which he often pursues by searching archives and researching their history.


"Lernapar" dance group (from Siras Hakobyan's archive)

Of course, scientific work requires a lot of time, but Siras always finds time for this hobby.

 

Anna Sahakyan
Siras Hakobyan contributed to the preparation of the article

