Huang Yanyi, Professor at Peking University: How do laboratories carry out testing for the new coronavirus?

Huang Yanyi: Hello, everyone. I'm Huang Yanyi. I work at Peking University, where I am a researcher and deputy director of the Beijing Advanced Innovation Center for Genomics, a researcher at the Biomedical Pioneering Innovation Center, and a professor in the College of Chemistry. My research is in bioanalytical chemistry, especially analytical techniques for trace amounts of nucleic acids, with a focus on gene sequencing methods, single-cell analysis, and microfluidics. At the invitation of the Future Forum, I'd like to talk about testing for the new coronavirus.

This is a picture I took at Peking University two hours before the New Year's bell on December 31, 2019. The lake is frozen, and the reflection on Weiming Lake is the light from the Boya Pagoda. I was busy that day and didn't read the news; only later did I learn that there was pneumonia of unknown cause in Wuhan. By the time I heard the news, everyone else probably already knew. I don't use Weibo, so news always reaches me slowly. Over the next few days more and more news alerts came in, and I began to follow the story, although I didn't know whether this sudden situation would affect my life. About a week later, the cause of the pneumonia was determined, and at that point I began to pay closer attention. Why? It was a novel coronavirus. I didn't know much about coronaviruses, but the whole genome sequence of the virus had been obtained, and that interested me because I work on sequencing. I was amazed when I saw how the virus had been sequenced. Less than a week had passed between the first news and the final result. That is an almost unimaginable speed: in just a few days, a new pathogen had been identified.

Think about what pathogen identification involves. If a new infectious disease appears, what should we do? It has never been easy. At first, reports on the Internet said that atypical pneumonia, which we usually call SARS, had come back. That it could be shown so quickly not to be SARS, but a new virus closely related to SARS, is thanks to the rapid development of high-throughput DNA sequencing technology over the past 15 years or so. In general, identifying the pathogen of a disease requires satisfying Koch's postulates. That is, the pathogenic microorganism must be found in the diseased host and be absent from healthy organisms; it must be isolated and grown in pure culture; the culture must cause the disease when inoculated into a healthy host; and the microorganism must then be isolated again from that host. Koch's postulates were formally published 130 years ago, and they still dominate the criteria for identifying the pathogens of most infectious diseases. But with the progress of science and technology, and especially the rapid development of genomics, many of the rules used to identify new pathogens have changed.

Take pneumonia, for example. Its typical symptom is poor oxygenation because the patient's alveoli are filled with fluid. What pathogens cause pneumonia? There are many. Besides the most common one, Streptococcus pneumoniae, many other pathogens can cause pneumonia, such as those shown in the figure below, which I found on the Internet.

Generally speaking, they fall into three categories: viruses, bacteria, and a few fungi. Viruses are the most troublesome, because they are very small. Viruses have been with humans for a long time, without interruption throughout the history of human civilization, and they have always been hard to deal with. Viruses have been on Earth longer than we have, and as human civilization develops step by step they will always follow us. Polio, which is now becoming rare thanks to widespread vaccination, has taken more than half a century to come close to eradication. A few months ago I saw the World Health Organization's announcement that wild poliovirus type 3 had been eradicated worldwide; that is one subtype that has been completely eliminated. One reason viruses are hard to deal with is their size: they are many orders of magnitude too small for the naked eye to see. The optical microscope was a great help to biologists and pathologists looking for pathogens. It helped Koch identify pathogens such as the tuberculosis bacterium and Vibrio cholerae, but he could not use it to find viruses, which are beyond the resolution limit of the optical microscope. It was not until 1939 to 1940 that Ruska, a German scientist, saw virus particles for the first time under the electron microscope, including tobacco mosaic virus and cowpox virus. He could do so thanks to his older brother, who invented the electron microscope and later won the Nobel Prize. It is thanks to the development of new technologies like these that we were able to identify the new coronavirus.

Let's go back to January of this year. In late January, several scientific papers by Chinese scientists were published one after another, giving us a full picture of this new pathogen. News reports are just news; it is in the scientific papers that you appreciate the importance of the details. First, in the January 24 issue of the New England Journal of Medicine, we saw the whole genome sequence of the virus. From the sequence, biologists could clearly determine that this is a coronavirus, and sequence analysis showed that it is a new virus. This new virus, never before found in human history, belongs to the beta genus and is the seventh member of the coronavirus family known to infect humans.

Above is a transmission electron micrograph of the new coronavirus from the first scientific paper. You can see that it is very small: judging by the scale bar, it is only about 100 nm across, but its appearance is very similar to that of other coronaviruses.

On the same day, The Lancet published a study of the clinical characteristics of COVID-19 patients by the teams of Professor Wang Jianwei of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences, and Director Cao Bin of the China-Japan Friendship Hospital. Here, for the first time, we see a technique, RT-PCR, being used to detect nucleic acid when diagnosing patients. All the signs showed that this is a new disease, one that brings new challenges not only to science and medicine but also to patients.

The third important paper was published in the Chinese Medical Journal and describes the characteristics of the new coronavirus. From the report in China Science Daily, we can see that the process of identifying the pathogen was very rapid and efficient. Several teams carried out sequencing in parallel, which greatly increased the reliability of the results. The serological study also revealed an antigen-antibody reaction between the virus and serum from convalescent patients. It is a very complete scientific account.

The fourth paper comes from another of the three teams that ran the parallel tests, the team at the Wuhan Institute of Virology. Like the other teams before them, they obtained full genome sequences from several virus samples.

Another important result, shown in the colored light-microscope images on the right (above), is that the human ACE2 protein is an important route by which the virus enters human cells. This provides important clues for research on viral pathogenesis, treatment, and drug development.

The fifth paper was published at the same time as the previous one. The Shanghai team obtained samples from a single early patient in Wuhan, obtained the full sequence of the virus by high-throughput sequencing, and compared it with the sequence of the SARS virus. It was because of the rapid response and high-quality work of Chinese scientists in the first week of January that a reference genome for the virus was obtained so quickly. With a reference genome, we have a coordinate system.

First, it lets us understand what each segment of the virus genome is for. As the diagram above shows, each gene encodes a functional protein that performs a specific function. Some enable the virus to replicate, some help the virus invade the human body, and some build the structural framework of the virus. Second, with this reference coordinate system we can tell where the virus has mutated, which is very important for a deeper understanding of the basic biology of the virus, its natural history, and the onset and progression of the disease.

The main proteins encoded by the virus genome are shown in the figure above as predicted protein structures, a result from the team of Professor Zhang Yang at the University of Michigan. We can see one very important protein, the spike protein (S protein), along with others such as the M protein and the N protein. These proteins are specific to this virus. From these results we can build the model of the new coronavirus that often appears in the news; the proteins shown on the surface of the virus are the ones I just mentioned. They give the virus specific functions and structural support, and they are also important targets for detection. We can determine whether the virus is present by detecting components specific to it, such as these target proteins or the genes that encode them.

It is because we know the sequence of the virus that we could tell immediately that it is closely related to an old enemy, the SARS coronavirus of 17 years ago. Even without an explanation of how to read this figure, you can see that they are very similar. It is similar not only to SARS but also related to MERS, which appeared about 10 years ago. This tells us not to underestimate this virus; it may be highly destructive. Because of its similarity to SARS, the 2019 novel coronavirus was named SARS-CoV-2, for severe acute respiratory syndrome coronavirus 2. The name indicates its close relationship to the original SARS virus. A recent paper by German scientists also showed that the new coronavirus and the original SARS virus use a very similar strategy when attacking cells, invading them by targeting the same proteins. The paper also found that antibodies against the SARS virus have some neutralizing effect on the new virus, which again shows how closely related the two viruses are.

In the molecular era, our understanding of infectious diseases starts at the molecular level. Because we know it is a coronavirus, the situation is unlike 2003, when the pathogen could not be determined at first and the disease was called atypical pneumonia. The World Health Organization has named the new disease coronavirus disease 2019 (COVID-19), that is, after the coronavirus infection that causes it rather than after its symptoms. "Novel coronavirus pneumonia" is the name given by China's National Health Commission, so in China COVID-19 is mostly called novel coronavirus pneumonia.

Above is an image of the coronavirus captured by scientists at the University of Hong Kong in January. The virus has replicated and is being released from the cell.

The picture above is an enlarged image of the coronavirus taken by American scientists in February. You can clearly see the crown-like appearance of the surface. This is a false-color image, which highlights the crown very well.

Molecular diagnosis is an important part of disease diagnosis. Diagnosing a disease is a complex problem, especially when the disease is new. For an infectious disease whose pathogen is known, molecular diagnosis becomes a very important indicator. It serves several purposes. First, it determines whether clinical symptoms such as fever are caused by the new coronavirus, which avoids confusion with other diseases. Second, it describes the course of the disease either qualitatively (positive or negative) or quantitatively (for example, the viral load). Third, at a finer level, some molecular diagnostics can help determine the treatment plan and provide supporting evidence for it. Fourth, for viral diseases it is very important to track variation, and to study and predict how the infectivity and pathogenicity of the virus evolve during transmission. For example, my colleague Lu Jian at Peking University has pointed out that some mutation sites are particularly closely linked and are likely to be strongly indicative of how patients recover. This may therefore be of great significance for treating the disease in the future and for understanding how the virus and the host coexist.

What is molecular diagnosis? In the narrow sense, molecular diagnosis is nucleic acid detection, including qualitative and quantitative analysis of specific nucleic acid segments, sequence analysis using nucleic acid hybridization chips, sequencing, and so on. In the broad sense, it may also cover other molecular pathology techniques that correspond to traditional pathological diagnosis, including antibody detection. As a rule, though, molecular diagnosis means nucleic acid detection. Nucleic acid detection means testing whether a sample contains DNA or RNA fragments with a specific sequence, as evidence that a pathogenic microorganism is present. Beyond pathogenic microorganisms, this approach was already widely used before the epidemic, for example in diagnosing many other infectious diseases, diagnosing genetic diseases, classifying tumors, and guiding chemotherapy, although it was not talked about nearly as much as it is today.

Besides nucleic acid detection, there is antibody detection. Antibody detection looks not for the virus itself but for antibody molecules in blood or other body fluids.

Another kind of test is antigen detection, which targets substances on the virus itself. For example, the two items marked in blue above are proteins unique to the virus, the spike (S) protein and the nucleocapsid (N) protein. Antigen detection checks whether they are present in the sample. All of these are the material basis of what we test for.

Detection, in the final analysis, means measuring numbers of molecules, and its material basis is the structure and composition of the virus itself. Scientists from Israel and the United States have compiled some key numbers for the coronavirus. For detection purposes, each virus particle contains one nucleic acid molecule (marked by the green arrow in the figure above) and many protein molecules (marked by the red arrows), but the copy numbers of these proteins differ. There are on the order of hundreds of S proteins per virus particle, and thousands of N and M proteins. Understanding these numbers is very helpful for developing specific detection techniques to determine whether the virus is present.

The seventh edition of the novel coronavirus pneumonia diagnosis and treatment plan states the diagnostic criteria very clearly. Two kinds of etiological evidence are accepted: fluorescent RT-PCR and gene sequencing. The nucleic acid sequence of the virus is quite different from that of other species, such as animals and humans, so identification by sequence with these two methods is unlikely to go wrong. As the epidemic developed and our understanding of the natural history of the virus deepened, later revisions of the plan also added serological testing, which we will discuss later.

As for testing, the notices from the National Health Commission and the approvals from the drug administration have centered on nucleic acids and antibodies. The two kinds of test are completely different: they differ in purpose, sample source, detection method, interpretation of results, and scope of application. Let's go through them.

The first step in nucleic acid detection is sampling.

More and more people have now been tested for nucleic acid, and there is more and more news about it, so the picture above is nothing new. Until recently, though, I had never experienced how a throat swab is taken. Most people have only seen this kind of laboratory setting in pictures and have not experienced it in person. The real process can be somewhat more complicated than the picture suggests. Sampling is generally done in a relatively open place outside the laboratory. The people who take the samples must be professionally trained and must pay attention to protection, because this is a risky operation: the person being sampled may be infectious, and the sampling itself increases the risk of exposure, for example if the person sneezes or coughs while being sampled.

Different sampling sites give different test results. Chinese scientists worked out some of the patterns early on.

For example, panel a of the figure above shows that the signal gradually declines as the illness progresses: the longer after symptom onset, the less nucleic acid the sample contains. Another finding is that nasopharyngeal swabs yield more nucleic acid than throat swabs, and somewhat more consistently. This early experience was very valuable for guiding later sampling, that is, how and where to collect samples.

The characteristics of the disease determine which sites the virus infects; the virus has its preferences, and they also change with time since onset. In fact, many kinds of sample can be tested, not just the two kinds of swab. The seventh edition of the diagnosis and treatment plan clearly states that viral nucleic acid can be detected in nasopharyngeal swabs, sputum, other lower respiratory tract secretions, blood, and feces. Studies have also concluded that viral nucleic acid may be detected in many kinds of sample, such as tears and feces, but that sputum and lower respiratory tract secretions are the least likely to miss an infection, that is, they have the highest detection rate, followed by nasopharyngeal swabs. In truly high-quality studies, therefore, the sample source is handled very carefully and sampling quality is strictly controlled, because only good samples give good results. The World Health Organization issued guidance documents in January that clearly set out how to collect, store, and use different samples. With this guidance, researchers and medical personnel around the world can follow a standardized and dependable process to ensure reliable results.

After sample collection, the first step is usually nucleic acid extraction, which keeps the nucleic acid in the sample and removes everything else: impurities, shed human cells, and other molecules such as proteins. Extraction generally uses a material with a strong affinity for nucleic acids, such as silica. The nucleic acid is adsorbed first; after the other substances have been separated away, the adsorbed nucleic acid is eluted, achieving extraction, purification, and enrichment at the same time. Two methods are available in the laboratory, and both are widely used in nucleic acid testing for the new coronavirus: adsorption on a silica membrane, or adsorption on surface-modified magnetic particles. This step looks simple, but the key consideration is biosafety. Interested viewers can watch the training video from Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, which demonstrates the complete extraction procedure and emphasizes the biosafety of the operators. To that end, it specifically points out that all samples should be placed in a 56 °C water bath for 30 minutes before handling, to inactivate the virus so that it is no longer infectious. The laboratory biosafety guidelines for the novel coronavirus issued by the National Health Commission also state that work with uncultured infectious material should be carried out in a biosafety level 2 laboratory, with the personal protection used in a biosafety level 3 laboratory.

Above is a cartoon from the CDC in the United States. Put simply, a biosafety level 3 laboratory is a closed space kept at negative pressure, so that no air or other material is discharged to the outside without treatment. The people inside look much like the workers in full white protective suits we often see on TV. They wear the same kind of clothing, only the requirements are stricter, with no part of the body exposed. Biosafety level 2 requirements are less strict, and air is exchanged with the outside. Sometimes the biosafety level can be raised in certain areas so that more demanding experiments can be done there. This time, Wu Chen led the national testing team for the Wuhan shelter hospitals, working in a mobile biosafety level 3 laboratory.

The picture above shows Professor Wu Chen working in Wuhan. The working environment on the left is biosafety level 2, and the vehicle on the right is the mobile biosafety level 3 laboratory. Wu Chen is wearing standard biosafety level 3 protective clothing; if her name were not written on it, you would not know who she is. Doing experiments in full protective gear is for the sake of biosafety.

Nucleic acid extraction is followed by amplification, which uses the several tubes of reagents provided by the manufacturer. In the picture above, the kit on the left is from BGI and the kit on the right is the one distributed by the CDC; each contains several small tubes. The detection process is called RT-PCR, reverse transcription polymerase chain reaction. It replicates DNA in vitro using the nucleic acid replication machinery found in nature. PCR was invented nearly 40 years ago; it is a basic technology of molecular diagnosis and an important foundation for progress in modern biology and medicine. Each tube contains, besides the sample to be tested (the nucleic acid extract), DNA polymerase, reverse transcriptase, primers, and probes. The new coronavirus is an RNA virus, with its genomic RNA packaged inside the virus particle. In the reaction, reverse transcriptase converts this RNA into complementary DNA, which is then copied to form double-stranded DNA. Thermal cycling then repeatedly separates the paired strands so that each can be copied. After N cycles, the amount of the specific DNA fragment has in theory grown geometrically. A clever design makes each replication release a fluorescent molecule, so the more molecules are copied, the more fluorescence is produced, and we can detect this signal. Over the rounds of replication we watch whether the fluorescence grows. If it grows, the result is called positive; if it does not, the result is negative. The faster the fluorescence rises, the more molecules there were to begin with; the slower it rises, the fewer. That is roughly the process.
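To make the geometric growth concrete, here is a minimal sketch of idealized PCR amplification. The function name `copies_after_cycles` and the `efficiency` parameter are illustrative assumptions, not something from the talk; real reactions fall short of perfect doubling and eventually plateau, which this sketch ignores.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR: every cycle multiplies the template by (1 + efficiency).

    efficiency = 1.0 means perfect doubling; lower values model an imperfect
    reaction. This is a simplification and ignores the plateau phase.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Starting from a single template molecule, print the theoretical copy number.
    for n in (10, 20, 30, 40):
        print(f"{n} cycles: {copies_after_cycles(1, n):.3g} copies")
```

Under perfect doubling, a single molecule becomes about a thousand copies after 10 cycles and about a billion after 30, which is why a sample that starts with more template shows fluorescence earlier.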

This process doesn't take long. It produces a graph like the one shown above. Draw a threshold line and see where it crosses the fluorescence intensity curve: if the cycle number at the crossing is low, the sample is probably positive; if it is very high, or the curve shows no rise at all, the sample is probably negative.
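The threshold-crossing idea can be sketched in a few lines. This is an illustration under stated assumptions, not the rule of any particular kit: the names `cycle_threshold` and `call_sample` are hypothetical, the crossing is located by simple linear interpolation, and the default cutoff of 37 cycles is only a placeholder, since each approved kit defines its own cutoff.

```python
from typing import Optional, Sequence


def cycle_threshold(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Return the (fractional) cycle at which the curve first reaches the threshold.

    fluorescence[0] is the reading after cycle 1. Returns None if the curve
    never reaches the threshold.
    """
    if not fluorescence:
        return None
    if fluorescence[0] >= threshold:
        return 1.0
    for i in range(1, len(fluorescence)):
        prev, cur = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= cur:
            # prev is the reading at cycle i, cur at cycle i + 1; interpolate between them.
            return i + (threshold - prev) / (cur - prev)
    return None


def call_sample(ct: Optional[float], cutoff: float = 37.0) -> str:
    """Call a sample from its crossing cycle. The cutoff value is an assumed placeholder."""
    if ct is not None and ct <= cutoff:
        return "positive"
    return "negative"


if __name__ == "__main__":
    rising = [0.0, 0.0, 1.0, 3.0]   # toy curve that crosses the threshold
    flat = [0.1] * 40               # toy curve that never rises
    print(call_sample(cycle_threshold(rising, 2.0)))
    print(call_sample(cycle_threshold(flat, 1.0)))
```

A low crossing cycle means the signal appeared early, which corresponds to more starting template; a curve that never crosses is reported as negative.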

In practice, the call has to be made carefully. Quality controls must be included in every run, both a negative control and a positive control. The negative control is usually pure water, which contains no target nucleic acid and should give no signal; if it does give a signal, the reliability of the experiment is in question. The positive control is usually a low-concentration sample; if it gives no signal, the experiment is not sensitive enough. Something may have gone wrong, and the run needs to be checked and repeated.
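The control logic described here can be written as a small check that runs before any patient result is interpreted. This is a sketch only: `check_controls` is a hypothetical helper, and it represents each control by its threshold-crossing cycle, with `None` meaning that no signal was detected.

```python
from typing import List, Optional


def check_controls(negative_ct: Optional[float], positive_ct: Optional[float]) -> List[str]:
    """Return a list of problems with the run; an empty list means the run is usable.

    negative_ct / positive_ct: crossing cycle of each control, or None if no signal.
    """
    problems = []
    if negative_ct is not None:
        # Pure water should never amplify; a signal casts doubt on the whole run.
        problems.append("negative control gave a signal: reliability of the run is questionable")
    if positive_ct is None:
        # The low-concentration positive control must be detected.
        problems.append("positive control gave no signal: the run is not sensitive enough")
    return problems


if __name__ == "__main__":
    print(check_controls(None, 30.0))   # both controls behave as expected
    print(check_controls(35.0, None))   # both controls failed; check and repeat the run
```

Only when the list of problems is empty would the sample results from that run be reported.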

In addition, the choice of probe and primer sequences takes great care. In a genome of about 30,000 bases, deciding which region to detect takes some effort. A group of South Korean scientists recently published a comparison of the effectiveness of the nucleic acid detection sites in use in January. The sites chosen by scientists in different countries differ. Here I have marked where the detected fragments lie in the genome: China uses two sites, one in ORF1ab and one in N; the United States uses three, all in the N protein gene; Germany uses one in the RdRp gene and one in the E gene.

The image above shows the sequence of the N gene, with colors indicating the locations of the primers used in each country's assays. You can see that the positions chosen by scientists in different countries overlap slightly in a few places, but in general they do not coincide. The results are interesting: according to the South Korean comparison, the sites chosen by China and the United States were the most sensitive of those tested. This shows that our scientists chose their sites accurately at an early stage, which takes experience, ability, and scientific judgment.

Can nucleic acid detection be improved? First of all, speed. In a standard nucleic acid detection workflow, sampling is very fast, and the sample is then stored in transport medium. Once a certain number of samples has accumulated, they are inactivated together at 56 °C for 30 minutes, and then nucleic acid is extracted, which takes about 30 to 90 minutes; the PCR reaction takes 40 to 60 minutes. This is done on a batch of samples, often 96 at a time, and the results can be read quickly once the run finishes. The whole experiment takes about half a day. Can the process be accelerated? For a single sample, testing can start right after collection, and many steps can be skipped: the swab can go directly into lysis buffer, followed by rapid extraction, or even rapid PCR without extraction, so that a result may be available in about 30 minutes. PCR detection itself can be fast, especially when there are few samples, but when there are many samples it is more efficient to process them batch by batch.
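The arithmetic behind batch processing can be sketched as follows, using only the step durations quoted above. The names `STEP_MINUTES`, `batch_minutes`, and `minutes_per_sample` are illustrative, and the sketch counts only the three timed steps, not sample accumulation, transport, or manual handling.

```python
# (fastest, slowest) duration of each timed step, in minutes, as quoted in the talk.
STEP_MINUTES = {
    "heat inactivation at 56 C": (30, 30),
    "nucleic acid extraction": (30, 90),
    "PCR reaction": (40, 60),
}


def batch_minutes(steps):
    """Sum the fastest and slowest durations over all steps."""
    fastest = sum(lo for lo, _ in steps.values())
    slowest = sum(hi for _, hi in steps.values())
    return fastest, slowest


def minutes_per_sample(total_minutes, batch_size=96):
    """Processing time attributed to each sample when a whole batch runs together."""
    return total_minutes / batch_size


if __name__ == "__main__":
    fastest, slowest = batch_minutes(STEP_MINUTES)
    print(f"one batch: {fastest} to {slowest} minutes")
    print(f"per sample in a batch of 96: {minutes_per_sample(slowest):.2f} minutes or less")
```

The three steps add up to 100 to 180 minutes per batch, so in a batch of 96 each sample accounts for under two minutes of processing time. That is well below the 30 minutes of a rapid single-sample test, which is why batching is more efficient for large numbers of samples even though each individual result takes longer to arrive. The gap between this sum and the half day quoted for the whole experiment is presumably waiting and handling time, which the sketch does not model.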

I will talk about some other detection methods later. First, let's look at a review published by American scientists a few days ago, which summarizes the nucleic acid detection methods reported recently.

As the figure above shows, most of these methods can be completed in an hour or two, at a cost ranging from a few dollars to a dozen or so dollars. Some of them are no longer RT-PCR, but they still harness the power of nature: they use the replication machinery of natural genetic material to amplify specific DNA fragments until the signal is strong enough to detect. Amplification itself can be achieved in various ingenious ways. RT-PCR has been repeatedly verified and optimized for more than 20 years and has proved to be a very reliable clinical test, but many other amplification methods have also been invented, and testing for the new coronavirus is putting them through a rigorous trial. Where speed is the priority, and to lower the technical barriers and operator errors even for RT-PCR, one can use so-called POCT (point-of-care testing) equipment, that is, small instruments that can be used on site.

The two instruments in the figure above were the first of their kind to receive emergency use authorization from the FDA, in late March. They complete the whole process from sample to result in 30 minutes and 45 minutes respectively. Using such equipment during an epidemic has at least two important benefits. One is to decentralize testing, which avoids sample backlogs and delivery difficulties. The other is that their almost fully enclosed design reduces the biosafety risk to operators.

For example, the paper above, co-authored in 2014 by researchers at Shenzhen Futian Hospital, evaluated the performance of this instrument in detecting influenza. The instrument is based on a relatively obscure amplification method called NEAR. Nature has many unusual enzymes that can do many things. A nicking enzyme makes a small nick in one strand of the DNA, which gives the polymerase a place to start replicating; nicks keep forming and replication keeps going. It is a very ingenious method that needs neither PCR nor thermal cycling, and it belongs to the family of isothermal amplification methods. Isothermal amplification means that no thermal cycling is needed and amplification proceeds at a constant temperature. One advantage is that the process is very simple; another is speed, because isothermal amplification tests are often completed in several minutes to a little over ten minutes. Besides NEAR, many other isothermal amplification reactions can be used for nucleic acid detection. A relatively well-known one is LAMP, loop-mediated isothermal amplification, invented by the Japanese scientist Notomi and colleagues in 2000. In this method, four or six primers are added to the sample, the temperature is raised slightly, and the isothermal reaction is usually run at around 60 °C. It does not need a complex instrument and finishes quickly. Its advantages are that the equipment can be simple and easy to operate, and the result can be read by the naked eye. China's drug administration has also approved in vitro diagnostic reagents based on such reactions; for example, this is a new coronavirus diagnostic device based on LAMP amplification made by Chengdu Boao Jingxin.

There are other methods as well. One is RPA, which is relatively easy to run. It combines two enzymes, a recombinase and a polymerase, and amplifies nucleic acid through a rather complex cycle; complex as it is, it is very efficient and very fast. Building on RPA, Zhang Feng, a famous young scientist of Chinese descent in Boston, combined it with a gene-editing tool to achieve RNA detection. His team registered the name Sherlock and set up a company under it to develop the reagents. Coincidentally, in San Francisco, California, Jennifer Doudna and her collaborators combined the LAMP method with a gene-editing tool to create another new method, called DETECTR, and founded a company called Mammoth Biosciences. Both methods achieve detection in 30 to 50 minutes, which is relatively fast, and both can be read out on test strips, which are easy to interpret and need no complex instrument.

Besides shortening the time, raising the detection throughput is very important for large-scale testing. One kit can be used for many tests, but the samples still have to be processed one by one. In addition, nucleic acid extraction, which used to be physically demanding and to require experience, skill and training, is better left to a tireless machine that does not make mistakes. That is why large numbers of automated nucleic acid extraction instruments have appeared at testing sites; such instruments can now be found in designated hospitals and disease control centers at all levels.

The two instruments on the right of the figure above are in university laboratories in the United States. University laboratories have now joined the large-scale testing effort, which can relieve some of the anxiety about testing delays. A large testing facility with centralized processing needs instruments that are more complete and more automated.

The picture above shows a giant instrument made by Roche, the cobas 8800. One machine like this can complete about 3,000 nucleic acid tests a day running non-stop, and it is almost fully automatic: once the samples are loaded, it starts working. What are the benefits? Besides speed, an important advantage is that results are essentially reliable and consistent wherever in the world the machine is used. In this way, high-quality large-scale testing is also possible in places where the foundation is relatively weak and experience is limited. Whether in Kenya, Canada, the United States or China, there will be no large deviations caused by operator error.

As the epidemic has progressed, more and more testing needs have emerged. With home isolation becoming the norm, are self-testing and self-sampling feasible? The first home self-sampling kit approved by the FDA in the United States is Pixel, a LabCorp product: the user wipes a cotton swab inside the nasal cavity and mails it back to the testing center for large-scale testing, so no complicated sampling is required. At present there are no large-scale evaluation data. Still, poking a swab into your own nose is not easy, so more than a month ago researchers at West China Hospital proposed using saliva for COVID-19 diagnosis, which makes self-sampling relatively easy. Less than a month ago, scientists from Yale University published a paper comparing PCR nucleic acid detection in saliva and in nasopharyngeal swabs. The results showed that saliva sampling was more consistent than nasopharyngeal swabbing. More interesting still, in the vast majority of patients more virus was detected in saliva, so the results may be more accurate and false negatives less frequent. This is a good starting point, and a few days ago Rutgers University in New Jersey and its affiliated testing laboratory received approval from the FDA to start using saliva samples for novel coronavirus testing. They use a large automated pipeline that can test 10,000 samples a day. Saliva sampling is a very encouraging new method. If it becomes widespread, it is expected to solve the problems we faced in the past.

Now that the first peak of the epidemic has passed and the stage of returning to work and school has arrived, antibody testing has become a new buzzword. How is antibody testing carried out? I noticed that the previous nine sessions all touched on these questions, and you surely know more about antibodies than I do. So I am going to talk about how antibody detection is done technically.

First of all, the sample is very different: antibody tests are all run on blood or serum. There are many methods of antibody detection; the most common are the enzyme-linked immunosorbent assay (ELISA) and the chemiluminescence immunoassay. The two work on very similar principles, and they are also among the most common detection methods in laboratory research. Take ELISA as an example. We first coat the detection substrate with a layer of artificially prepared antigen and then add the serum sample. After incubation for a period of time, the antibody binds specifically to the antigen; we then wash the plate, leaving only the specifically bound antibody. Next we add an enzyme-labeled secondary antibody, which recognizes the bound antibody and carries an enzyme with catalytic activity. After another incubation and wash, only the specifically bound secondary antibody remains. Adding a substrate then triggers the enzyme-catalyzed color reaction, which changes the color of the solution. Finally, we use the concentration of the colored product to deduce the concentration of the antibody being tested. The principle is simple, and the specificity is guaranteed by antibody recognition. No especially complex operations are needed, the method can be fully or partly automated, and the experimental throughput is not low. From the depth of color in these wells you can calculate the amount of antibody. It is therefore a quantitative method that is also capable of large-scale screening. The chemiluminescence method used in hospitals works on a similar principle, but the detection signal is different.
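The last step of the ELISA, going from color to antibody concentration, can be sketched as reading an unknown sample off a standard curve. This is only an illustration: the calibrator values in `STANDARDS` are hypothetical, and straight-line interpolation between neighboring calibrators is an assumed simplification, not the curve fit any particular kit uses.

```python
from bisect import bisect_left

# Hypothetical calibrators: (antibody concentration in arbitrary units, absorbance).
# Absorbance must increase strictly with concentration for the lookup below to work.
STANDARDS = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.80), (8.0, 1.40)]


def concentration_from_absorbance(absorbance, standards=STANDARDS):
    """Estimate concentration by linear interpolation between two calibrators."""
    concs = [conc for conc, _ in standards]
    signals = [signal for _, signal in standards]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance outside the calibrated range")
    i = bisect_left(signals, absorbance)
    if signals[i] == absorbance:
        return concs[i]  # exactly on a calibrator
    x0, x1 = signals[i - 1], signals[i]
    y0, y1 = concs[i - 1], concs[i]
    return y0 + (y1 - y0) * (absorbance - x0) / (x1 - x0)


if __name__ == "__main__":
    # A well reading halfway between the 2-unit and 4-unit calibrators.
    print(round(concentration_from_absorbance(0.625), 2))
```

Readings outside the calibrated range raise an error rather than being extrapolated, since a color that is too faint or saturated cannot be quantified reliably.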

Another method is the test strip that develops color with colloidal gold, the same approach used in pregnancy test strips. Its advantages are that it is very simple, easy to use, cheap and very fast, and it needs no complex equipment.

The figure above shows the structure of the test strip: sample pad, colloidal gold conjugate pad, cellulose acetate film and absorbent pad. Suppose the sample contains the antibodies you want to detect, such as IgM or IgG against the novel coronavirus, and you apply it to the strip. The conjugate pad holds gold-labeled antigen and a gold-labeled monoclonal antibody, and both flow with the sample toward the quality control line. At the control line, the gold-labeled monoclonal antibody is captured by an antibody that recognizes it, and this serves as the quality control. If the sample contains the two antibodies being tested, they are captured on the two detection lines, and because they also bind the gold-labeled antigen, those lines develop color. That tells us whether the antibodies are present in the sample. If the IgG line is very dark, we know IgG is present; if the IgM line is very faint, we can conclude that this antibody is probably absent. The result is easy to read.

The reason colloidal gold tests are worth doing is that they can be applied to large-scale screening. Many places need to survey the infection rate. If the infection rate is high and most people have antibodies, the virus does not spread easily. If the infection rate is very low and most people have no antibodies, the virus can spread easily. Knowing this helps us better prevent the next wave.

Can a test tell you what you want to know? That takes serious thought, because no method is perfect. Molecular detection is never perfect, and there is no perfect detection method in the world. We need to understand the strengths and weaknesses of each method in order to choose and interpret tests well.

Above is a schematic diagram published in the Journal of the American Medical Association a few days ago. How detectable each target is depends strongly on the stage of the disease. A few days before the onset of symptoms the virus is easy to isolate, but the nucleic acid load is still low; can a nucleic acid test detect it then? Throat swabs or bronchoalveolar lavage fluid give a positive result more readily than feces, but the virus may persist longer in feces and so be easier to detect there later on. Antibodies appear only in the later stage of the disease. IgM decays faster; IgG decays more slowly and persists longer. So the choice of detection method depends heavily on timing, and every method has its limit. In analytical work, we often use the detection limit to express the sensitivity of a method. The detection limit is the minimum concentration or quantity of the analyte that can be detected in a sample at a given confidence level. For example, the detection limit of the gene-editing methods is one order of magnitude higher than that of traditional RT-PCR. In fact, the detection limit of the widely used RT-PCR is very low, which is to say its sensitivity is very high: it can detect the presence of just a few nucleic acid fragments in one reaction. Most other methods, fast or slow, are not necessarily more sensitive.

In addition, we must ask whether the specificity of a method is really good. The specificity of nucleic acid detection depends on where the primers are placed and how well the primer sequences match the target nucleic acid, and so does whether the target can be detected at all. According to a paper published in Clinical Chemistry by scientists from Hong Kong, one of their two nucleic acid target sites is significantly easier to detect than the other: the N protein gene is easier to detect than ORF1b. Although the two sites appeared to perform similarly at the design stage, differences emerged in actual use.

No method is perfect, but each has its strengths; the point is to use it properly and find the setting where it applies. For example, suppose I test a group of 35 healthy people and 35 infected people, and 35 results come back positive. You might think this is a very good result, right? But it is not necessarily so: the 35 people who tested positive are not all truly infected. The results fall into four groups: an infected person who tests positive is a true positive; a healthy person who tests positive is a false positive; a healthy person who tests negative is a true negative; an infected person who tests negative is a false negative. Together, the true positives, false positives, true negatives and false negatives make up the total number of samples tested.

How do we evaluate a test? The first index is sensitivity: what fraction of the truly positive samples is detected? The numerator is the true positives; the denominator is the sum of true positives and false negatives. In this example there were 25 true positives and 10 false negatives, so the sensitivity is 71.4%.

Another index is specificity, that is, what fraction of the truly negative samples test negative. The denominator is the sum of true negatives and false positives, and the numerator is the true negatives. With 25 true negatives and 10 false positives, the specificity is 71.4%. Accuracy is the number of true positives plus true negatives divided by the total, which is again 71.4%. On many occasions, what we care about is whether a positive result is really positive. This is the positive predictive value (PPV): among all positive test results, true and false alike, what fraction are true positives? Here, 25 are true positives and 10 are false positives, so the PPV is also 71.4%. In other words, out of every 10 people who test positive, about 7 are true positives and the other 3 are false positives.
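The four indices just defined can be written down directly from the four counts. The following is a minimal sketch using the numbers from the example (25 true positives, 10 false positives, 25 true negatives, 10 false negatives); the class and function names are illustrative choices, not taken from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # infected, tested positive
    fp: int  # healthy, tested positive
    tn: int  # healthy, tested negative
    fn: int  # infected, tested negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def sensitivity(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    return (c.tp + c.tn) / c.total


def ppv(c: ConfusionCounts) -> float:
    return c.tp / (c.tp + c.fp)


# The 35 healthy + 35 infected example from the talk.
example = ConfusionCounts(tp=25, fp=10, tn=25, fn=10)

if __name__ == "__main__":
    for name, metric in [("sensitivity", sensitivity), ("specificity", specificity),
                         ("accuracy", accuracy), ("PPV", ppv)]:
        print(f"{name}: {metric(example):.1%}")
```

All four values come out at 71.4% here only because the false positives and false negatives happen to be equal and the group is split evenly.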

The example I just gave is rather peculiar, in that many of the values come out the same. One reason is that the numbers of false positives and false negatives in the example are equal. In real detection methods, specificity and sensitivity are not necessarily equal. Suppose another method is applied to the same group, and we find that all the negative samples test negative but five of the positive samples also test negative. Its specificity is then 100%, meaning there are no false positives. Its sensitivity is 86%, its accuracy is 93%, and its positive predictive value is 100%. Although this test produces false negatives, it produces no false positives, so every detected positive is a real positive and the PPV is 100%. Can any real method come close to this? Yes. The PCR method we use, if done well, can achieve a very high positive predictive value. This is FIND, a non-profit organization in Geneva, which has made an objective evaluation of some of the kits on the market. We can see kits produced in different places and targeting different gene loci. Their detection limits vary, but overall they perform very well. The specificity is very high, in many cases 100%. With such a method, you know that when you get the test result, at least the positive predictive value is close to 100%.

The examples so far assumed half positive and half negative. That is rather extreme for COVID-19 at this stage and does not apply. Why? Because the positive predictive value depends on the infection rate. Suppose a detection method with 95% sensitivity and 95% specificity is used in a population of 500 people whose infection rate is 5%. What is the result? There should be 25 infected people and 475 uninfected people. The test returns 452 negative results, including 1 false negative. This is good: if you test negative, the probability that you are truly negative is 99.8%. The positive results, however, are less encouraging. There were only 25 truly positive samples, but 48 positive results were returned, 24 of them true and 24 false. So the probability that a positive is true is 50%. That is to say, if your antibody result is positive, there is an even chance that you do not have antibodies in your body; you cannot go out and wander around, and it is better to stay at home.

That was with an infection rate of 5%. What if the infection rate reaches 25%? Of the 500 people, 125 should be infected, and the test yields about 6 false negatives and 19 false positives. If your result is negative, the probability that you are truly negative is about 98.3%, which is still not bad. If it is positive, the probability of a true positive is 86%, which means there is still a sizable chance, about 14%, that you have no antibodies in your body. Do you want to go out and wander?
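The two scenarios above, 500 people tested with a 95%/95% reagent at infection rates of 5% and 25%, can be reproduced with a few lines of arithmetic. The sketch below rounds every count to whole people, which is an assumption made to match the figures quoted in the talk; the function names are illustrative.

```python
def expected_counts(population, prevalence, sensitivity, specificity):
    """Expected test outcomes, rounded to whole people."""
    infected = round(population * prevalence)
    healthy = population - infected
    tp = round(infected * sensitivity)  # infected and detected
    fn = infected - tp                  # infected but missed
    tn = round(healthy * specificity)   # healthy and cleared
    fp = healthy - tn                   # healthy but flagged
    return {"tp": tp, "fn": fn, "tn": tn, "fp": fp}


def predictive_values(counts):
    """Return (PPV, NPV) for a dictionary of counts."""
    ppv = counts["tp"] / (counts["tp"] + counts["fp"])
    npv = counts["tn"] / (counts["tn"] + counts["fn"])
    return ppv, npv


if __name__ == "__main__":
    for prevalence in (0.05, 0.25):
        counts = expected_counts(500, prevalence, 0.95, 0.95)
        ppv, npv = predictive_values(counts)
        print(f"infection rate {prevalence:.0%}: {counts}, "
              f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

With the same reagent, only the infection rate changes between the two runs, yet the positive predictive value moves from 50% to roughly 86%.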

The examples just mentioned assume that the antibody test reagent performs well, with sensitivity and specificity both at 95%, which is already quite good.

Let us imagine that these two indicators are not so high. Suppose there is a big city with an infection rate of 1%, and we give an antibody test to 1,000 people. The sensitivity and specificity of the reagent are 90% and 80%, which is in line with the real performance of many reagents. What will the result be? With an infection rate of 1%, 10 of the 1,000 people should have antibodies and 990 should not. Of the 10, 9 are true positives and 1 is a false negative, because the sensitivity of the reagent is 90%. Among those without antibodies, because the specificity is 80%, 792 of the 990 test negative but 198 show a false positive. So when the results come out, of the 1,000 people tested, 207 are positive and 793 are negative. This means that if a person gets a positive result, the probability that it is a true positive is only about 4%, and the probability that it is not is 96%. How can this be changed? The fundamental solution is to keep improving the reagent, for example by raising its specificity. If the specificity rises to 90%, the result changes, with the false positives reduced to 99. But even then the positive predictive value only rises from 4% to 8%, and more than 90% of the positive results are still false. If the specificity rises to 99%, the positive predictive value can reach about 47%. A specificity of 99% is a great challenge for antibody detection, because it is already a very high bar. This is also why large-scale screening carries many problems and risks when the infection rate is low. We must keep the basic concepts of statistics firmly in mind.
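The dependence of the positive predictive value on specificity in this low-prevalence scenario follows from Bayes' rule. The sketch below works with rates rather than head counts, so its results are the unrounded versions of the percentages quoted above; the function name is an illustrative choice.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(truly positive | test positive), from Bayes' rule."""
    true_positive_rate = prevalence * sensitivity
    false_positive_rate = (1 - prevalence) * (1 - specificity)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


if __name__ == "__main__":
    # 1% infection rate and 90% sensitivity, as in the example above.
    for spec in (0.80, 0.90, 0.99):
        ppv = positive_predictive_value(0.01, 0.90, spec)
        print(f"specificity {spec:.0%}: PPV {ppv:.1%}")
```

Trying other prevalence values in the same function shows how quickly the picture improves once the infection rate is no longer small compared with the false positive rate.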

A few days ago, a group of scientists in California joined forces in a collaborative project. They ran a head-to-head comparison of 12 antibody detection methods, 10 of them test strips and 2 of them ELISAs. Many of the test strips came from Chinese and American manufacturers. The results look much the same across products, and none stands out. So the issue is not that manufacturers differ greatly in capability; the challenge is inherent in the method itself.

As the figure above shows, sensitivity is not particularly good in the early stage, most likely because the antibody concentration itself is still low at the time of testing. Sensitivity rises gradually, to about 80% or 90% in the later stage. Specificity varies widely: some products are very good, some fairly good, and some only acceptable on balance. Most manufacturers are at about 80% to 90%, and a few methods reached 100% in small-sample comparisons. So in antibody tests, especially colloidal gold tests, the false positive rate is a big problem.

How is specificity tested? It requires samples that are known to be negative for the novel coronavirus. Samples collected during the current large-scale epidemic cannot serve as such controls, so the only option is to cross-test samples collected before July 2018 and see whether any come out positive. They made a very strict comparison, and the result was very interesting.

These methods turn out to be quite good, and their stability is relatively good. But so far no antibody method really matches nucleic acid detection in stability and specificity, and we cannot fully trust the results, so the false positive problem will keep haunting us.

Finally, let me take a moment to talk about detection methods beyond nucleic acid (PCR) detection, such as sequencing. Sequencing as a testing method has been on the agenda for the past few years, and in the past few months we have arrived at a method that is essentially complete. In January, Wang Jianbin of Tsinghua University, Professor Xie Xiaoliang of our center and I published a paper, with the work done mainly by Di Lin and Fu Yusi. In it we report a new and simple RNA sequencing method named SHERRY. Once the sample is in, three steps yield a sequencing library, which greatly lowers the difficulty of experiments that used to demand very high skill and makes the results more stable, reliable and fast. With the novel coronavirus rampant and hospitals overloaded, hospital researchers do not have much time to think through complex procedures, so the simpler the workflow, the better it serves clinical sequencing of the virus. In the past few months, although we could not fully return to work, we had the honor of working with partners at Beijing Ditan Hospital to develop a new method called MINERVA on the basis of SHERRY. Its starting point is to think from practical reality about how to serve the clinic. The method is not picky about sample type: throat swabs, sputum and feces can all be used. After extraction, the first step is to build a SHERRY library. After enrichment and other steps, the resulting library is used to sequence the whole genome of the virus.

Through this sequencing, we have studied the nucleic acid sequences of more than 80 samples, and they reveal a lot of information that could not be obtained by simple PCR nucleic acid detection. We have seen interesting phenomena; for example, many samples carry different mutations at specific sites. This is a brand-new virus. The more deeply we understand it, the better we can prevent and control the epidemic and prepare for possible threats in the future. Beyond that, we do not know how the virus's mutations arise. What is the relationship between mutation and disease progression? Will mutation affect the development of molecular detection reagents, and can their accuracy still be guaranteed? Does mutation affect vaccine development? Does it affect drug trials? We do not know, so we need more tools to study these questions.

Finally, I would like to thank Professor Xie Xiaoliang of Peking University for his guidance. For the details of many testing experiments, I consulted researcher Ren Lili of the Chinese Academy of Medical Sciences, Professor Wu Chen of the Chinese Academy of Medical Sciences, who works on the front line, and Professor Wang Jianbin of Tsinghua University, a long-term collaborator. During this outbreak, we worked with Chen Chen's research team at Ditan Hospital to obtain a large amount of genomic data, which lays the foundation for future research. My thanks also go to the students and staff who have worked hard throughout the outbreak; their great efforts made these results possible. My work is carried out mainly at the Beijing Future Gene Diagnosis Innovation Center and the Biomedical Frontier Innovation Center of Peking University, supported by the Beijing Municipal Education Commission; both are interdisciplinary research bases led by Professor Xie Xiaoliang. We focus on the analysis and sequencing of trace nucleic acid samples and hope to bring you more results in the near future. Thank you. (Note: the slides shown in this article were provided by the speaker and may not be used without permission.) Special thanks for preparing the manuscript: Guo Lijie and Li Min, Ph.D. students, Institute of Biophysics, Chinese Academy of Sciences. Source: NetEase Technology Report. Editor in charge: Zhang Zutao.
