We are short Genshuixue because we believe it is an almost complete fraud.
In addition, a former Genshuixue manager confirmed our analysis and explained in detail how Genshuixue runs its extensive bot operations.
Because almost all of the users are fake, we assume that the fraudulent share of Genshuixue's revenue is at least equal to the percentage of fraudulent users. We would not be surprised if the ASP of the real part of Genshuixue's business is also fraudulently inflated.
We conclude that Genshuixue is running a huge loss, because without users there is no revenue. We also conclude that Genshuixue understates its costs.
Chen Xiangdong, chairman of the board of directors, has made Genshuixue stock even more dangerous for the bulls: he has pledged at least $318 million of stock. The risk for long-term holders of Genshuixue stock is that margin lenders sell the shares on a large scale, causing a sharp drop in the share price.
We are highly confident that at least 73.2% of the 54,065 users we analyzed are bots, and the share is probably at least 80.8%.
Last month, Chairman Chen curiously tried to dissuade us from shorting Genshuixue. In an interview with Chinese media on April 8, he said: "I think that if they analyze our data carefully, Muddy Waters will not be so stupid. The people at Muddy Waters are quite capable and intelligent."
This is obviously a bluff.
Bot detection and identification
We analyzed 463,217 login records from the first half of 2020, covering 54,065 users of Genshuixue and Gaotu Classroom across more than 200 paid K-12 courses. We identified three bot patterns in which we have high confidence; together they account for 73.2% of all unique users. Adding a fourth bot pattern raises the share to 80.8%. If certain assumptions are changed to be less favorable to the company, the share of bots approaches 90%. These fake users are evidently controlled by Genshuixue's teachers and tutors, as well as by third parties.
We classify a user as a bot by combining four bot patterns. We confirmed our observations with a former Genshuixue manager, who provided further details about who carries out the user fraud and how. We call the four types of bot users precise joiners, burst joiners, GSX IP joiners, and early joiners.
Among the 54,065 unique users analyzed, the login records of 5,742 users (10.6%) were consistent with the precise joiner pattern. Notably, every one of these precise joiners also exhibits at least one of the other bot behaviors we discuss, which strongly validates this approach to identifying bots.
Although most precise joiners recorded only one precise join (logging in on two different occasions at exactly the same second), 1,261 (21.6%) unique precise joiners did so on two or more occasions, and one of them recorded nine precise joins.
We then combined these precise joiners with another 33,145 users: users sharing the same IP address and users joining a class at exactly the same time (usually as part of a burst). After subtracting 10,342 duplicates, we consider 28,545 users (52.8%) to be fake.
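The 52.8% figure follows from a simple inclusion-exclusion count. The short check below uses only the numbers quoted above; the variable names are ours and purely illustrative.

```python
# Figures quoted in the text; variable names are illustrative only.
total_users = 54065
precise_joiners = 5742
other_flagged = 33145   # shared-IP and same-time users
duplicates = 10342      # users counted in both groups

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
flagged = precise_joiners + other_flagged - duplicates
share = flagged / total_users

print(flagged)         # 28545
print(f"{share:.1%}")  # 52.8%
```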
After adding GSX IP joiners, the total number of high-confidence bot users reaches 34,534, or 63.9%. GSX IP joiners are accounts that purport to be unique student users but can be linked to teachers or tutors, or to users associated with them, because they share the same IP address. Students should not share an IP address with teachers or tutors, because Genshuixue no longer operates offline physical schools or learning centers. However, we found that 15,239 student users (28.2%) shared an IP address with a teacher or tutor at least once. The former Genshuixue manager confirmed that some teachers and tutors operate bot networks. Almost two-thirds of GSX IP joiners are also precise joiners, which reinforces our conclusion.
In addition, another 1,364 unique users are associated with these 15,239 student users through shared IP addresses, bringing the total number of unique users identified in this way to 16,603, or 30.7%.
To reinforce our conclusion, 62.8% of burst joiners exhibited at least one other high-confidence bot behavior. We are highly confident that a burst of joins occurring five minutes or more before or after the start of a class represents a group of bots logging in at once. We believe the five-minute cutoff is generous to the company: the real number of bots joining in bursts during the excluded period may be very large.
The burst joiner pattern for a given class can be shown in a chart. The y-axis is time in seconds, and the x-axis represents each unique user. A burst appears on the chart as a long horizontal line. The figure below shows the join pattern of a paid upper-primary math class that had been running on the Genshuixue platform for several months. The pattern is consistent with those of the other classes in our dataset. (Note that the horizontal white line marks the start of the class.)
At burst point 1, 104 unique users joined within four seconds, nine minutes and 40 seconds before the start of the class. Six of them are precise joiners.
Between five seconds before class and three seconds after class, 648 users (including 37 precise joiners) joined. Yet we did not count two-thirds of these users as bots, because they did not join as precise joiners in the same second.
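One way to sketch burst detection is a sliding window over join times. This is a minimal illustration, not the report's stated method: the window length, the minimum group size, and the exclusion of joins within five minutes of class start are our assumptions, and the sample data is synthetic, built only to mirror the 104-user burst described above.

```python
from bisect import bisect_right

def find_bursts(join_offsets, window=4, min_users=100, exclude_within=300):
    """Return (first_offset, count) for each burst of joins.

    join_offsets: whole seconds relative to class start (negative = before).
    window, min_users, exclude_within: hypothetical thresholds.
    """
    # Assumption: ignore joins closer than `exclude_within` seconds to class
    # start, when real students would also be arriving.
    times = sorted(t for t in join_offsets if abs(t) >= exclude_within)
    bursts = []
    i = 0
    while i < len(times):
        # Index just past the last join inside the window starting at times[i]
        j = bisect_right(times, times[i] + window - 1)
        if j - i >= min_users:
            bursts.append((times[i], j - i))
            i = j  # skip the whole burst so it is reported once
        else:
            i += 1
    return bursts

# Synthetic data: 104 joins spread over four seconds starting 9 min 40 s
# (580 s) before class, plus a few scattered joins.
joins = [-580 + (k % 4) for k in range(104)] + [-300, -120, -45, 10, 95]
print(find_bursts(joins))  # [(-580, 104)]
```

Sorting once and using a binary search keeps the scan cheap even for classes with thousands of joins.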
Early joiners are users who log in to an online class well ahead of time, which we think makes them likely to be fake. We set the cutoff at 30 minutes or more before class. In the physical world it is normal to see some students in the classroom more than 30 minutes early, but we do not expect that online; it would be like logging in to a video conference more than 30 minutes early. At Genshuixue, however, early joiners are not unusual.
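The early joiner rule reduces to a single time comparison. The sketch below assumes join and class start times are available as datetime values; the function name and the sample times are made up for illustration.

```python
from datetime import datetime, timedelta

EARLY_CUTOFF = timedelta(minutes=30)  # cutoff stated in the text

def is_early_joiner(join_time, class_start, cutoff=EARLY_CUTOFF):
    """True if the user joined at least `cutoff` before class start."""
    return class_start - join_time >= cutoff

start = datetime(2020, 3, 1, 19, 0, 0)  # made-up class start
print(is_early_joiner(datetime(2020, 3, 1, 18, 25, 0), start))  # True
print(is_early_joiner(datetime(2020, 3, 1, 18, 50, 0), start))  # False
```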
Group control: confirmation from the former manager
A former Genshuixue manager confirmed our observations about the fake user patterns. He showed detailed knowledge of Genshuixue's bot operations, which he said began in 2015.
He said Genshuixue uses software nicknamed "group control" to control its bot network. Group control, he said, clearly boosts attendance. The ability to control the bots' login patterns suggests that Genshuixue may be thinking about how to camouflage its bot activity.
The back end of group control evidently has tools for managing "student" attendance, such as scheduling bot logins and setting login patterns. In a typical bot farm, one or more servers control 500 to 1,000 or more mobile phones (IMEIs). Each device has its own mobile number and WeChat account and is programmed to buy products, attend classes, and so on.
Genshuixue also uses outside companies to operate bot networks. According to the former manager, these companies usually receive a commission of about 2% to 5%, depending on the tasks required. Some of these companies offer classes; others sign up for classes and pay for them. Genshuixue evidently provides the cash needed to make the transactions look legitimate, and records most of the cost of generating these bots as sales and marketing expenses or as cost of sales. The former manager named three independent companies that supply bot users to Genshuixue, including Weishi (a Genshuixue application) and Baijia Youlian (a company in which Genshuixue holds a 30% stake).
Here is an uninterrupted 2.5-minute clip that provides some particularly interesting details:
Former manager: Everyone has their own machine room. One machine room has more than ten thousand of these machines, which we call group-control bots. One person can easily control more than a thousand mobile phones. I can control all the machines remotely or from the machine room, and simulate real students or real shopping data. This is already very mature technology.
And at Genshuixue? Does that mean they have a small team that operates this?
Former manager: Yes. There has always been a team.
Former manager: No, we have had it since 2015. We were doing O2O at that time and drove a lot of traffic to institutions. There were very few students then, and we wanted teachers to feel that many people were attending their classes, especially at the beginning. We had this technology: for example, if only five students had signed up, the remaining 500 were bots that we sent in to attend the class online, so that the traffic looked very large and they felt the platform had a lot of traffic. That was from the beginning, in 2015.
How does that work? I don't know much about this. Do they get a code to buy courses for free? Otherwise, how does such a small company pay the tuition?
Former manager: Yes, a commission of 20,000 yuan. You have to purchase my courses through these virtual mobile phone numbers or WeChat accounts. That is how it works. That is one part of it, and it costs Genshuixue at least 2%, right? That is the small part. The other part is cooperating with some teachers. For example, I give you one million yuan, which I book as a marketing fee for selling your courses, and then you have to buy back one million yuan of courses. Why do these small institutions do this? Because you are helping me carry out this operation. When you join the platform, or when a small teacher's class is just starting out, I can promote you on the platform for free or give you some advertising space, but you have to buy back one million yuan. Usually no money is actually handed over. You put one million yuan back in, and then I supply you with advertising or return the money to you some other way. That is the pattern. Genshuixue does not lose money this way. The one million yuan just goes around in a circle, and I trade advertising space for the fake orders you place.
Summary of data collection and analysis methods for students and bots
The Genshuixue platform has two parts that students can use: a website and a desktop application. After registering and logging in, a user can open the Chrome developer tools, open the Network tab, and filter by XHR to see the data exchanged between the browser and the Genshuixue website. This data contains a large amount of information, including the archive for each purchased class, which is the data we used in our analysis of bot activity on the two platforms. No special tools or techniques were used for the analysis.
In design and function, Gaotu Classroom is very similar to the Genshuixue website and shares some domain names and resources with it. After setting up an account and purchasing classes, some of the data flows visible in the browser are very similar to Genshuixue's. However, the Gaotu Classroom data does not directly reveal the reference to the class files, so an additional step is needed: install the Gaotu Classroom app on an iPhone and configure the device to send its traffic through an intercepting HTTP proxy. This lets our data analysts view the traffic between the phone and the Genshuixue servers, see what data is passed back and forth, and determine the path of the class files. After locating the class files, we downloaded and inspected them one by one.
We were surprised to find that the files contain not only courseware but also a great deal of information about users, including:
1. User number
4. Avatar
5. User type (0, 1, 2)
6. Course ID and/or class ID (only the class ID is available in Gaotu Classroom)
7. Time of joining and leaving the class
8. IP address
In the class data we find the user number (also shown in the class attendance records) as well as the user type. By cross-referencing the class data with the teacher and tutor pages on the Genshuixue website, we found that in our database 100% of the teachers listed on the website are marked as type 1 users and 100% of the tutors listed on the website are marked as type 2 users. For a more in-depth explanation of this approach, see Appendix 2.
Our user database contains 29 type 1 users (teachers), 371 type 2 users (tutors), and 53,694 type 0 users. Having established that type 1 and type 2 users are Genshuixue's teachers and tutors, we concluded that the remaining type 0 records are all student users (neither teachers nor tutors).
On March 3, 2020, Chairman Chen Xiangdong pledged 6 million Class B common shares through his entity, ebeter International Group Ltd. This is equivalent to 9 million American depositary receipts with a market value of 319 million yuan. Given that Genshuixue is an almost complete fraud, the pledge exposes long-term holders of Genshuixue stock to a greater risk of sudden loss. We do not rule out the possibility that he has pledged other shares as well.
Appendix 1: Methods for analyzing student and bot activity
Genshuixue
The platform has both a website and a desktop application that students can use. We did not analyze any mobile application for this platform because there was no need to. After registering and logging in, a user can open the Chrome developer tools, open the Network tab, and filter by XHR to see the data exchanged between the browser and the Genshuixue website. This data contains a large amount of information, including the archive of each class the user has purchased, which is the data we used in our analysis of bot activity on the two platforms.
To find the archive, complete the following steps in Chrome:
1. Log in to your account.
2. Click on any closed course.
3. Click the course view button (see the screenshot below), and then select play V2 in the Network view of the Chrome developer tools. Now look for the pcurl value for each course and copy each one.
5. Click the getplaybackinfov2 item in the network view to view the data on the right panel.
6. Continue through the data until you see the package_signal entry. This contains the URL of the archive file.
7. Download the compressed file and view its contents with 7-Zip on Windows or the built-in archive utility on Mac OS X.
Gaotu Classroom
In design and function, Gaotu Classroom is very similar to the Genshuixue website and shares some domain names and resources with it. Once a user has registered an account and purchased a class, some of the data flows visible in the browser look much like those on the Genshuixue website. However, the website contains no reference to the class files, so further analysis was needed.
We installed the Gaotu Classroom app on an iPhone and configured the phone to send its traffic through an intercepting HTTP proxy. This let us observe the data passed back and forth between the phone and the Genshuixue servers. From this analysis we determined that Gaotu Classroom archives use the same path as the Genshuixue website archives.
After locating the class files, we downloaded the file for each purchased class, opened it, and inspected the contents. We were surprised to find not only class information but also details such as student information and join times.
Opening and inspecting the all.json file yields information about each student, including user number, IP address, and timestamp. The file also records when students left the classroom, along with information about the teachers' actions.
When you view a recorded class, the browser also retrieves the all.json file.
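Reading such a file takes only the standard library. The text does not give the actual all.json schema, so the sample below and every key in it (user_number, user_type, ip, joined_at) are hypothetical placeholders; this is a sketch of the idea, not the format of the real files.

```python
import json

# Made-up sample. The real all.json schema is not given in the text,
# so every key below is a hypothetical placeholder.
sample = """
[
  {"user_number": 1001, "user_type": 0, "ip": "203.0.113.7", "joined_at": 1583060400},
  {"user_number": 1002, "user_type": 1, "ip": "203.0.113.7", "joined_at": 1583060100}
]
"""

def load_join_records(text):
    """Parse a JSON array into (user_number, user_type, ip, joined_at) tuples."""
    return [
        (r["user_number"], r["user_type"], r["ip"], r["joined_at"])
        for r in json.loads(text)
    ]

records = load_join_records(sample)
print(records[0])  # (1001, 0, '203.0.113.7', 1583060400)
```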
Identify teachers by user number
On the Genshuixue website, each teacher has a personal home page. Gaotu Classroom is different: it only shows large pictures of all the teachers on its platform, and clicking a teacher's profile picture takes you directly to their profile page.
Note that the number shown in the URL is the teacher's user number. Every user on the Genshuixue website and in Gaotu Classroom has a unique user number.
Here we can see that the teacher's user number is 330361798. We can then look up this number in the class records in our database.
Identify all teachers, mentors, and students in the dataset
Once we have a user number, we can query the class database directly to see which classes the user attended and their join and leave patterns for those classes. When we looked at the records of teachers and tutors from the website, we noticed that a user type field appears in the data from both genshuixue.com and gaotu100.com.
We then ran a query for each user type (0, 1, and 2) and cross-checked all users to see which of them matched an IP address used by type 1 or type 2 users (known teachers and tutors). Examples of query results for types 1, 2, and 0 follow.
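The cross-check described here can be sketched in a few lines. The record layout (user_number, user_type, ip), the function names, and the sample rows are our assumptions for illustration; the text does not describe the actual database or queries.

```python
from collections import Counter

def students_sharing_staff_ips(records):
    """Return type 0 users seen on an IP also used by a type 1 or 2 user.

    records: iterable of (user_number, user_type, ip) tuples (assumed layout).
    """
    records = list(records)  # accept any iterable; it is scanned twice
    staff_ips = {ip for _, user_type, ip in records if user_type in (1, 2)}
    return {
        user for user, user_type, ip in records
        if user_type == 0 and ip in staff_ips
    }

def users_per_type(records):
    """Count distinct users of each type."""
    distinct = {(user, user_type) for user, user_type, _ in records}
    return Counter(user_type for _, user_type in distinct)

# Made-up sample rows
rows = [
    (1, 1, "198.51.100.5"),   # teacher
    (2, 2, "198.51.100.9"),   # tutor
    (10, 0, "198.51.100.5"),  # student on the teacher's IP
    (11, 0, "198.51.100.9"),  # student on the tutor's IP
    (12, 0, "192.0.2.44"),    # student on an unrelated IP
]
print(sorted(students_sharing_staff_ips(rows)))  # [10, 11]
print(dict(sorted(users_per_type(rows).items())))  # {0: 3, 1: 1, 2: 1}
```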
Type 1 (teacher) query results:
In the figure above, user number 813942178 corresponds to the teacher Zhang Zhen. Our data also includes an avatar field (shown top-down for formatting), which holds the account's picture. For this user, the avatar URL is: https://imgs.genshuixue.com/176512378_yc9r2tpn.png
This matches the teacher's face and profile image on the teacher's page on Genshuixue.com.
Type 2 (tutor) query results:
In the figure above, the user number is 330361798 and the tutor is named B Yue Yuhao ~ Xiaoyu teacher. The B appears to indicate assignment to the Beijing office. The user's image URL is: https://imgs.genshuixue.com/114226777_z0vg9fku.jpeg
This matches the face and profile image shown on the corresponding page on Genshuixue.com.
We cannot find student profiles the way we find teachers. Instead, we used comments collected from Genshuixue.com to help verify that type 0 users are students. We collected 948,158 comments and found that 5,789 of the commenters appear in our user activity database, having written a total of 29,245 comments. Of these users, 5,787 (99.9%) were students (type 0), and only 2 (0.03%) were teacher (type 1) accounts. We found no comments from tutors (type 2). We are therefore highly confident that type 0 users are students.
HKEx Information Services Limited, its holding companies and/or any of their subsidiaries make every effort to ensure the accuracy and reliability of the information provided, but cannot guarantee its absolute accuracy and reliability, and will not be liable for any loss or damage caused by any inaccuracy or omission (whether in tort, in contract, or otherwise).
Source: Wang Xiaowu, editor of Zhitong Finance and Economics Network