In 1952, Arthur Samuel developed a checkers program at IBM. The program could learn on its own: by analyzing a large number of games, it gradually learned to distinguish good moves from bad ones, its playing level rose steadily, and it soon beat Samuel himself. In 1956, at the Dartmouth conference on artificial intelligence held 60 years ago, Samuel presented this work and coined the term "machine learning."
In the history of computer science, "machine learning" has had many definitions. The general view is that machine learning means using data and "experience" to improve the performance of an algorithmic system itself. Stanford University defines machine learning as the science of getting computers to act without being explicitly programmed. Once a "learning algorithm" builds a model from existing data, the model is applied to new data to make judgments about new situations; this is called "prediction." It can be said that machine learning is the science of learning algorithms, while artificial intelligence is the broader research and development of intelligent machines.
2016 marked the 60th anniversary of the birth of artificial intelligence. Over the past decade, with the development of big data and cloud computing (large-scale computing), machine learning has entered a golden period of development. On December 17, 2016, at the 2016 Machine Intelligence Frontier Forum, Chinese and foreign experts discussed the future development and prospects of machine learning.
Machine learning enters the golden age of development
Traditionally, artificial intelligence involves two steps: representing and expressing the data, and then predicting and making decisions through algorithms. Traditional artificial intelligence represented data in a semantics-based way, and went from data representation to prediction mostly through rule-based logical reasoning. The typical representative is the expert system, which can also be regarded as the first generation of machine learning.
For first-generation machine learning, the definition of the rules is crucial: once a rule is inaccurate or flawed, the logical reasoning built on it goes wrong. In addition, rule-based models work for shallow reasoning but cannot support deeper reasoning. This led to the second generation of machine learning, namely machine learning based on statistical models.
In the book "Machine Learning" by Professor Zhou Zhihua of Nanjing University, the development of machine learning is divided into finer stages, with the corresponding algorithms introduced for each. In the mid-1990s, "statistical learning" took the stage and quickly became the mainstream; its representative techniques were the Support Vector Machine (SVM) and kernel methods. Statistical learning became the mainstream largely because neural network research had hit a bottleneck at the time (mainly because parameters had to be set by hand), so scholars turned their attention to statistical learning.
With the rise of statistical learning came a golden decade of machine learning. Statistical learning is used not only for algorithmic modeling but also for the representation and expression of data, which lowers the requirement for domain background knowledge. For example, computer vision and image research belong to computer science itself, so the relevant background knowledge is relatively easy for computer scientists to obtain, whereas natural language processing requires knowledge of English or Chinese linguistics, which is harder for computer experts.
In a wider range of applications, statistical pattern recognition is replacing expert rules for data representation, lowering the barrier to entry for artificial intelligence and machine learning. In this way, everything from data representation to the learning algorithm to inference and prediction can be realized by machine learning algorithms, which marks the third generation of machine learning: end-to-end machine learning that goes directly from data to intelligence. Of course, with the advent of big data and cloud computing, deep learning represented by complex neural networks can also be used for data representation and expression.
It can be seen that statistical learning for numerical calculations and deep learning represented by neural networks are the two main branches of modern artificial intelligence. In the era of big data + cloud computing, these two branches have entered a new golden period of development.
Adversarial networks and dual learning
As described above, machine learning can be divided into three phases: data acquisition and generation in the first phase, the learning algorithm in the second, and inference and prediction in the third. Under big data and cloud computing, all three phases have seen new developments.
In the data acquisition and generation phase, the latest research uses adversarial networks to generate more data. That is, when more data cannot be obtained from the real world, machine learning algorithms are used to simulate realistic data for subsequently "training" the learning algorithms.
A generative model models the data itself: a model is trained with a large amount of data in the hope that it can then produce more similar data. Common generative methods include Gaussian mixture models, naive Bayes, and hidden Markov models. Take the Gaussian mixture model as an example: it has a strong ability to approximate data distributions and is well suited to probability density estimation, but it is not expressive enough to describe complex data, which is why neural networks are used instead.
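As a minimal illustration of this kind of classical generative modeling (not code from the forum), the sketch below fits a Gaussian mixture to synthetic 2-D data with scikit-learn and then samples new points from the learned density:

```python
# Minimal sketch: a Gaussian mixture as a simple generative model.
# The training data here is synthetic; all numbers are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
real_data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(500, 2)),
    rng.normal(loc=[3.0, 3.0], scale=0.8, size=(500, 2)),
])

# Fit a two-component mixture: this estimates the probability density of the data.
gmm = GaussianMixture(n_components=2, random_state=0).fit(real_data)

# "Generate" new samples from the learned density.
generated, _ = gmm.sample(200)
print(gmm.means_)       # learned cluster centers
print(generated[:5])    # a few generated points
```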
For a neural network that generates data, how do you train it so that the generated data gets closer to real data? This led to the Generative Adversarial Network (GAN): one neural network generates data, another neural network judges whether the data is real, and the difference between the two networks' results is used to optimize the data-generation model. The well-known AlphaGo training pipeline uses a similar idea, generating new games so that the computer can train through self-play.
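The adversarial training idea can be sketched in a few lines of PyTorch. The toy example below is only illustrative (1-D data, arbitrary architectures and hyperparameters), but it shows the two alternating updates: the discriminator learns to tell real from generated samples, and the generator learns to fool it:

```python
# Toy GAN sketch (illustrative only): generator vs. discriminator on 1-D data.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
discriminator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(1000):
    # "Real" data: samples from N(4, 1); the generator should learn to imitate this.
    real = torch.randn(64, 1) + 4.0
    noise = torch.randn(64, 8)
    fake = generator(noise)

    # Discriminator step: label real samples as 1, generated samples as 0.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: try to make the discriminator output 1 on generated data.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print(generator(torch.randn(5, 8)))  # samples should drift toward ~4 after training
```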
One improvement for generation networks based on deep neural networks concerns how to make use of large amounts of unlabeled real data, because training a neural network normally requires a large amount of manually labeled data, such as labeling a picture as "flower". Zhu Jun, an associate professor at Tsinghua University and an adjunct professor at Carnegie Mellon University, introduced a variety of methods for using unlabeled data.
Zhu Jun also introduced a generation network with "attention" and "memory" mechanisms for generating near-real data from a limited set of highly abstract parameters. For example, when the original neural network model abstracts the parameters of a real image, a lot of detail is lost in the process; if another neural network then generates a new image using only these highly abstract parameters, it cannot express much image detail. Therefore, while learning from the real image data in the earlier stage, a "memory" mechanism can be introduced to store some of the detail, which is later used to generate images closer to reality.
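The sketch below is not Zhu Jun's actual model; it only illustrates, with assumed shapes and random placeholder tensors, what a soft-attention "memory read" looks like: detail vectors stored while encoding the real image are queried by the generator's abstract state and fed back into generation:

```python
# Illustrative soft-attention memory read (placeholder tensors, not the real model).
import torch
import torch.nn.functional as F

hidden_dim = 16
num_slots = 10

memory = torch.randn(num_slots, hidden_dim)   # detail vectors saved while encoding the real image
query = torch.randn(1, hidden_dim)            # the generator's current (abstract) state

# Attention weights: how relevant each memory slot is to the current query.
scores = query @ memory.t()                   # shape (1, num_slots)
weights = F.softmax(scores, dim=-1)

# The memory read: a weighted sum of stored detail vectors,
# which would be fed back into the generator alongside the abstract code.
read_vector = weights @ memory                # shape (1, hidden_dim)
print(weights, read_vector.shape)
```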
Beyond the data-generation phase, unlabeled data is also needed throughout the end-to-end machine learning pipeline, that is, in the training and predictive-inference phases as well. Liu Tieyan, Principal Investigator at Microsoft Research Asia, introduced dual learning, which on the one hand allows supervised and unsupervised learning algorithms to make extensive use of unlabeled data, and on the other hand can speed up reinforcement learning algorithms.
So-called dual learning uses a pair of dual tasks to form a closed-loop feedback system, so that feedback can be obtained from unlabeled data and then used to improve the two machine learning models in the dual tasks. For example, the two dual tasks can be Chinese-to-English and English-to-Chinese translation, which involve two machine learning models. The dual-learning process translates a sentence from Chinese to English and then back from English to Chinese, and compares the difference between the reconstructed Chinese and the original Chinese to improve both models. In this scenario there is no need for labeled data: you do not need to know whether the generated Chinese sentence is "right" or "wrong"; you only need to measure the difference between the generated sentence and the original one to optimize the models.
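The closed loop can be written schematically as follows. The translation models and the similarity scorer here are hypothetical placeholder interfaces, not a real library; the point is only that the round-trip comparison yields a training signal without any labels:

```python
# Schematic dual-learning step; model objects and `similarity` are hypothetical placeholders.
def dual_learning_step(zh_sentence, zh_to_en_model, en_to_zh_model, similarity):
    """One round-trip update using a single unlabeled Chinese sentence."""
    # Primal task: Chinese -> English (no reference translation needed).
    en_hypothesis = zh_to_en_model.translate(zh_sentence)

    # Dual task: English -> Chinese, closing the loop.
    zh_reconstruction = en_to_zh_model.translate(en_hypothesis)

    # Feedback signal: how close the reconstruction is to the original sentence.
    # This reward requires no human-labeled data.
    reward = similarity(zh_sentence, zh_reconstruction)

    # Both models are nudged to increase the round-trip reward
    # (e.g. with a policy-gradient style update).
    zh_to_en_model.update(reward)
    en_to_zh_model.update(reward)
    return reward
```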
The challenges of deep question answering
Since IBM Watson won the quiz show "Jeopardy!" in 2011, deep question answering has drawn worldwide attention as an important application of artificial intelligence and machine learning. Deep question answering involves not only natural language processing (NLP) but also techniques for analyzing and understanding questions, evaluating candidate answers, choosing answer strategies, and fast retrieval and search.
Human-machine question answering can be traced back to the Turing test, which itself takes the form of questions and answers. Template-based QA appeared around the end of the 1960s, information-retrieval-based QA (IR-based QA) around 1990, and community-based QA around 2000, followed by QA based on large-scale knowledge bases (large-scale KB-based QA) and today's reading-comprehension QA.
Liu Kang, an associate researcher at the Institute of Automation of the Chinese Academy of Sciences, introduced deep semantic analysis based on knowledge bases, as well as deep semantic understanding and knowledge reasoning for reading comprehension. In knowledge-base-based QA, the core problem is how to map the question text onto the knowledge base, and in particular how to eliminate ambiguity. Modern machine learning methods build distributed representations of the question text and of the knowledge base separately, and then establish a matching model based on deep neural networks between the two to achieve a precise mapping. The reason for modeling questions and knowledge in a distributed way is mainly that the information on the modern Internet is massive, so understanding question text and understanding knowledge both require large-scale distributed modeling.
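A toy sketch of this idea (with random vectors standing in for embeddings that a deep neural network would learn) is to represent the question and each candidate knowledge-base fact as vectors and rank the facts by similarity:

```python
# Illustrative question-to-KB matching with distributed representations.
# The embeddings here are random stand-ins; in a real system they come from a trained network.
import numpy as np

rng = np.random.default_rng(0)
dim = 64

question_vec = rng.normal(size=dim)                  # embedding of the question text
kb_facts = ["(Beijing, capital_of, China)",
            "(Paris, capital_of, France)",
            "(Yangtze, flows_through, China)"]
kb_vecs = rng.normal(size=(len(kb_facts), dim))      # embeddings of candidate facts

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Score every candidate fact against the question and pick the best match.
scores = [cosine(question_vec, v) for v in kb_vecs]
best = int(np.argmax(scores))
print(kb_facts[best], scores[best])
```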
Reading-comprehension QA is another form of question answering: given a passage, answer questions based on the information in that passage. This requires synthesizing several sentences in the passage, followed by deep reasoning and joint reasoning. Current deep learning methods cannot yet solve the challenge of knowledge reasoning, nor can they replace the symbol-based logical reasoning of traditional artificial intelligence. In addition, solving reading comprehension actually requires common-sense knowledge, but where is the boundary of common-sense knowledge, and will it change over time? These are the challenges of reading-comprehension QA.
Liu Kang believes that deep question answering is shifting from traditional symbolic information-retrieval matching to deep semantic understanding: it requires an accurate understanding of the text content, and reasoning is becoming more and more important. At the same time, the question-answering process requires the support of knowledge bases, especially common-sense knowledge bases. In addition, in an open environment, users' questions are complex and diverse, and a single knowledge base is often insufficient; multiple knowledge sources must be used jointly and comprehensively.
Li Lei, scientist and technical director of the Toutiao (Today's Headlines) Lab, introduced a QA model, Conditional Focused Neural Question Answering (CFO). At the 2016 annual meeting of the Association for Computational Linguistics (ACL), Li Lei et al. published "CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases." CFO mainly addresses single-answer QA over open knowledge bases. The best previous result was Facebook's automatic question-answering algorithm, with an accuracy of 62.9% on Facebook's public SimpleQuestions dataset (containing 108K English questions, the largest published single-answer question set). CFO achieves 75% accuracy on this dataset, opening up a gap of about 12 percentage points over Facebook's result.
Toutiao (Today's Headlines) also launched the news-writing robot Xiaomingbot in 2016, an artificial intelligence robot based on big data, natural language understanding, and machine learning. Xiaomingbot automatically wrote more than 450 articles during the Rio Olympics in August 2016, generating more than 1 million reads, close to the readership of the platform's human sports journalists over the same period. Xiaomingbot's articles include both short bulletins and long reports built from the timeline of a game, and it can also automatically add images.
Of course, any discussion of deep question answering must mention IBM Watson. Su Zhong, director of big data and cognitive computing research at IBM Research China, introduced the massively parallel probabilistic algorithm model Watson used in "Jeopardy!". IBM now exposes Watson's algorithmic capabilities externally through the Bluemix-based IBM Watson Developer Cloud service.
Building a model of intelligent social organization
At the 2016 Machine Intelligence Frontier Forum, perhaps the most interesting machine learning frontier, and one with both social and commercial value, was the research on modeling intelligent social organizations presented by Professor David Wolpert of the Santa Fe Institute in the United States. He is an IEEE Fellow and the author of three books and more than 200 papers, covering physics, machine learning, game theory, information theory, thermodynamics, and distributed optimization. David Wolpert proposed the "no free lunch" theorem in 1996, which is now widely used in machine learning.
David Wolpert pointed out that, in addition to individual human beings, human society as a whole can also be regarded as an agent. There are many interactions and activities within human society; society can process information, communicate internally, and quickly work out where information can be obtained and how it can be used. Today the intelligence of human society as a whole already exceeds that of any individual brain, which makes research on intelligent social organizations all the more meaningful, although it remains a brand-new field with only preliminary results.
From the perspective of the development of human society, once human organizations reach a certain scale they become hard to expand. When a company grows, it accumulates more and more layers, and it becomes harder to pass information accurately from the top layer to the bottom; as information passes through the middle layers it gets distorted, which hinders further expansion of the enterprise. Of course, people keep trying to innovate on organizational structure. Google, as a representative modern Internet company, adopted a flat organizational structure in its early days, but as it continued to grow it eventually moved to a multi-level corporate structure.
So how can a company scale up while staying flat, so that the efficiency of employee communication is not degraded as layers are added? This requires finding better ways for information to flow within the organization. In David Wolpert's simulations of corporate structure, the employees' social relationships are treated as an information communication network, with employees analogous to nodes in the network. A network engineer's task is to optimize and design the communication paths of such networks through algorithms; a similar approach is brought into the study of intelligent social organizations, namely how to realize a more intelligent way of exchanging information.
When researching and designing intelligent social organizations, many existing theories need to be considered. One famous example is Dunbar's number, proposed by the British anthropologist Robin Dunbar: the theoretical limit on the number of people with whom an individual can maintain stable social relationships, usually taken to be about 150. Within a given time frame, a person's ability to listen to others and to say different things to different people is limited by the capacity of the cerebral cortex; beyond this range, external mechanisms are needed to maintain the group's stability and cohesion.
Therefore, applying the coding and design principles of communication networks to social organizations such as enterprises requires answering three questions: given the size of the organization and its communication capacity (analogous to network scale and link capacity), what is the maximum amount of information that can be conveyed to the workers (analogous to the network's senders and receivers)? How should middle managers (analogous to intermediate network nodes) pass information on so that workers receive as much of it as possible (analogous to maximizing network throughput)? And how can the payoff of middle managers within the enterprise be increased?
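As a rough illustration of this network analogy (not Wolpert's actual model), one can encode an organization as a flow network and ask how much information per unit time can reach a given worker, for example with networkx:

```python
# Illustrative sketch: an organization as a flow network. Names and capacities are made up.
import networkx as nx

org = nx.DiGraph()
# Edges: (from, to, communication capacity).
org.add_edge("CEO", "manager_A", capacity=4)
org.add_edge("CEO", "manager_B", capacity=3)
org.add_edge("manager_A", "worker_1", capacity=2)
org.add_edge("manager_A", "worker_2", capacity=2)
org.add_edge("manager_B", "worker_2", capacity=3)

# Maximum amount of information per unit time that can reach worker_2.
flow_value, flow_dict = nx.maximum_flow(org, "CEO", "worker_2")
print(flow_value)   # 5: limited by the capacities of the intermediate managers
print(flow_dict)    # how the flow is routed through each manager
```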
To this end, David Wolpert has studied a variety of mechanisms, in particular the way mail is transmitted in a network: in a complex network environment, a message can be moved quickly from sender to receiver by splitting it into small packets, relaying those packets rapidly through different networks and hosts, and having the terminal node reassemble the packets into the original message.
David Wolpert's research yields some interesting findings. For example, social media can act as a third-party public node that stores part of the information, ensuring that part of an organization's information is transmitted accurately and efficiently to every level; each level then combines it with other information transmitted internally to accurately and individually "restore" the original information. This is in effect a model of the socialization of the organization: some nodes are moved from inside the organization out into society, and part of the organization is correspondingly socialized. From another angle, this theory also confirms the logic behind the rise of social media.