Media Watch: Algorithmic news, embracing artificial intelligence while guarding against "technological hegemony"

  Editor’s note: The rise and wide application of artificial intelligence has increasingly placed us in a world surrounded by big data and algorithms. The power of algorithms is everywhere: from quantitative stock trading to music composition, from recommendation engines on shopping websites to autonomous driving. An algorithm can decide whether a person’s loan application is approved, and it can also decide which stories are pushed to you when you open your phone to browse the news. In the first issue of Media Watch in 2019, Ao Peng, a Ph.D. candidate at the School of Journalism and Communication, Peking University, surveys cutting-edge applications of algorithms in the current European and American digital media environment, and discusses how they affect and change the news production process, as well as problems that cannot be ignored along the way, such as news value judgment, objectivity and algorithmic responsibility.

  Algorithmic news: embracing artificial intelligence while guarding against "technological hegemony"

  Ao Peng

  As a brand-new intermediary in news production, the algorithm has brought a paradigm shift to the production of public information and knowledge that journalism represents. It has brought both breakthroughs and challenges to the methods and concepts of traditional journalism, and it prompts us to rethink what news is in a new digital environment, and how journalism, which carries the function of producing public knowledge, should embrace change while holding firmly to its principles.

  Anchoring news: finding factual leads in massive information

  In finding news leads, the algorithm acts as a data-driven radar. It mines data through machine learning techniques such as real-time monitoring and cluster analysis, helping reporters quickly lock onto valuable information in a complicated information environment. Through quantitative analysis, it can cut through surface clutter to expose features or problems hidden deep in the data, and direct human journalists toward leads worth pursuing, which in turn produces more meaningful reports. For example, the BBC’s R&D lab has developed an application called Data Stringer, hosted on GitHub, that helps journalists monitor real-time updates and changes in different databases and alerts them when, say, unemployment or the crime rate surges in a certain area at a certain time. Such alerts have become a key first link in the news production chain. Reuters has developed Tracer, a dedicated social media monitor that uses a range of data mining techniques to help journalists follow large-scale content trends on social media in real time. Beyond monitoring and early warning, algorithms can also uncover unexpected leads through systematic, in-depth analysis of routine data. The best-known case is The Tennis Racket, BuzzFeed News’ 2016 investigation into match fixing in tennis. The reporters analyzed betting data and match data from 26,000 professional tennis matches played between 2009 and 2015, and found evidence of players’ cheating in anomalies in the data. In this process, the algorithm provided a more objective and reliable empirical basis for unearthing valuable leads.
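
  The kind of monitoring described above, flagging a sudden surge in a regional statistic, can be illustrated with a minimal sketch. It is not the BBC’s or Reuters’ implementation: the function name `find_surges`, the window size, the threshold and the sample figures are all assumptions made up for illustration. The sketch flags any value that lies more than a chosen number of standard deviations above the mean of the preceding window.

```python
from statistics import mean, stdev


def find_surges(values, window=6, threshold=3.0):
    """Return indices whose value is far above the preceding window's mean.

    Hypothetical helper for illustration; `window` and `threshold`
    are arbitrary assumptions, not settings from any newsroom tool.
    """
    surges = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any increase at all counts as a surge.
            if values[i] > mu:
                surges.append(i)
            continue
        z_score = (values[i] - mu) / sigma
        if z_score > threshold:
            surges.append(i)
    return surges


# Made-up monthly counts of reported incidents in one area.
monthly_counts = [102, 98, 105, 101, 99, 103, 100, 104, 160, 102]
print(find_surges(monthly_counts))  # -> [8]
```

  A real monitor would also have to handle seasonality and revisions to the underlying data; the point here is only that an alert of this kind is a starting point for a reporter, not a finding.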

  Deep learning: precise analysis and verification of materials

  Algorithm-led deep mining helps journalists understand and manage ever larger volumes of data and material. It can offer a new reporting angle, support a deep and comprehensive analysis of events, and verify the reliability of source information. In current practice there are three main types: supervised machine learning, unsupervised machine learning and reinforcement learning. Supervised machine learning relies on labeled data to build classification and regression models, which reveal relationships within the data and help journalists dig into the reality behind events and arrive at a more distinctive interpretation. For example, in 2016 The Atlanta Journal-Constitution mined and analyzed more than 100,000 institutional documents on sexual misconduct by doctors, and found that it was common for doctors to continue practicing normally after such misconduct. Unsupervised machine learning does not depend on preset labels; it can reveal unexpected connections and uncover structure hidden in seemingly unrelated information. Reinforcement learning (reward learning) tries to maximize a reward function each time the algorithm makes a decision, so as to find the best action in a given situation; it can be applied, for example, to testing different headlines to find the most effective one. These three types all play an important role in analyzing news material. Applied separately or in combination, they support the analysis and processing of material, trend prediction and fact checking during news production, and improve the accuracy and depth of reporting.
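
  The headline-testing use of reinforcement learning mentioned above can be sketched with an epsilon-greedy bandit, one of the simplest reward-maximizing strategies. This is an illustrative assumption about how such a test might work, not a description of any newsroom’s system: the class name `HeadlineBandit`, the headlines and the simulated click probabilities are all invented. Each headline is treated as an "arm", a click counts as a reward of 1, and the algorithm mostly shows the headline with the best observed click rate while occasionally trying the others.

```python
import random


class HeadlineBandit:
    """Epsilon-greedy bandit: each headline is an arm, a click is reward 1."""

    def __init__(self, headlines, epsilon=0.1, seed=0):
        self.headlines = list(headlines)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {h: 0 for h in self.headlines}
        self.clicks = {h: 0 for h in self.headlines}

    def click_rate(self, headline):
        shown = self.shows[headline]
        return self.clicks[headline] / shown if shown else 0.0

    def choose(self):
        # Explore a random headline with probability epsilon,
        # otherwise exploit the best observed click rate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.headlines)
        return max(self.headlines, key=self.click_rate)

    def record(self, headline, clicked):
        self.shows[headline] += 1
        self.clicks[headline] += int(clicked)

    def best(self):
        return max(self.headlines, key=self.click_rate)


# Simulated readers: the true click probabilities are invented for the demo.
true_rates = {"Headline A": 0.05, "Headline B": 0.20, "Headline C": 0.08}
bandit = HeadlineBandit(true_rates, epsilon=0.1, seed=42)
reader = random.Random(7)
for _ in range(5000):
    shown = bandit.choose()
    bandit.record(shown, reader.random() < true_rates[shown])
print(bandit.best(), bandit.shows)
```

  Note that the reward here is clicks alone. What counts as the "best" headline is therefore a design decision made by people, which is exactly the kind of value judgment discussed later in this article.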

  Beyond in-depth data analysis, the algorithm’s deep mining capability is now even more widely used for information and fact verification during news production, helping to assess the authenticity of news and sources and to identify fake news. A semantic analysis system developed in 2018 by research teams at the University of Michigan and the University of Amsterdam identified false news with an accuracy of up to 76%, whereas humans identify false news with an accuracy of about 70%. Techniques for verifying the many forms of information found in the digital age are also advancing. The machine learning algorithms developed in the InVID (In Video Veritas) research project can help identify false images and videos circulating online, with an accuracy rate of 92%. However, relying entirely on algorithms to check information remains very challenging. Although fact-checking organizations such as PolitiFact, FactCheck.org and Full Fact are actively exploring automatic verification, the most effective and widely used method at present is still cooperation between algorithms and people: automation assists manual verification and makes it possible to handle large volumes of information efficiently.
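
  The division of labor described above, in which automation handles volume and people make the final call, can be sketched as a simple triage step. Everything in this sketch is hypothetical: the function name `triage_claims`, the thresholds and the example claims are invented, and the classifier that would produce the scores is assumed to exist and is not shown.

```python
def triage_claims(scored_claims, low=0.2, high=0.8):
    """Split claims into three queues by a model's 'likely false' score.

    `scored_claims` is a list of (claim_text, score) pairs, where score is
    a hypothetical classifier's estimated probability that the claim is
    false. The uncertain middle band goes to human fact-checkers first.
    """
    queues = {"likely_true": [], "needs_human_review": [], "likely_false": []}
    for text, score in scored_claims:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range for claim: {text!r}")
        if score < low:
            queues["likely_true"].append(text)
        elif score > high:
            queues["likely_false"].append(text)
        else:
            queues["needs_human_review"].append(text)
    return queues


# Invented claims and scores, for illustration only.
example = [
    ("City council approved the budget on Monday", 0.05),
    ("Unemployment tripled last month", 0.93),
    ("The new bridge cost twice the estimate", 0.55),
]
for queue, claims in triage_claims(example).items():
    print(queue, claims)
```

  Given the accuracy figures quoted above, even the two "likely" queues would still need spot checks by people; the sketch only shows how automation can decide where human attention goes first.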

  Automatic reporting: faster, wider and better news production

  If locating leads and analyzing material in depth are merely instrumental support for news production, automated news writing directly produces finished news, and it is here that the algorithm has hit traditional news production hardest.

  First, algorithm-led automated reporting speeds up news production. In recent years, automated "robot" news writing has been widely used in finance, sports, weather forecasting, breaking news and other fields where the content is simple and must be disseminated quickly. In weather forecasting in particular, it has a history of more than 20 years.

  Second, automated news greatly broadens the coverage of news media. In 2017 the Associated Press, a veteran news agency with a history of more than 170 years, generated more than 3,700 reports by algorithm in each quarterly earnings season, covering most American stocks with a market value of $75 million. That is 10 times the number produced without automation, and it greatly broadens the scope and variety of coverage compared with the traditional model. The algorithm brings to the foreground many subjects that journalists, with their limited time and energy, could not attend to, and gives those subjects a chance to reach an audience.
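
  Automated writing of this kind is commonly built on templates filled from structured data. The following is a minimal sketch of that idea under stated assumptions, not the Associated Press’s actual system: the function name `earnings_story`, the sentence template and the figures for the fictional company are all invented for illustration.

```python
def earnings_story(company, quarter, revenue, prior_revenue, eps):
    """Fill a fixed sentence template from structured earnings figures.

    Revenue figures are in millions of dollars; `eps` is earnings per share.
    """
    if prior_revenue <= 0:
        raise ValueError("prior_revenue must be positive")
    change = (revenue - prior_revenue) / prior_revenue * 100
    if change > 0:
        trend = f"rose {change:.1f}%"
    elif change < 0:
        trend = f"fell {abs(change):.1f}%"
    else:
        trend = "was unchanged"
    return (
        f"{company} reported {quarter} revenue of ${revenue:,.0f} million, "
        f"which {trend} from a year earlier. "
        f"Earnings per share were ${eps:.2f}."
    )


# Invented figures for a fictional company.
print(earnings_story("Example Corp", "third-quarter", 1250, 1000, 0.42))
```

  Because the template only restates the numbers it is given, the same function can be run over thousands of companies, which is where the gain in coverage comes from. It also shows the limit discussed below: the template cannot judge whether any of those stories is worth a reader’s attention.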

  Has it shaken the essence and value judgment of news?

  Through the machine’s capacity for non-stop data processing, algorithmic news generates reports on a massive scale, reaching an unprecedented breadth of coverage and producing more news information in objective terms. At the same time, audiences have limited time to receive and digest news, and can see and absorb only a small part of the ever-growing mass of information. We therefore have to return to the original questions: what is news, what is news value, and what kind of information deserves attention and should be reported and disseminated? Can information generated under algorithmic logic still be regarded as news, and does it still have news value?

  When an algorithm generates information automatically, it typically retrieves news according to data instructions. In large-scale production driven by algorithmic logic, the adaptability and professional intuition that human journalists show in practice are hard to quantify as concrete indicators. As a result, although the algorithm can generate a large number of stories, the news value of many of them is debatable, which makes screening harder for the audience. Moreover, as an auxiliary tool the algorithm can indeed help human journalists locate leads in massive information. But if the whole workflow is dominated by the algorithm and reporters’ attention is steered by its logic, the model will channel that attention in a particular direction. Will reporters then stop looking for more meaningful leads elsewhere? When news organizations use algorithms for data mining to guide news discovery, they are in essence letting algorithms make the first judgment of news value. News discovery is thus shaped by algorithms, which in turn affects what news the audience consumes. When the algorithm influences the judgments and choices of news production, the nature and value of news are bound to be strongly challenged.

  Will algorithm-led news production be more objective?

  In news dissemination and distribution, the "echo chamber effect" and "filter bubbles" have long been the main evidence that algorithms produce bias. But in news production, can a seemingly objective algorithm avoid bias altogether? In 2018, Knowhere, an AI startup in the United States, declared that artificial intelligence could be used to write fair and unbiased news. The site gathers news reports through large-scale data mining and deep learning, rewrites them with automated algorithms, and offers three versions of each story: a left-leaning version, a right-leaning version and a neutral version. This innovation won the favor of investors, and the company received an investment of 1.8 million US dollars in 2018. Yet in this process the algorithm still draws its data from human judgment. Every article advertised as "neutral" comes with a reader survey, in which readers rate the article’s neutrality or bias according to their own subjective impressions and submit the rating to the system. In effect, the algorithm keeps collecting data in order to learn people’s subjective judgments. Relying on an algorithm alone to make neutral value judgments is a long-term undertaking that still needs to be refined in practice. When we return to how the algorithm actually works, we find that it is difficult for it to be more neutral and unbiased than traditional news, whether or not human journalists’ labor is removed from the workflow.

  How to solve the responsibility problem in algorithmic news?

  As algorithms are applied ever more widely in news production, the problem of algorithmic responsibility becomes harder to ignore by the day. Especially as algorithms play a growing role in many decisions in news production, how to evaluate, supervise and adjust their power has become an urgent question. The algorithm is a new intermediary of power: the greater its influence on news production, the greater its corresponding responsibility. The algorithm itself is not perfect; it is unreliable to some extent and needs constant correction. As a human-designed product, it must be continually debugged and modified, and no algorithm works once and for all. Google modifies its search algorithm 500 to 600 times a year on average. Algorithms in news production often carry risks relating to accuracy, decision-making, bias and privacy, which raises the question: who should bear responsibility for the errors and deviations caused by algorithms, or for the adverse consequences of decisions based on them? The algorithm developer, the product designer or the news decision-maker? And when an adverse consequence occurs, should it be attributed entirely to the algorithm? All of this poses a new problem of allocating responsibility in algorithmic news production. How much responsibility should the algorithm bear, and how? How can news organizations continuously examine and correct their own algorithms? How can governments and regulators intervene in supervising and sanctioning the algorithms used by news organizations and companies? These questions pose new challenges for news production and for government oversight of news organizations.

  (Published in the January 2019 issue of Media Watch. The original text is about 10,000 words, under the title "Frontier practice and problems of algorithmic news production and its enlightenment to news education". This article was recommended by Xinhua Digest in its 11th issue of 2019. Figures, notes, etc. are omitted; please refer to the original text for academic citations.)

  [Author] Ao Peng, Ph.D. candidate, School of Journalism and Communication, Peking University.