
What Ever Happened to IBM’s Watson?

IBM’s artificial intelligence was supposed to transform industries and generate riches for the company. Neither has panned out. Now, IBM has settled on a humbler vision for Watson.



By Steve Lohr

Published July 16, 2021; updated July 17, 2021

A decade ago, IBM’s public confidence was unmistakable. Its Watson supercomputer had just trounced Ken Jennings, the best human “Jeopardy!” player ever, showcasing the power of artificial intelligence. This was only the beginning of a technological revolution about to sweep through society, the company pledged.

“Already,” IBM declared in an advertisement the day after the Watson victory, “we are exploring ways to apply Watson skills to the rich, varied language of health care, finance, law and academia.”

But inside the company, the star scientist behind Watson had a warning: Beware what you promise.

David Ferrucci, the scientist, explained that Watson was engineered to identify word patterns and predict correct answers for the trivia game. It was not an all-purpose answer box ready to take on the commercial world, he said. It might well fail a second-grade reading comprehension test.

His explanation got a polite hearing from business colleagues, but little more.

“It wasn’t the marketing message,” recalled Mr. Ferrucci, who left IBM the following year.

It was, however, a prescient message.

IBM poured many millions of dollars in the next few years into promoting Watson as a benevolent digital assistant that would help hospitals and farms as well as offices and factories. The potential uses, IBM suggested, were boundless, from spotting new market opportunities to tackling cancer and climate change. An IBM report called it “the future of knowing.”

IBM’s television ads included playful chats Watson had with Serena Williams and Bob Dylan. Watson was featured on “60 Minutes.” For many people, Watson became synonymous with A.I.

And Watson wasn’t just going to change industries. It was going to breathe new life into IBM — a giant company, but one dependent on its legacy products. Inside IBM, Watson was thought of as a technology that could do for the company what the mainframe computer once did — provide an engine of growth and profits for years, even decades.

Watson has not remade any industries. And it hasn’t lifted IBM’s fortunes. The company trails rivals that emerged as the leaders in cloud computing and A.I. — Amazon, Microsoft and Google. While the shares of those three have multiplied in value many times, IBM’s stock price is down more than 10 percent since Watson’s “Jeopardy!” triumph in 2011.

The company’s missteps with Watson began with its early emphasis on big and difficult initiatives intended to generate both acclaim and sizable revenue for the company, according to many of the more than a dozen current and former IBM managers and scientists interviewed for this article. Several of those people asked not to be named because they had not been authorized to speak or still had business ties to IBM.

Manoj Saxena, a former general manager of the Watson business, said that the original objective — to do pioneering work that was good for society — was laudable. It just wasn’t realistic.

“The challenges turned out to be far more difficult and time-consuming than anticipated,” said Mr. Saxena, who is now executive chairman of Cognitive Scale, an A.I. start-up whose investors include IBM.

Martin Kohn, a former chief medical scientist at IBM Research, recalled recommending using Watson for narrow “credibility demonstrations,” like more accurately predicting whether an individual will have an adverse reaction to a specific drug, rather than to recommend cancer treatments.

“I was told I didn’t understand,” Dr. Kohn said.

The company’s top management, current and former IBM insiders noted, was dominated until recently by executives with backgrounds in services and sales rather than technology product experts. Product people, they say, might have better understood that Watson had been custom-built for a quiz show, a powerful but limited technology.


IBM describes Watson as a learning journey for the company. There have been wrong turns and setbacks, IBM says, but that comes with trying to commercialize pioneering technology.

“Innovation is always a process,” said Rob Thomas, the executive in charge of the Watson business in the past few years. Mr. Thomas, who earlier this month was named senior vice president for global sales, sees the A.I. development at IBM in three stages: the technical achievement with “Jeopardy!,” the years of “experimentation” with big services contracts and, now, a shift to a product business.

IBM insists that its revised A.I. strategy — a pared-down, less world-changing ambition — is working. The job of reviving growth was handed to Arvind Krishna, a computer scientist who became chief executive last year, after leading the recent overhaul of IBM’s cloud and A.I. businesses.

But the grand visions of the past are gone. Today, instead of being a shorthand for technological prowess, Watson stands out as a sobering example of the pitfalls of technological hype and hubris around A.I.

The march of artificial intelligence through the mainstream economy, it turns out, will be more step-by-step evolution than cataclysmic revolution.

A New Wave to Ride

Time and again during its 110-year history, IBM has ushered in new technology and sold it to corporations. The company so dominated the market for mainframe computers that it was the target of a federal antitrust case. PC sales really took off after IBM entered the market in 1981, endorsing the small machines as essential tools in corporate offices. In the 1990s, IBM helped its traditional corporate customers adapt to the internet.

IBM executives came to see A.I. as the next wave to ride.

Mr. Ferrucci first pitched the idea of Watson to his bosses at IBM’s research labs in 2006. He thought building a computer to tackle a question-answer game could push science ahead in the A.I. field known as natural language processing, in which scientists program computers to recognize and analyze words. Another research goal was to advance techniques for automated question answering.

After overcoming initial skepticism, Mr. Ferrucci assembled a team of scientists — eventually more than two dozen — who worked out of the company’s lab in Yorktown Heights, N.Y., about 20 miles north of IBM’s headquarters in Armonk.

The Watson they built was a room-size supercomputer with thousands of processors running millions of lines of code. Its storage disks were filled with digitized reference works, Wikipedia entries and electronic books. Computing intelligence is a brute force affair, and the hulking machine required 85,000 watts of power. The human brain, by contrast, runs on the equivalent of 20 watts.

All along, the company’s goal was to push the frontiers of science and burnish IBM’s reputation. IBM made a similar — and successful — bet with its chess-playing Deep Blue computer, which beat the world chess champion Garry Kasparov in 1997. In a nod to the earlier project, the scientists originally called their A.I. computer DeepQA. But the marketers stepped in and decided to name the machine for IBM’s founder, Thomas Watson Sr.

When Watson triumphed at “Jeopardy!,” the response was overwhelming. IBM’s customers clamored for one of their own. Executives saw a big business opportunity.

Clearly, there was a market for Watson. But there was a problem.

IBM had little to sell.

A Health Care ‘Moon Shot’

Executives got to work figuring out how to make a business out of its new star. One possibility kept coming up: health care.

Health care is the nation’s largest industry and spending is rising worldwide. It is a field rich in data, the essential fuel for modern A.I. programs. And the social benefit is undeniable — the promise of longer, healthier lives.

Ginni Rometty, IBM’s chief executive at the time, described the big bet on health care as the next chapter in the company’s heritage of tackling grand challenges, from counting the census to helping guide the Apollo 11 mission to the moon.

“Our moon shot will be the impact we have on health care,” Ms. Rometty said. “I’m absolutely positive about it.”

IBM started with cancer. It sought out medical centers where researchers worked with huge troves of data. The idea was that Watson would mine and make sense of all that medical information to improve treatment.

At the University of North Carolina School of Medicine, one of IBM’s partners, the difficulties soon became apparent. The oncologists, having seen Watson’s “Jeopardy!” performance, assumed it was an answer machine. The IBM technologists were frustrated by the complexity, messiness and gaps in the genetic data at the cancer center.

“We thought it would be easy, but it turned out to be really, really hard,” said Dr. Norman Sharpless, former head of the school’s cancer center, who is now the director of the National Cancer Institute. “We talked past each other for about a year.”

Eventually, the oncologists and technologists found an approach that suited Watson’s strength — quickly ingesting and reading many thousands of medical research papers. By linking mentions of gene mutations in the papers with a patient’s genetic profile, Watson could sometimes point to other treatments the physicians might have missed. It was a potentially useful new diagnostic tool.

But it turned out to be not useful or flexible enough to be a winning product. At the end of last year, IBM discontinued Watson for Genomics, which grew out of the joint research with the University of North Carolina. It also shelved another cancer offering, Watson for Oncology, developed with another early collaborator, the Memorial Sloan Kettering Cancer Center.

Another cancer project, called Oncology Expert Advisor, was abandoned in 2016 as a costly failure. It was a collaboration with the MD Anderson Cancer Center in Houston. The aim was to create a bedside diagnostic tool that would read patients’ electronic health records and volumes of cancer-related scientific literature, and then make treatment recommendations.

The problems were numerous. During the collaboration, MD Anderson switched to a new electronic health record system and Watson could not tap patient data. Watson struggled to decipher doctors’ notes and patient histories, too.

Physicians grew frustrated, wrestling with the technology rather than caring for patients. After four years and spending $62 million, according to a public audit, MD Anderson shut down the project.

“They chose the highest bar possible, real-time cancer diagnosis, with an immature technology,” said Shane Greenstein, a professor and co-author of a recent Harvard Business School case study on the Watson project at MD Anderson. “It was such a high-risk path.”

IBM continued to invest in the health industry, including billions on Watson Health, which was created as a separate business in 2015. That includes more than $4 billion to acquire companies with medical data, billing records and diagnostic images on hundreds of millions of patients. Much of that money, it seems clear, IBM will never get back.

Now IBM is paring back Watson Health and reviewing the future of the business. One option being explored, according to a report in The Wall Street Journal, is to sell off Watson Health.

Back to Reality

Many outside researchers long dismissed Watson as mainly a branding campaign. But recently, some of them say, the technology has made major strides.

In an analysis done for The New York Times, the Allen Institute for Artificial Intelligence compared Watson’s performance on standard natural language tasks like identifying persons, places and the sentiment of a sentence with the A.I. services offered by the big tech cloud providers — Amazon, Microsoft and Google.

Watson did as well as, and sometimes better than, the big three. “I was quite surprised,” said Oren Etzioni, chief executive of the Allen Institute. “IBM has gotten its act together, certainly in these capabilities.”

The business side of Watson also shows signs of life. Now, Watson is a collection of software tools that companies use to build A.I.-based applications — ones that mainly streamline and automate basic tasks in areas like accounting, payments, technology operations, marketing and customer service. It is workhorse artificial intelligence, and that is true of most A.I. in business today.

A core Watson capability is natural language processing — the same ability that helped power the “Jeopardy!” win. That technology powers IBM’s popular Watson Assistant, used by businesses to automate customer service inquiries.

The company does not report financial results for Watson. But Mr. Thomas, who now leads worldwide sales for IBM, points to signs of success.

It is early for A.I. in the corporate market, he said; the market opportunity will be huge, and the key at this stage is to hasten adoption of the Watson software offerings.

IBM says it has 40,000 Watson customers across 20 industries worldwide, more than double the number four years ago. Watson products and services are being used 140 million times a month, compared with a monthly rate of about 10 million two years ago, IBM says. Some of the big customers are in health, like Anthem, a large insurer, which uses Watson Assistant to automate customer inquiries.

“Adoption is accelerating,” Mr. Thomas said.

Five years ago, Watson, a nerdy, disembodied voice from the A.I. future, chatted and joked in advertisements with the tennis superstar Serena Williams. Today, the TV ads proclaim the technology’s potential to save time and work in offices and on factory floors.

Watson, one TV ad says, helps companies “automate the little things so they can focus on the next big thing.”

The contrast in ambition seems striking. That’s fine with IBM. Watson is no longer the next big thing, but it may finally become a solid business for IBM.

Steve Lohr has covered technology, business and economics for The Times for more than 20 years. In 2013, he was part of the team awarded the Pulitzer Prize for Explanatory Reporting. He is the author of “Data-ism” and “Go To.”




  • Open access
  • Published: 11 March 2021

A meta-analysis of Watson for Oncology in clinical application

  • Zhou Jie,
  • Zeng Zhiying &
Scientific Reports, volume 11, Article number: 5792 (2021)

Subject: Medical research

We used meta-analysis to systematically evaluate the consistency of treatment schemes between Watson for Oncology (WFO) and the Multidisciplinary Team (MDT), and to provide references for the practical application of artificial-intelligence clinical decision-support systems in cancer treatment. We systematically searched databases for articles on the clinical application of Watson for Oncology and conducted meta-analysis using RevMan 5.3 software. A total of 9 studies were identified, including 2463 patients. When the MDT recommendation was consistent with WFO at the ‘Recommended’ or ‘For consideration’ level, the overall concordance rate was 81.52%; breast cancer had the highest rate and gastric cancer the lowest. The concordance rate in stage I–III cancer was higher than in stage IV, except in lung cancer, where the result was the opposite (P < 0.05). Similar results were obtained when concordance required a match at the ‘Recommended’ level only. Moreover, concordance was higher for estrogen- and progesterone-receptor-negative breast cancer patients, colorectal cancer patients under 70 years old or with ECOG 0, and small-cell lung cancer patients than for estrogen- and progesterone-receptor-positive breast cancer patients, colorectal cancer patients over 70 years old or with ECOG 1–2, and non-small-cell lung cancer patients, with statistical significance (P < 0.05). Treatment recommendations made by WFO and MDT were highly concordant for the cancer cases examined, but the system still needs further improvement. Owing to the relatively small sample sizes of the included studies, more well-designed, large-sample studies are still needed.



With the rapid development of human society, cancer-related knowledge is growing exponentially, creating a knowledge gap for clinical physicians 1 . As understanding of each patient deepens, more and more information must be absorbed from the literature to provide evidence-based cancer treatment. Research shows that clinical physicians can spend only 4.6 h a week acquiring the latest professional knowledge 2 , resulting in a relative delay in information absorption and a widening gap between the results achieved by academic research centers and actual practice 3 . Compared with physicians in other clinical disciplines, clinical oncologists urgently need timely access to evidence-based medical knowledge to support patients' personalized treatment plans. Consequently, clinicians need new kinds of tools to bridge this knowledge gap and to support and adopt new treatment methods in an evidence-based manner, so that more patients can benefit from social investment in research and development 4 , 5 . Artificial intelligence (AI), which first appeared in the early 1950s, refers to the creation of intelligent machines with functions and reactions like those of human beings 6 . The goal of AI is to replicate the human mind: it can perform tasks such as identification, interpretation, reasoning and transformation, and it excels in areas where human beings do not, such as absorbing large amounts of qualitative information and recognizing patterns within it 7 , 8 . AI has now gradually entered medicine. Image recognition using AI has been successfully applied to image-based clinical diagnosis, such as melanoma recognition in dermoscopy images 9 and detection of diabetic retinopathy in retinal fundus photographs 10 , and more and more AI research is being carried out in oncology 11 , 12 , 13 , 14 .
AI aims to enhance human capabilities, enabling people to apply increasingly complex knowledge to clinical decision-making and to bring increasingly diverse and complex patient data into personalized management. Because cognitive computing technology has developed only recently, its application in clinical oncology still lacks large-scale data, and there are clinical differences across regions and ethnic groups. Watson for Oncology (WFO), an artificial-intelligence decision-support system, was developed by IBM Corporation (USA) with the help of top oncologists from Memorial Sloan Kettering Cancer Center (MSK). It took more than 4 years of training, based on National Comprehensive Cancer Network (NCCN) cancer treatment guidelines and more than 100 years of clinical cancer treatment experience in the United States, and can recommend appropriate chemotherapy regimens for specific cancer patients. For supported cases, the treatment recommendations provided by WFO fall into 3 groups: Recommended, i.e. green "buckets", representing a treatment supported by clear evidence; For consideration, i.e. yellow "buckets", representing a potentially suitable alternative; and Not recommended, i.e. red "buckets", standing for a treatment with contraindications or clear evidence against its use. To compare the consistency between WFO and clinicians in different countries and regions in various aspects and on a large scale, many hospitals have formed Multidisciplinary Teams (MDTs), composed of oncologists, surgeons, pathologists, radiologists and others, who discuss the advantages and disadvantages of each candidate treatment scheme and finally determine the treatment. A case is defined as concordant when the MDT recommendation falls in the ‘Recommended’ or ‘For consideration’ categories of WFO; otherwise, it is discordant.
Published results show obvious differences in concordance rates across regions and cancer types, and so far no meta-analysis comparing the consistency of WFO and MDT has been published. This study therefore aims to systematically review the literature and provide the latest evidence on WFO's clinical use; to analyze the consistency between WFO's treatment schemes for cancer patients and those of clinicians, along with their respective advantages and disadvantages; and to summarize WFO's clinical practice, so as to provide references for its further clinical application.
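The concordance rule described above can be sketched as a small classification helper (a hypothetical illustration; the function names are ours and not part of WFO itself):

```python
# Sketch of the concordance rule used throughout the included studies:
# an MDT regimen counts as concordant when WFO rates it 'Recommended'
# (green) or 'For consideration' (yellow); 'Not recommended' (red) or
# a regimen WFO does not cover counts as discordant.

CONCORDANT_BUCKETS = {"Recommended", "For consideration"}

def is_concordant(wfo_bucket: str) -> bool:
    """True if the MDT's chosen regimen falls in a WFO bucket
    the studies treat as agreement."""
    return wfo_bucket in CONCORDANT_BUCKETS

def concordance_rate(wfo_buckets: list[str]) -> float:
    """Fraction of cases whose MDT regimen WFO rated green or yellow."""
    hits = sum(is_concordant(b) for b in wfo_buckets)
    return hits / len(wfo_buckets)

# Example: 3 of 4 cases concordant -> 0.75
cases = ["Recommended", "For consideration", "Not recommended", "Recommended"]
print(concordance_rate(cases))  # 0.75
```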

Materials and methods

This meta-analysis is registered in the International Prospective Register of Systematic Reviews (PROSPERO) trial registry (CRD42020199418). Where applicable, the general guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement were followed, and the study was performed and prepared according to the guidelines proposed by the Cochrane Collaboration.

Literature search

Since WFO entered commercial use in 2015, literature from 2015 onward was searched. The Cochrane Library, PubMed, Excerpta Medica Database (EMbase), China National Knowledge Infrastructure (CNKI), CQVIP and Chinese Biomedicine (CBM) databases (updated through December 31, 2019) were searched using the following terms: artificial intelligence, clinical decision-support system, Watson for Oncology, neoplasm, treatment, Multidisciplinary Team, concordance and comparative study. Other potentially qualified articles were also screened manually.

Inclusion and exclusion criteria

The studies meeting the following criteria would be included:

(a) The study focuses on the clinical use of WFO, regardless of cancer type; (b) the study contains at least one subgroup of analysis data; (c) the study is an original research article published in Chinese or English, regardless of nationality; (d) the study compares the consistency of treatment schemes determined by WFO and MDT; and (e) there is no restriction on whether the study is prospective or retrospective, or whether blinding was used.

The following are the major exclusion criteria:

(a) The study only describes the simple use of WFO without any data, or contains only WFO research-and-development process data; (b) the article does not compare treatment schemes between WFO and MDT; and (c) book chapters, comments, case reports, and other formats without detailed data.

Data extraction and quality assessment

Two investigators independently evaluated the quality of the literature and extracted the data. Any disagreements were discussed and referred to an additional independent arbitrator for resolution. Missing original data were supplemented by contacting the original authors via e-mail. Data were extracted with a standardized table, including (a) general information, such as the title of the publication, the first author’s surname, the original document number and source, and the year of publication and country; (b) research characteristics, such as the eligibility of the research, the characteristics of the research object, the design scheme and quality of the literature, the specific contents and implementation methods of the research measures, relevant bias-prevention measures, and the main test results; and (c) data needed for this meta-analysis, such as the total number of cases in each group and the number of event cases, collected by binary classification.

According to the Cochrane Reviewers’ Handbook 6.1, the quality of the literature was evaluated on 7 aspects: random sequence generation (selection bias), allocation concealment (selection bias), blinding of participants and personnel (performance bias), blinding of outcome assessment (detection bias), incomplete outcome data (attrition bias), selective reporting (reporting bias) and other bias, with each aspect judged "yes" (low bias), "no" (high bias) or "unclear" (insufficient information or uncertain bias). Review Manager statistical software (RevMan, version 5.3.5, Cochrane Collaboration Network) was applied to assess the risk of bias and provide visual results.

Statistical analysis

RevMan 5.3.5 was also used to analyze the extracted data. The main purpose of this study was to compare the consistency of treatment schemes determined by WFO and MDT across cancer types, so the statistical data were dichotomous (concordant or discordant). Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated for clinicopathological features (TNM stage, histopathological category, etc.). The Q test and the I² statistic were used to judge heterogeneity among the studies: when P < 0.05 or I² > 50%, heterogeneity was considered significant; otherwise, it was not. When there was no statistical heterogeneity between studies, a fixed-effects model was used to pool the results. If there was statistical heterogeneity, we analyzed its causes and adopted subgroup or sensitivity analysis. For studies whose heterogeneity still could not be eliminated, the data could be combined from the perspective of clinical significance; a random-effects model was adopted for the pooled analysis, and the results were interpreted cautiously. If the data provided could not be meta-analyzed, only descriptive analysis was performed.
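The dichotomous-outcome statistics described above can be sketched in a few lines. This is a minimal illustration of the standard Wald confidence interval for an odds ratio and of Higgins' I² derived from Cochran's Q, not the RevMan implementation:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table:
    a = events in group 1, b = non-events in group 1,
    c = events in group 2, d = non-events in group 2."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

def i_squared(log_ors, ses):
    """Higgins' I^2 (in %) from per-study log odds ratios and their
    standard errors, via Cochran's Q with inverse-variance weights."""
    w = [1 / s ** 2 for s in ses]
    pooled = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - pooled) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
```

For example, `odds_ratio_ci(10, 5, 8, 12)` returns an OR of 3.0 with a CI spanning 1.0, i.e. not statistically significant, matching the convention used in the results below.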

Characteristics and quality evaluation of eligible studies

A total of 367 relevant publications from January 2015 to December 2019 were obtained from the preliminary search: 237 in English (PubMed: 102, Embase: 106, Cochrane Library: 29) and 130 in Chinese (CNKI: 43, CQVIP: 47, CBM: 40). After reading titles, abstracts and full texts in turn, 8 articles 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 and 1 conference abstract 23 were finally included, all non-RCTs published between 2017 and 2019; 7 studies 15 , 16 , 17 , 19 , 20 , 22 , 23 were published in English and 2 studies 18 , 21 in Chinese. The publication-selection process, the main characteristics of the included publications and their quality evaluation are shown in Fig. 1, Table 1, and Supplementary Figs. 1 and 2, respectively. Of the 9 studies, 7 15 , 16 , 17 , 19 , 20 , 21 , 22 clearly defined the method of selecting cases; the others did not indicate "randomization" of the included samples. In all studies, WFO and MDT treatment schemes were formulated successively for the same patients, so there was no allocation bias. 7 studies 15 , 16 , 18 , 19 , 20 , 21 , 22 did not specify a blinding plan or did not use blinding, but this does not affect the judgment or measurement of outcomes. Although two studies 16 , 22 did not provide detailed four-category data, this did not substantially affect our meta-analysis, so we considered that none of the studies showed obvious selective-reporting bias and that the data were essentially complete, though other biases remained unclear. Because Begg's funnel plot and Egger's test are of little value when the number of included documents is small (< 10), no publication-bias analysis was performed. Because the included studies differed little in quality, no further sensitivity analysis was performed.
After subgroup analysis, most I² values were below 50%, indicating low heterogeneity among the studies included in this systematic evaluation.

Figure 1. Flow diagram of the study selection process.

Results of meta-analysis

Overall analysis of consistency between WFO and MDT

Of the 9 included studies, 7 15 , 17 , 18 , 19 , 20 , 21 , 23 provided complete four-category data (the three WFO treatment categories plus unavailable cases) on the consistency of treatment schemes determined by WFO and MDT, involving seven cancer types: breast, rectal, colon, gastric, lung, ovarian and cervical cancer. Of the 1738 cases included (shown in Supplementary Fig. 3), 959 (55.18%) were WFO ‘Recommended’ schemes (green) consistent with the MDT treatment scheme and 503 (28.94%) were ‘For consideration’ (orange), for a combined total of 1462 cases (84.12%). However, 166 cases (9.55%) fell under the ‘Not recommended’ scheme (pink) and 110 cases (6.33%) were not supported by WFO (‘Not available’).
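As a quick check, the reported percentages follow directly from the bucket counts above:

```python
# Reproducing the bucket arithmetic for the 1738 cases with complete
# four-category data (counts taken from the text).
total = 1738
buckets = {
    "Recommended": 959,        # green: consistent with MDT, strong evidence
    "For consideration": 503,  # orange: potentially suitable alternative
    "Not recommended": 166,    # pink: contraindicated or evidence against
    "Not available": 110,      # case not supported by WFO
}

assert sum(buckets.values()) == total  # the four categories partition all cases

concordant = buckets["Recommended"] + buckets["For consideration"]
print(f"{concordant} concordant cases = {100 * concordant / total:.2f}%")
# 1462 concordant cases = 84.12%
```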

Under the condition that the MDT recommendations were consistent with the ‘Recommended’ or ‘For consideration’ categories of WFO, we conducted meta-analysis according to clinical stage (stage I–III vs. stage IV). A total of 8 studies 15 , 16 , 17 , 18 , 19 , 20 , 21 , 23 were included. Of the 1807 cases, 1473 (81.52%) WFO treatment schemes were consistent with the MDT. The concordance rate in stage I–III was 86.00% (1026/1193), higher than the 80.78% (496/614) in stage IV. Because the meta-analysis showed significant statistical heterogeneity between stages (I² = 83%), a random-effects model was used (shown in Fig. 2A); the difference was not statistically significant, P = 0.20 [OR 1.68, 95% CI (0.76, 3.74)]. To analyze the consistency between MDT and WFO further, we examined the case in which only WFO ‘Recommended’ counted as concordant, excluding ‘For consideration’. A total of 9 studies 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 were included. Of the 2463 cases, 1299 (52.74%) WFO treatment schemes were consistent with the MDT. The concordance in stage I–III was 56.46% (962/1704), greater than the 44.40% (337/759) in stage IV. The meta-analysis again showed significant statistical heterogeneity between stages (I² = 90%) (shown in Fig. 3A), so a random-effects model was used; the difference was not statistically significant, P = 0.08 [OR 1.77, 95% CI (0.93, 3.40)]. Given the significant heterogeneity (I² > 50%), subgroup analysis by tumor type was further adopted.

Figure 2

Forest plot of concordance between WFO (‘Recommended’ or ‘For consideration’) and MDT for patients with various cancers. Treatment was considered concordant if the delivered treatment was rated as either ‘Recommended’ or ‘For consideration’ by WFO, and discordant if it was ‘Not recommended’ by WFO or was the physician’s choice (not included in WFO). Overall concordance of various cancers in stages I–III and IV (A). Concordance by estrogen/progesterone receptor status (ER+/PR+ vs. ER−/PR−) in breast cancer (B). Concordance by pathological type (small cell vs. non-small cell) in lung cancer (C).

Figure 3

Forest plot of concordance between WFO (only ‘Recommended’) and MDT for patients with various cancers. Treatment was considered concordant if the delivered treatment was rated as ‘Recommended’ by WFO, and discordant if it received any other WFO rating or was the physician’s choice (not included in WFO). Overall concordance of various cancers in stages I–III and IV (A). Concordance by estrogen/progesterone receptor status (ER+/PR+ vs. ER−/PR−) in breast cancer (B). Concordance by performance status (ECOG 0 vs. ECOG 1–2) in colorectal cancer (C). Concordance by age (< 70 years vs. older) in colorectal cancer (D). Concordance by pathological type (small cell vs. non-small cell) in lung cancer (E).

Subgroup analysis of consistency between WFO and MDT

Consistency between WFO (‘Recommended’ or ‘For consideration’) and MDT

Under the condition that concordance was defined at the ‘Recommended’ or ‘For consideration’ level, we conducted meta-analyses by clinical stage (stage I–III vs. stage IV) within each cancer type. The concordance for stage I–III was greater than that for stage IV in every cancer except lung cancer (shown in Table 2 and Fig. 4). For breast cancer, 3 studies 17,20,21 (n = 890) were included; the difference was statistically significant, P = 0.001 [OR 2.29, 95% CI (1.37, 3.82)]. For colorectal cancer, 4 studies 16,17,18,23 (n = 398) were included; the difference was statistically significant, P < 0.0001 [OR 3.44, 95% CI (1.91, 6.17)]. For colon cancer, 3 studies 17,18,23 (n = 181) were included; the difference was statistically significant, P = 0.04 [OR 2.31, 95% CI (1.06, 5.05)]. For rectal cancer, 2 studies 17,23 (n = 148) were included; the difference was not statistically significant, P = 0.17 [OR 3.31, 95% CI (0.60, 18.25)]. For gastric cancer, 2 studies 15,17 (n = 107) were included; the difference was not statistically significant, P = 0.07 [OR 9.81, 95% CI (0.86, 111.5)]. For lung cancer, 3 studies 17,19,23 (n = 374) were included; the difference was not statistically significant, P = 0.08 [OR 0.32, 95% CI (0.09, 1.13)].

Figure 4

Forest plot of consistency between WFO (‘Recommended’ or ‘For consideration’) and MDT for patients (subgroup).

In addition, 3 studies 17,20,21 (n = 890) provided data on estrogen and progesterone receptor status (ER+/PR+ vs. ER−/PR−) in breast cancer patients, so a further meta-analysis was carried out. The difference was not statistically significant (shown in Fig. 2B), P = 0.47 [OR 0.85, 95% CI (0.54, 1.34)]. A total of 2 studies 17,19 (n = 262) provided data on pathological type (small cell vs. non-small cell) in lung cancer patients. The concordance for small cell lung cancer was higher than that for non-small cell lung cancer (shown in Fig. 2C), and the difference was statistically significant, P = 0.02 [OR 3.00, 95% CI (1.20, 7.48)].

Consistency between WFO (only ‘Recommended’) and MDT

Under the condition that concordance was defined only at the ‘Recommended’ level, we again conducted meta-analyses by clinical stage (stage I–III vs. stage IV). Similarly, the concordance for stage I–III was greater than that for stage IV in every cancer except lung cancer (shown in Table 3 and Fig. 5). For breast cancer, 3 studies 17,20,21 (n = 890) were included; the difference was not statistically significant, P = 0.37 [OR 1.33, 95% CI (0.72, 2.47)]. For colorectal cancer, 5 studies 16,17,18,22,23 (n = 1054) were included; the difference was statistically significant, P < 0.0001 [OR 3.70, 95% CI (1.93, 7.11)]. For colon cancer, 4 studies 17,18,22,23 (n = 837) were included; the difference was statistically significant, P = 0.0004 [OR 2.49, 95% CI (1.50, 4.14)]. For rectal cancer, 2 studies 17,23 (n = 148) were included; the difference was statistically significant, P = 0.0001 [OR 5.87, 95% CI (2.36, 14.58)]. For gastric cancer, 2 studies 15,17 (n = 107) were included; the difference was statistically significant, P = 0.01 [OR 3.48, 95% CI (1.28, 9.43)]. For lung cancer, 3 studies 17,19,23 (n = 374) were included; the difference was not statistically significant, P = 0.18 [OR 0.36, 95% CI (0.08, 1.57)].

Figure 5

Forest plot of consistency between WFO (only ‘Recommended’) and MDT for patients with various cancers in stages I–III and IV (subgroup).

In addition, 3 studies 17,20,21 (n = 890) provided data on estrogen and progesterone receptor status (ER+/PR+ vs. ER−/PR−) in breast cancer patients. The meta-analysis showed that the concordance for hormone receptor-positive patients (Luminal A and Luminal B) was lower than that for receptor-negative patients (HER2-positive and triple-negative), and the difference was statistically significant, P = 0.02 [OR 0.72, 95% CI (0.54, 0.95)] (shown in Fig. 3B). Two studies 16,22 provided data on performance status (ECOG 0 vs. ECOG 1–2) and age (< 70 years vs. older) in colorectal cancer patients. The concordance for ECOG 0 patients was higher than that for ECOG 1–2 patients, and the difference was statistically significant, P = 0.003 [OR 1.59, 95% CI (1.17, 2.17)] (shown in Fig. 3C); the concordance for patients under 70 was higher than that for older patients, and the difference was statistically significant, P = 0.03 [OR 4.06, 95% CI (1.18, 13.97)] (shown in Fig. 3D). A total of 2 studies 17,19 (n = 262) provided data on pathological type (small cell vs. non-small cell) in lung cancer patients. Again, the concordance for small cell lung cancer was higher than that for non-small cell lung cancer, and the difference was statistically significant, P < 0.00001 [OR 11.05, 95% CI (4.93, 24.77)] (shown in Fig. 3E).

Consistency analysis between WFO and MDT

On the whole, the concordance for stages I–III was better than that for stage IV in all cancers except lung cancer, and most of the results were statistically significant (P < 0.05), regardless of whether concordance was defined at the ‘For consideration’ level (‘Recommended’ or ‘For consideration’) or at the ‘Recommended’ level (only ‘Recommended’). At the ‘For consideration’ level, the overall concordance rate was highest for breast cancer (88.99%) and lowest for gastric cancer (57.94%); among lung cancer patients, the concordance for small cell lung cancer was higher than that for non-small cell lung cancer, and the difference was statistically significant. At the ‘Recommended’ level, the overall concordance rate was highest for rectal cancer (81.76%), while gastric cancer was still the lowest (29.90%). The concordance for hormone receptor-positive breast cancer patients (Luminal A and B) was lower than that for hormone receptor-negative patients (HER2-positive and triple-negative). Among colorectal cancer patients, the concordance for ECOG 0 was higher than for ECOG 1–2, and for patients under 70 it was higher than for older patients. Among lung cancer patients, the concordance for small cell lung cancer was again higher than that for non-small cell lung cancer, and the difference was statistically significant.

Advantages of WFO

Besides showing high concordance with MDT in most cancers, WFO, as an artificial intelligence clinical decision support system, also has the following advantages: (a) WFO improves doctors' work efficiency and reduces workload. Hu’s study 18 showed that using WFO saved an average of 8.2 min per case (the average time for obtaining a report was 7.3 ± 2.2 min, versus 15.5 ± 6.1 min for an MDT consultation). Eliminating the need to convene an MDT discussion helps reduce the time required to formulate a chemotherapy scheme 24, thus shortening patients' hospitalization. (b) WFO can prevent human calculation errors. Formulating chemotherapy schemes and selecting drugs are complicated, time-consuming processes in which selection errors can occur 25,26; accurate dosing through computer programs can prevent such errors 20,27. (c) WFO can improve the quality of doctor-patient communication and prevent doctor-patient disputes. Nowadays, for a variety of reasons, patients' distrust of doctors is increasing in China 28,29. The more patients participate in decisions about their own therapeutic regimen and understand information such as the incidence of adverse events, the more confidence they have in the regimen and the more actively they cooperate with doctors 30. (d) WFO can reduce the burden on patients. It can eliminate the time patients waste consulting at multiple large hospitals, help them obtain more accurate treatment as soon as possible, and reduce travel and accommodation costs while avoiding the fatigue caused by travel. (e) WFO can improve the professional level of young doctors. It significantly shortens the time junior doctors must spend consulting the relevant literature. At the same time, WFO gives the rationale, evidence documents and drug use instructions for each scheme, and the system is updated every 1–2 months, improving junior doctors' ability to make accurate diagnosis and treatment recommendations in a short time and boosting their confidence.

Disadvantages of WFO

Recent studies showed that WFO and MDT recommendations for cancer patients are not completely consistent; in patients with advanced cancer in particular, there is a significant decrease in concordance. This confirms that WFO still has certain limitations, which lead to differences in the concordance rate when the system is applied in other countries. The limitations are as follows: (a) Different treatment schemes: Asian and Caucasian populations differ significantly in sensitivity and tolerance to certain chemotherapeutic drugs because of differences in constitution and in key drug-metabolizing enzymes, so clinical guidelines inevitably differ between countries and regions. For example, the EGFR mutation rate in lung cancer is about 15% in European and American countries but more than 50% in China 31,32. In China, the domestically developed drugs icotinib and Endostar 33,34,35 are used instead of other first-generation epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs) and bevacizumab, because studies have shown they are as effective as EGFR-TKIs and bevacizumab in Chinese lung cancer patients 36,37. Liu et al. 19 proposed that if the WFO system could offer these two alternative regimens under ‘Recommended’ or ‘For consideration’, the overall concordance for lung cancer in China would increase from 65.8 to 93.2%. Xu et al. 21 likewise attribute the difference in first-line treatment of advanced breast cancer to the fact that CDK4/6 inhibitors cannot be used because they are not marketed in China. Similarly, WFO recommended panitumumab targeted therapy for colon cancer patients, but it is not marketed in China and patients cannot choose it 38. (b) Different drug choices: the chemotherapy regimens WFO recommends comply with the NCCN guidelines but also draw on thousands of clinical practice cases from MSK 16. For example, because surgical methods and guidelines for adjuvant treatment of gastric cancer differ greatly between China and the United States 39,40, applied research on WFO in gastric cancer shows a poor concordance rate. By contrast, adjuvant therapy and drug selection for colon cancer are more consistent between eastern and western countries, so the concordance between WFO and MDT is markedly higher. Liu et al. 19 also noted that WFO recommended concurrent chemoradiation for lung cancer, whereas sequential chemoradiation is performed in China (up to 67% of cases): Chinese patients, whose physique is usually weaker than that of western patients, often cannot tolerate concurrent radiotherapy and chemotherapy, which lowers the concordance rate between WFO and MDT. (c) Complications: comprehensive cancer treatment is continuous, and patients may suffer reversible, transient organ function damage. WFO may sometimes exclude available schemes when selecting candidates solely on the basis of a patient's transiently abnormal biochemical results 41. In Hu's study 18, a blood biochemistry test of a colon cancer patient showed a creatinine clearance rate < 30, so WFO did not recommend the CapeOX (oxaliplatin + capecitabine) scheme; the MDT, however, judged this to be a transient biochemical abnormality, rechecked the creatinine clearance rate one week later, found it > 30, and proceeded with CapeOX. In Liu's study 19, a patient with active pulmonary tuberculosis was also diagnosed with stage III squamous cell lung cancer; had the standard chemoradiotherapy recommended by WFO been given, the tuberculosis might have spread rapidly and caused rapid death. Liu et al. therefore modified the treatment strategy to oral anti-tuberculosis drugs before radiotherapy and chemotherapy. It is thus believed that if such individualized information could be incorporated into WFO, the concordance between WFO and MDT would be greatly improved. (d) Economic factors: in the treatment of breast cancer, for example, WFO recommends trastuzumab for HER2-positive patients, but patients in China are often forced to choose chemotherapy first because of the drug's high price 38. In the Republic of Korea, both WFO and MDT recommend regorafenib for patients with stage IV rectal cancer 42, but some patients still received 5-fluorouracil (5-FU)-based chemotherapy because regorafenib is not only expensive but also not covered by the national health insurance system 16. China likewise needs to consider medical insurance reimbursement, which also affects the concordance between WFO and MDT. If WFO makes targeted improvements to its treatment recommendations for patients with advanced cancer, non-small cell lung cancer, hormone receptor-positive breast cancer, and colorectal cancer with ECOG 1–2 or older age (> 70), it will be more suitable for clinical use in other countries.

Characteristics and limitations of this meta-analysis

Although WFO has gradually been deployed in many countries and regions, and the number of supported cancer types is increasing, evidence-based medicine research on the system is still lacking. To understand the concordance between WFO and MDT and the advantages and disadvantages of WFO in clinical use, and to address the practical problems encountered with the system, we carried out a targeted meta-analysis. Unlike most of the original studies, which examine concordance only at the ‘For consideration’ level (‘Recommended’ or ‘For consideration’) or only at the ‘Recommended’ level (only ‘Recommended’), this research conducts meta-analyses at both levels, which further supports some statistical results from the original studies and provides new statistical evidence. It not only reminds clinicians to pay sufficient attention to patients with advanced cancer, non-small cell lung cancer, Luminal A and B breast cancer, and colorectal cancer with ECOG 1–2 or older age (> 70) when using WFO, but also provides clinical evidence for improving WFO. Of course, this meta-analysis still has certain limitations: (a) Selection bias may exist in a few of the included studies. (b) The sample sizes of some studies are relatively small, and some results are not fully reported, lacking complete data for the four categories. (c) Most studies did not report data on WFO's advantages, such as shortened consultation time or agreement between junior or senior doctors and WFO, so we could not analyze some of those advantages further. (d) All data come from published research or conference abstracts; grey literature is lacking, so literature selection bias is possible. In addition, 182 cases were initially enrolled in Liu's lung cancer study 19; 33 cases not supported by WFO were subsequently excluded, leaving 149 patients. However, the clinical stages of those 33 cases are not listed in detail and could not be included in further meta-analysis. Moreover, the stage distribution in that study is unbalanced, with fewer early-stage patients, unlike the other cancers, in which early-stage patients outnumber late-stage patients. All of this may explain why the conclusions for lung cancer differ from those for other cancers. Finally, the sample size in our systematic review is small, so larger, multi-center, high-quality randomized controlled trials are still needed to reach more reliable conclusions.

To sum up, we should regard WFO as "a tool, not a crutch" 43. Properly used, WFO is a valuable tool: it should complement the doctor's work, not replace it. Oncologists can integrate it with traditional resources such as colleagues' experience and scientific journals to choose the most effective chemotherapy schemes, helping patients obtain more accurate and effective treatment sooner and improving their outcomes. Of course, WFO should also be continuously improved based on clinical use in other countries. People often say that AI will change medicine. Through examples like WFO, we can look forward to AI enabling people all over the world to obtain the best medical care fairly, no matter where or who the patients are 44.

References

1. Denu, R. A. et al. Influence of patient, physician, and hospital characteristics on the receipt of guideline-concordant care for inflammatory breast cancer. Cancer Epidemiol. 40, 7–14 (2016).
2. Woolhandler, S. & Himmelstein, D. U. Administrative work consumes one-sixth of U.S. physicians’ working hours and lowers their career satisfaction. Int. J. Health Serv. 44(4), 635–642 (2014).
3. American Society of Clinical Oncology. The state of cancer care in America, 2016: A report by the American Society of Clinical Oncology. J. Oncol. Pract. 12(4), 339–383 (2016).
4. Yu, P., Artz, D. & Warner, J. Electronic health records (EHRs): Supporting ASCO’s vision of cancer care. Am. Soc. Clin. Oncol. Educ. Book 2014, 225–231 (2014).
5. Castaneda, C. et al. Clinical decision support systems for improving diagnostic accuracy and achieving precision medicine. J. Clin. Bioinform. 5, 4 (2015).
6. Musib, M. et al. Artificial intelligence in research. Science 357(6346), 28–30 (2017).
7. Spangler, S. et al. Automated hypothesis generation based on mining scientific literature. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1877–1886 (New York, NY, USA, 2014).
8. Dayarian, A. et al. Predicting protein phosphorylation from gene expression: Top methods from the IMPROVER Species Translation Challenge. Bioinformatics 31(4), 462–470 (2015).
9. Codella, N. et al. Deep learning, sparse coding, and SVM for melanoma recognition in dermoscopy images. Mach. Learn. Med. Imaging 2015, 118–126 (2015).
10. Gulshan, V. et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316(22), 2402–2410 (2016).
11. Malek, M. et al. A machine learning approach for distinguishing uterine sarcoma from leiomyomas based on perfusion weighted MRI parameters. Eur. J. Radiol. 110, 203–211 (2019).
12. Kawakami, E. et al. Application of artificial intelligence for preoperative diagnostic and prognostic prediction in epithelial ovarian cancer based on blood biomarkers. Clin. Cancer Res. 25(10), 3006–3015 (2019).
13. Li, S. et al. A DNA nanorobot functions as a cancer therapeutic in response to a molecular trigger in vivo. Nat. Biotechnol. 36(3), 258–264 (2018).
14. Lu, H. N. et al. A mathematical-descriptor of tumor-mesoscopic-structure from computed-tomography images annotates prognostic- and molecular-phenotypes of epithelial ovarian cancer. Nat. Commun. 10(1), 764 (2019).
15. Choi, Y. I. et al. Concordance rate between clinicians and Watson for Oncology among patients with advanced gastric cancer: Early, real-world experience in Korea. Can. J. Gastroenterol. Hepatol. 2019, 8072928 (2019).
16. Kim, E. J. et al. Early experience with Watson for Oncology in Korean patients with colorectal cancer. PLoS ONE 14(3), e0213640 (2019).
17. Zhou, N. et al. Concordance study between IBM Watson for Oncology and clinical practice for patients with cancer in China. Oncologist 24(6), 812–819 (2019).
18. Hu, C. L. et al. The application value of Watson for Oncology in patients with colon cancer. Chin. J. Front. Med. Sci. (Electronic Version) 10(10), 116–120 (2018).
19. Liu, C. et al. Using artificial intelligence (Watson for Oncology) for treatment recommendations amongst Chinese patients with lung cancer: Feasibility study. J. Med. Internet Res. 20(9), e11087 (2018).
20. Somashekhar, S. P. et al. Watson for Oncology and breast cancer treatment recommendations: Agreement with an expert multidisciplinary tumor board. Ann. Oncol. 29(2), 418–423 (2018).
21. Xu, J. N., Jiang, Y. J., Duan, Y. Y., Hua, S. Y. & Sun, T. Application of Watson for Oncology on therapy in patients with breast cancer. J. Chin. Res. Hosp. 3, 19–24 (2018).
22. Lee, W. S. et al. Assessing concordance with Watson for Oncology, a cognitive computing decision support system for colon cancer treatment in Korea. JCO Clin. Cancer Inform. 2, 1–8 (2018).
23. Somashekhar, S. P. et al. Early experience with IBM Watson for Oncology (WFO) cognitive computing system for lung and colorectal cancer treatment. J. Clin. Oncol. 35(15 Suppl. 1) (2017).
24. Printz, C. Artificial intelligence platform for oncology could assist in treatment decisions. Cancer 123(6), 905 (2017).
25. Murphy, E. V. Clinical decision support: Effectiveness in improving quality processes and clinical outcomes and factors that may influence success. Yale J. Biol. Med. 87(2), 187–197 (2014).
26. Keiffer, M. R. Utilization of clinical practice guidelines: Barriers and facilitators. Nurs. Clin. N. Am. 50(2), 327–345 (2015).
27. Svenstrup, D., Jørgensen, H. L. & Winther, O. Rare disease diagnosis: A review of web search, social media and large-scale datamining approaches. Rare Dis. 3(1), e1083145 (2015).
28. Zhou, M., Zhao, L., Campy, K. S. & Wang, S. Changing of China’s health policy and doctor-patient relationship: 1949–2016. Health Policy Technol. 6(3), 358–367 (2017).
29. Chan, C. S. Mistrust of physicians in China: Society, institution, and interaction as root causes. Dev. World Bioeth. 18(1), 16–25 (2018).
30. Fang, J. M. et al. The establishment of a new medical model for tumor treatment combined with Watson for Oncology, MDT and patient involvement. J. Clin. Oncol. 36(15 Suppl.), e18504 (2018).
31. Li, T., Kung, H. J., Mack, P. C. & Gandara, D. R. Genotyping and genomic profiling of non-small-cell lung cancer: Implications for current and future therapies. J. Clin. Oncol. 31(8), 1039–1049 (2013).
32. Zhou, C. Lung cancer molecular epidemiology in China: Recent trends. Transl. Lung Cancer Res. 3(5), 270–279 (2014).
33. Lu, S. et al. A multicenter, open-label, randomized phase II controlled study of rh-endostatin (Endostar) in combination with chemotherapy in previously untreated extensive-stage small-cell lung cancer. J. Thorac. Oncol. 10(1), 206–211 (2015).
34. Sun, Y. et al. (Endostar Phase III NSCLC Study Group). Long-term results of a randomized, double-blind, and placebo-controlled phase III trial: Endostar (rh-endostatin) versus placebo in combination with vinorelbine and cisplatin in advanced non-small cell lung cancer. Thorac. Cancer 4(4), 440–448 (2013).
35. Wang, J., Gu, L. J., Fu, C. X., Cao, Z. & Chen, Q. Y. Endostar combined with chemotherapy compared with chemotherapy alone in the treatment of nonsmall lung carcinoma: A meta-analysis based on Chinese patients. Indian J. Cancer 51(Suppl. 3), e106–e109 (2014).
36. Grigoriu, B., Berghmans, T. & Meert, A. P. Management of EGFR mutated nonsmall cell lung carcinoma patients. Eur. Respir. J. 45(4), 1132–1141 (2015).
37. Shi, Y. et al. Icotinib versus gefitinib in previously treated advanced non-small-cell lung cancer (ICOGEN): A randomized, double-blind phase 3 non-inferiority trial. Lancet Oncol. 14(10), 953–961 (2013).
38. Zhou, N., Li, A. Q., Liu, G. W., Zhang, G. Q. & Zhang, X. C. Clinical application of artificial intelligence: Watson for Oncology. China Digit. Med. 13(10), 23–25 (2018).
39. Zhou, J. & Fan, Y. Z. Different methods of alimentary tract reconstruction after gastrectomy. Surg. Res. New Tech. 4(4), 270–277 (2015).
40. Strong, V. E. et al. Comparison of young patients with gastric cancer in the United States and China. Ann. Surg. Oncol. 24(13), 3964–3971 (2017).
41. Wang, C. F. Discussion on the comprehensive treatment and prevention of cancer. World Latest Med. Inf. 18(35), 180–183 (2018).
42. Grothey, A. et al. Regorafenib monotherapy for previously treated metastatic colorectal cancer (CORRECT): An international, multicentre, randomised, placebo-controlled, phase 3 trial. Lancet 381(9863), 303–312 (2013).
43. Hamilton, J. G. et al. “A Tool, Not a Crutch”: Patient perspectives about IBM Watson for Oncology trained by Memorial Sloan Kettering. J. Oncol. Pract. 15(4), e277–e288 (2019).
44. Krittanawong, C., Zhang, H. J., Wang, Z., Aydar, M. & Kitai, T. Artificial intelligence in precision cardiovascular medicine. J. Am. Coll. Cardiol. 69(21), 2657–2664 (2017).


Funding

This work was supported by the Scientific Research and Technology Development Program of Guangxi (No. Guike 14124004) and the Natural Science Foundation of Guangxi (No. GXNSFAA118147).

Author information

These authors contributed equally: Zhou Jie and Zeng Zhiying.

Authors and Affiliations

Department of Gynecologic Oncology, Guangxi Medical University Cancer Hospital, Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumor, Ministry of Education, Nanning, 530021, Guangxi, People’s Republic of China

Zhou Jie & Li Li

Department of Gynecology, The Second Affiliated Hospital, University of South China, Hengyang, 421001, Hunan, People’s Republic of China

Department of Anesthesiology, The Second Affiliated Hospital, University of South China, Hengyang, 421001, Hunan, People’s Republic of China

Zeng Zhiying



Contributions

Conceptualization, L.L. and Z.J.; software, Z.Z.; validation, L.L. and Z.J.; investigation, Z.J.; resources, Z.J.; data curation, Z.Z.; writing—original draft preparation, Z.J.; writing—review and editing, L.L.; visualization, L.L.; supervision, L.L.; project administration, L.L.; funding acquisition, L.L. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Li Li .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Figure 1
Supplementary Figure 2
Supplementary Figure 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit .


About this article

Cite this article

Jie, Z., Zhiying, Z. & Li, L. A meta-analysis of Watson for Oncology in clinical application. Sci Rep 11 , 5792 (2021).


Received : 21 July 2020

Accepted : 25 November 2020

Published : 11 March 2021





How IBM Watson Overpromised and Underdelivered on AI Health Care

After its triumph on Jeopardy!, IBM's AI seemed poised to revolutionize medicine. Doctors are still waiting.

Conceptual photo-illustration imagining IBM’s AI Watson as a concerned doctor, with the Watson logo standing in for the doctor’s face.

In 2014, IBM opened swanky new headquarters for its artificial intelligence division, known as IBM Watson . Inside the glassy tower in lower Manhattan, IBMers can bring prospective clients and visiting journalists into the “immersion room,” which resembles a miniature planetarium. There, in the darkened space, visitors sit on swiveling stools while fancy graphics flash around the curved screens covering the walls. It’s the closest you can get, IBMers sometimes say, to being inside Watson’s electronic brain.

One dazzling 2014 demonstration of Watson’s brainpower showed off its potential to transform medicine using AI—a goal that IBM CEO Virginia Rometty often calls the company’s moon shot. In the demo, Watson took a bizarre collection of patient symptoms and came up with a list of possible diagnoses, each annotated with Watson’s confidence level and links to supporting medical literature.

Within the comfortable confines of the dome, Watson never failed to impress: Its memory banks held knowledge of every rare disease, and its processors weren’t susceptible to the kind of cognitive bias that can throw off doctors. It could crack a tough case in mere seconds. If Watson could bring that instant expertise to hospitals and clinics all around the world, it seemed possible that the AI could reduce diagnosis errors, optimize treatments, and even alleviate doctor shortages—not by replacing doctors but by helping them do their jobs faster and better.

Project: Oncology Expert Advisor

MD Anderson Cancer Center partnered with IBM Watson to create an advisory tool for oncologists. The tool used natural-language processing (NLP) to summarize patients’ electronic health records, then searched databases to provide treatment recommendations. Physicians tried out a prototype in the leukemia department, but MD Anderson canceled the project in 2016—after spending US $62 million on it.

Outside of corporate headquarters, however, IBM has discovered that its powerful technology is no match for the messy reality of today’s health care system. And in trying to apply Watson to cancer treatment, one of medicine’s biggest challenges, IBM encountered a fundamental mismatch between the way machines learn and the way doctors work.

IBM’s bold attempt to revolutionize health care began in 2011. The day after Watson thoroughly defeated two human champions in the game of Jeopardy! , IBM announced a new career path for its AI quiz-show winner: It would become an AI doctor. IBM would take the breakthrough technology it showed off on television—mainly, the ability to understand natural language—and apply it to medicine. Watson’s first commercial offerings for health care would be available in 18 to 24 months, the company promised.

In fact, the projects that IBM announced that first day did not yield commercial products. In the eight years since, IBM has trumpeted many more high-profile efforts to develop AI-powered medical technology—many of which have fizzled, and a few of which have failed spectacularly. The company spent billions on acquisitions to bolster its internal efforts, but insiders say the acquired companies haven’t yet contributed much . And the products that have emerged from IBM’s Watson Health division are nothing like the brilliant AI doctor that was once envisioned: They’re more like AI assistants that can perform certain routine tasks.

“Reputationally, I think they’re in some trouble,” says Robert Wachter , chair of the department of medicine at the University of California, San Francisco, and author of the 2015 book The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age (McGraw-Hill). In part, he says, IBM is suffering from its ambition: It was the first company to make a major push to bring AI to the clinic. But it also earned ill will and skepticism by boasting of Watson’s abilities. “They came in with marketing first, product second, and got everybody excited,” he says. “Then the rubber hit the road. This is an incredibly hard set of problems, and IBM, by being first out, has demonstrated that for everyone else.”

Since 2011, IBM Watson has announced a multitude of projects in health care.

How have they fared?

At a 2017 conference of health IT professionals, IBM CEO Rometty told the crowd that AI “is real, it’s mainstream, it’s here, and it can change almost everything about health care,” and added that it could usher in a medical “golden age.” She’s not alone in seeing an opportunity: Experts in computer science and medicine alike agree that AI has the potential to transform the health care industry. Yet so far, that potential has primarily been demonstrated in carefully controlled experiments. Only a few AI-based tools have been approved by regulators for use in real hospitals and doctors’ offices. Those pioneering products work mostly in the visual realm, using computer vision to analyze images like X-rays and retina scans. (IBM does not have a product that analyzes medical images, though it has an active research project in that area.)

Looking beyond images, however, even today’s best AI struggles to make sense of complex medical information. And encoding a human doctor’s expertise in software turns out to be a very tricky proposition. IBM has learned these painful lessons in the marketplace, as the world watched. While the company isn’t giving up on its moon shot, its launch failures have shown technologists and physicians alike just how difficult it is to build an AI doctor.

The Jeopardy! victory in 2011 showed Watson’s remarkable skill with natural-language processing (NLP). To play the game, it had to parse complicated clues full of wordplay, search massive textual databases to find possible answers, and determine the best one. Watson wasn’t a glorified search engine; it didn’t just return documents based on keywords. Instead it employed hundreds of algorithms to map the “entities” in a sentence and understand the relationships among them. It used this skill to make sense of both the Jeopardy! clue and the millions of text sources it mined.

Project: Cognitive Coaching System

The sportswear company Under Armour teamed up with Watson Health to create a “personal health trainer and fitness consultant.” Using data from Under Armour’s activity-tracker app, the Cognitive Coach was intended to provide customized training programs based on a user’s habits, as well as advice based on analysis of outcomes achieved by similar people. The coach never launched, and Under Armour is no longer working with IBM Watson.

“It almost seemed that Watson could understand the meaning of language, rather than just recognizing patterns of words,” says Martin Kohn , who was the chief medical scientist for IBM Research at the time of the Jeopardy! match. “It was an order of magnitude more powerful than what existed.” What’s more, Watson developed this ability on its own, via machine learning. The IBM researchers trained Watson by giving it thousands of Jeopardy! clues and responses that were labeled as correct or incorrect. In this complex data set, the AI discovered patterns and made a model for how to get from an input (a clue) to an output (a correct response).
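The training loop described here is, at its core, supervised learning: labeled examples in, a predictive model out. As a toy illustration only (a pure-Python word-evidence scorer, not IBM's actual DeepQA pipeline; the example clues are invented), a model can learn to favor candidate responses that resemble past correct examples:

```python
from collections import Counter

# Toy supervised learner: tally word-level evidence from labeled
# (clue words, was_correct) examples, then score new clues by how
# much they resemble the correct ones. Illustrative only.

def train(examples):
    """examples: list of (set_of_clue_words, was_correct) pairs."""
    correct, incorrect = Counter(), Counter()
    for words, label in examples:
        (correct if label else incorrect).update(words)
    return correct, incorrect

def score(model, clue_words):
    correct, incorrect = model
    # Higher score = the clue's words appeared more often in correct examples.
    return sum(correct[w] - incorrect[w] for w in clue_words)

examples = [
    ({"river", "egypt", "longest"}, True),
    ({"river", "mars", "canal"}, False),
]
model = train(examples)
print(score(model, {"river", "egypt"}))  # positive: resembles correct examples
```

Watson's real model combined hundreds of such evidence signals, but the principle is the same: patterns are discovered from labeled data rather than hand-coded.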

Long before Watson starred on the Jeopardy! stage, IBM had considered its possibilities for health care. Medicine, with its reams of patient data, seemed an obvious fit, particularly as hospitals and doctors were switching over to electronic health records. While some of that data can be easily digested by machines, such as lab results and vital-sign measurements, the bulk of it is “unstructured” information, such as doctor’s notes and hospital discharge summaries. That narrative text accounts for about 80 percent of a typical patient’s record—and it’s a stew of jargon, shorthand, and subjective statements.

Kohn, who came to IBM with a medical degree from Harvard University and an engineering degree from MIT, was excited to help Watson tackle the language of medicine. “It seemed like Watson had the potential to overcome those complexities,” he says. By turning its mighty NLP abilities to medicine, the theory went, Watson could read patients’ health records as well as the entire corpus of medical literature: textbooks, peer-reviewed journal articles, lists of approved drugs, and so on. With access to all this data, Watson might become a superdoctor, discerning patterns that no human could ever spot.

“Doctors go to work every day—especially the people on the front lines, the primary care doctors—with the understanding that they cannot possibly know everything they need to know in order to practice the best, most efficient, most effective medicine possible,” says Herbert Chase , a professor of medicine and biomedical informatics at Columbia University who collaborated with IBM in its first health care efforts. But Watson, he says, could keep up—and if turned into a tool for “clinical decision support,” it could enable doctors to keep up, too. In lieu of a Jeopardy! clue, a physician could give Watson a patient’s case history and ask for a diagnosis or optimal treatment plan.

Chase worked with IBM researchers on the prototype for a diagnostic tool, the thing that dazzled visitors in the Watson immersion room. But IBM chose not to commercialize it, and Chase parted ways with IBM in 2014. He’s disappointed with Watson’s slow progress in medicine since then. “I’m not aware of any spectacular home runs,” he says.

He’s one of many early Watson enthusiasts who are now dismayed. Eliot Siegel , a professor of radiology and vice chair of information systems at the University of Maryland, also collaborated with IBM on the diagnostic research. While he thinks AI-enabled tools will be indispensable to doctors within a decade, he’s not confident that IBM will build them. “I don’t think they’re on the cutting edge of AI,” says Siegel. “The most exciting things are going on at Google , Apple, and Amazon.”

As for Kohn, who left IBM in 2014, he says the company fell into a common trap: “Merely proving that you have powerful technology is not sufficient,” he says. “Prove to me that it will actually do something useful—that it will make my life better, and my patients’ lives better.” Kohn says he’s been waiting to see peer-reviewed papers in the medical journals demonstrating that AI can improve patient outcomes and save health systems money. “To date there’s very little in the way of such publications,” he says, “and none of consequence for Watson.”

AI’s First Foray Into Health Care

Doctors are a conservative bunch—for good reason—and slow to adopt new technologies. But in some areas of health care, medical professionals are beginning to see artificially intelligent systems as reliable and helpful. Here are a few early steps toward AI medicine.

In trying to bring AI into the clinic, IBM was taking on an enormous technical challenge. But having fallen behind tech giants like Google and Apple in many other computing realms, IBM needed something big to stay relevant. In 2014, the company invested US $1 billion in its Watson unit , which was developing tech for multiple business sectors. In 2015, IBM announced the formation of a special Watson Health division , and by mid-2016 Watson Health had acquired four health-data companies for a total cost of about $4 billion. It seemed that IBM had the technology, the resources, and the commitment necessary to make AI work in health care.

Today, IBM’s leaders talk about the Watson Health effort as “a journey” down a road with many twists and turns. “It’s a difficult task to inject AI into health care, and it’s a challenge. But we’re doing it,” says John E. Kelly III, IBM senior vice president for cognitive solutions and IBM research. Kelly has guided the Watson effort since the Jeopardy! days, and in late 2018 he also assumed direct oversight of Watson Health. He says the company has pivoted when it needs to: “We’re continuing to learn, so our offerings change as we learn.”

Project: Sugar.IQ

Medtronic and Watson Health began working together in 2015 on an app for personalized diabetes management. The app works with data from Medtronic’s continuous glucose monitor, and helps diabetes patients track how their medications, food, and lifestyle choices affect their glucose levels. The FDA-approved app launched in 2018.

The diagnostic tool, for example, wasn’t brought to market because the business case wasn’t there, says Ajay Royyuru , IBM’s vice president of health care and life sciences research. “Diagnosis is not the place to go,” he says. “That’s something the experts do pretty well. It’s a hard task, and no matter how well you do it with AI, it’s not going to displace the expert practitioner.” (Not everyone agrees with Royyuru: A 2015 report on diagnostic errors from the National Academies of Sciences, Engineering, and Medicine stated that improving diagnoses represents a “moral, professional, and public health imperative.”)

In an attempt to find the business case for medical AI, IBM pursued a dizzying number of projects targeted to all the different players in the health care system: physicians, administrative staff, insurers, and patients. What ties all the threads together, says Kelly, is an effort to provide “decision support using AI [that analyzes] massive data sets.” IBM’s most publicized project focused on oncology, where it hoped to deploy Watson’s “cognitive” abilities to turn big data into personalized cancer treatments for patients.

In many attempted applications, Watson’s NLP struggled to make sense of medical text—as have many other AI systems. “We’re doing incredibly better with NLP than we were five years ago, yet we’re still incredibly worse than humans,” says Yoshua Bengio , a professor of computer science at the University of Montreal and a leading AI researcher. In medical text documents, Bengio says, AI systems can’t understand ambiguity and don’t pick up on subtle clues that a human doctor would notice. Bengio says current NLP technology can help the health care system: “It doesn’t have to have full understanding to do something incredibly useful,” he says. But no AI built so far can match a human doctor’s comprehension and insight. “No, we’re not there,” he says.

IBM’s work on cancer serves as the prime example of the challenges the company encountered. “I don’t think anybody had any idea it would take this long or be this complicated,” says Mark Kris , a lung cancer specialist at Memorial Sloan Kettering Cancer Center, in New York City, who has led his institution’s collaboration with IBM Watson since 2012.

The effort to improve cancer care had two main tracks. Kris and other preeminent physicians at Sloan Kettering trained an AI system that became the product Watson for Oncology in 2015. Across the country, preeminent physicians at the University of Texas MD Anderson Cancer Center, in Houston, collaborated with IBM to create a different tool called Oncology Expert Advisor. MD Anderson got as far as testing the tool in the leukemia department, but it never became a commercial product.

Both efforts have received strong criticism. One excoriating article about Watson for Oncology alleged that it provided useless and sometimes dangerous recommendations (IBM contests these allegations ). More broadly, Kris says he has often heard the critique that the product isn’t “real AI.” And the MD Anderson project failed dramatically: A 2016 audit by the University of Texas found that the cancer center spent $62 million on the project before canceling it. A deeper look at these two projects reveals a fundamental mismatch between the promise of machine learning and the reality of medical care—between “real AI” and the requirements of a functional product for today’s doctors.

Watson for Oncology was supposed to learn by ingesting the vast medical literature on cancer and the health records of real cancer patients. The hope was that Watson, with its mighty computing power, would examine hundreds of variables in these records—including demographics, tumor characteristics, treatments, and outcomes—and discover patterns invisible to humans. It would also keep up to date with the bevy of journal articles about cancer treatments being published every day. To Sloan Kettering’s oncologists, it sounded like a potential breakthrough in cancer care. To IBM, it sounded like a great product. “I don’t think anybody knew what we were in for,” says Kris.

Watson learned fairly quickly how to scan articles about clinical studies and determine the basic outcomes. But it proved impossible to teach Watson to read the articles the way a doctor would. “The information that physicians extract from an article, that they use to change their care, may not be the major point of the study,” Kris says. Watson’s thinking is based on statistics, so all it can do is gather statistics about main outcomes, explains Kris. “But doctors don’t work that way.”

In 2018, for example, the FDA approved a new “tissue agnostic” cancer drug that is effective against all tumors that exhibit a specific genetic mutation. The drug was fast-tracked based on dramatic results in just 55 patients, of whom four had lung cancer. “We’re now saying that every patient with lung cancer should be tested for this gene,” Kris says. “All the prior guidelines have been thrown out, based on four patients.” But Watson won’t change its conclusions based on just four patients. To solve this problem, the Sloan Kettering experts created “synthetic cases” that Watson could learn from, essentially make-believe patients with certain demographic profiles and cancer characteristics. “I believe in analytics; I believe it can uncover things,” says Kris. “But when it comes to cancer, it really doesn’t work.”

Do You Agree?

Several studies have compared Watson for Oncology’s cancer treatment recommendations to those of hospital oncologists. The concordance percentages indicate how often Watson’s advice matched the experts’ treatment plans.
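Concordance here is simply the fraction of cases in which the AI's top recommendation matched the tumor board's plan. A minimal sketch of the calculation (the case-by-case treatment choices below are invented for illustration, not data from the studies):

```python
def concordance(ai_recs, expert_recs):
    """Fraction of cases where the AI's recommendation matched the experts'."""
    matches = sum(a == e for a, e in zip(ai_recs, expert_recs))
    return matches / len(expert_recs)

# Hypothetical recommendations for eight patients.
ai     = ["chemo", "surgery", "chemo", "radiation", "chemo", "surgery", "chemo", "chemo"]
expert = ["chemo", "surgery", "radiation", "radiation", "chemo", "chemo", "chemo", "chemo"]
print(f"{concordance(ai, expert):.0%}")  # 75%
```

Note that concordance measures agreement with human practice, not patient outcomes, which is why the studies below stop short of showing clinical benefit.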

The realization that Watson couldn’t independently extract insights from breaking news in the medical literature was just the first strike. Researchers also found that it couldn’t mine information from patients’ electronic health records as they’d expected.

At MD Anderson, researchers put Watson to work on leukemia patients’ health records—and quickly discovered how tough those records were to work with. Yes, Watson had phenomenal NLP skills. But in these records, data might be missing, written down in an ambiguous way, or out of chronological order. In a 2018 paper published in The Oncologist , the team reported that its Watson-powered Oncology Expert Advisor had variable success in extracting information from text documents in medical records. It had accuracy scores ranging from 90 to 96 percent when dealing with clear concepts like diagnosis, but scores of only 63 to 65 percent for time-dependent information like therapy timelines.

In a final blow to the dream of an AI superdoctor, researchers realized that Watson can’t compare a new patient with the universe of cancer patients who have come before to discover hidden patterns. Both Sloan Kettering and MD Anderson hoped that the AI would mimic the abilities of their expert oncologists, who draw on their experience of patients, treatments, and outcomes when they devise a strategy for a new patient. A machine that could do the same type of population analysis—more rigorously, and using thousands more patients—would be hugely powerful.

But the health care system’s current standards don’t encourage such real-world learning. MD Anderson’s Oncology Expert Advisor issued only “evidence based” recommendations linked to official medical guidelines and the outcomes of studies published in the medical literature. If an AI system were to base its advice on patterns it discovered in medical records—for example, that a certain type of patient does better on a certain drug—its recommendations wouldn’t be considered evidence based, the gold standard in medicine. Without the strict controls of a scientific study, such a finding would be considered only correlation, not causation.

Kohn, formerly of IBM, and many others think the standards of health care must change in order for AI to realize its full potential and transform medicine. “The gold standard is not really gold,” Kohn says. AI systems could consider many more factors than will ever be represented in a clinical trial, and could sort patients into many more categories to provide “truly personalized care,” Kohn says. Infrastructure must change too: Health care institutions must agree to share their proprietary and privacy-controlled data so AI systems can learn from millions of patients followed over many years.

According to anecdotal reports , IBM has had trouble finding buyers for its Watson oncology product in the United States. Some oncologists say they trust their own judgment and don’t need Watson telling them what to do. Others say it suggests only standard treatments that they’re well aware of. But Kris says some physicians are finding it useful as an instant second opinion that they can share with nervous patients. “As imperfect as it is, and limited as it is, it’s very helpful,” Kris says. IBM sales reps have had more luck outside the United States, with hospitals in India, South Korea, Thailand, and beyond adopting the technology. Many of these hospitals proudly use the IBM Watson brand in their marketing, telling patients that they’ll be getting AI-powered cancer care.

In the past few years, these hospitals have begun publishing studies about their experiences with Watson for Oncology. In India, physicians at the Manipal Comprehensive Cancer Center evaluated Watson on 638 breast cancer cases and found a 73 percent concordance rate in treatment recommendations; its score was brought down by poor performance on metastatic breast cancer. Watson fared worse at Gachon University Gil Medical Center, in South Korea, where its top recommendations for 656 colon cancer patients matched those of the experts only 49 percent of the time. Doctors reported that Watson did poorly with older patients, didn’t suggest certain standard drugs, and had a bug that caused it to recommend surveillance instead of aggressive treatment for certain patients with metastatic cancer.

These studies aimed to determine whether Watson for Oncology’s technology performs as expected. But no study has yet shown that it benefits patients. Wachter of UCSF says that’s a growing problem for the company: “IBM knew that the win on Jeopardy! and the partnership with Memorial Sloan Kettering would get them in the door. But they needed to show, fairly quickly, an impact on hard outcomes.” Wachter says IBM must convince hospitals that the system is worth the financial investment. “It’s really important that they come out with successes,” he says. “Success is an article in the New England Journal of Medicine showing that when we used Watson, patients did better or we saved money.” Wachter is still waiting to see such articles appear.

Sloan Kettering’s Kris isn’t discouraged; he says the technology will only get better. “As a tool, Watson has extraordinary potential,” he says. “I do hope that the people who have the brainpower and computer power stick with it. It’s a long haul, but it’s worth it.”

Some success stories are emerging from Watson Health—in certain narrow and controlled applications, Watson seems to be adding value. Take, for example, the Watson for Genomics product, which was developed in partnership with the University of North Carolina, Yale University, and other institutions. The tool is used by genetics labs that generate reports for practicing oncologists: Watson takes in the file that lists a patient’s genetic mutations, and in just a few minutes it can generate a report that describes all the relevant drugs and clinical trials. “We enable the labs to scale,” says Vanessa Michelini , an IBM Distinguished Engineer who led the development and 2016 launch of the product.

Watson has a relatively easy time with genetic information, which is presented in structured files and has no ambiguity—either a mutation is there, or it’s not. The tool doesn’t employ NLP to mine medical records, instead using it only to search textbooks, journal articles, drug approvals, and clinical trial announcements, where it looks for very specific statements.
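Because a variant file is structured, matching it against a curated knowledge base is straightforward lookup rather than language understanding. A hedged sketch of that kind of report step (the knowledge base, gene-drug pairings, and patient data below are simplified illustrations, not Watson for Genomics internals):

```python
# Toy variant-to-therapy lookup over a structured mutation list.
KNOWLEDGE_BASE = {
    ("EGFR", "L858R"): ["erlotinib", "osimertinib"],
    ("BRAF", "V600E"): ["vemurafenib"],
}

def report(mutations):
    """mutations: list of (gene, variant) tuples from a structured file."""
    lines = []
    for gene, variant in mutations:
        drugs = KNOWLEDGE_BASE.get((gene, variant))
        if drugs:
            lines.append(f"{gene} {variant}: consider {', '.join(drugs)}")
    return lines

patient = [("EGFR", "L858R"), ("TP53", "R175H")]
for line in report(patient):
    print(line)
```

Either a mutation key is present or it isn't; there is no ambiguity to resolve, which is why this narrow task suited Watson far better than mining free-text records.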

IBM’s partners at the University of North Carolina published the first paper about the effectiveness of Watson for Genomics in 2017. For 32 percent of cancer patients enrolled in that study, Watson spotted potentially important mutations not identified by a human review, which made these patients good candidates for a new drug or a just-opened clinical trial. But there’s no indication, as of yet, that Watson for Genomics leads to better outcomes.

The U.S. Department of Veterans Affairs uses Watson for Genomics reports in more than 70 hospitals nationwide, says Michael Kelley , the VA’s national program director for oncology. The VA first tried the system on lung cancer and now uses it for all solid tumors. “I do think it improves patient care,” Kelley says. When VA oncologists are deciding on a treatment plan, “it is a source of information they can bring to the discussion,” he says. But Kelley says he doesn’t think of Watson as a robot doctor. “I tend to think of it as a robot who is a master medical librarian.”

Most doctors would probably be delighted to have an AI librarian at their beck and call—and if that’s what IBM had originally promised them, they might not be so disappointed today. The Watson Health story is a cautionary tale of hubris and hype. Everyone likes ambition, everyone likes moon shots, but nobody wants to climb into a rocket that doesn’t work.

So Far, Few Successes

IBM began its effort to bring Watson into the health care industry in 2011. Since then, the company has made nearly 50 announcements about partnerships that were intended to develop new AI-enabled tools for medicine. Some collaborations worked on tools for doctors and institutions; some worked on consumer apps. While many of these alliances have not yet led to commercial products, IBM says the research efforts have been valuable, and that many relationships are ongoing. Here’s a representative sample of projects.

This article appears in the April 2019 print issue as “IBM Watson, Heal Thyself.”




A Survey on IBM Watson and Its Services

Avinash Kumar 1 , Pallapothala Tejaswini 1 , Omprakash Nayak 1 , Anurag Deep Kujur 1 , Rajkiran Gupta 1 , Ashish Rajanand 1 and Mridu Sahu 1

Published under licence by IOP Publishing Ltd in Journal of Physics: Conference Series, Volume 2273, International Conference on Applications of Intelligent Computing in Engineering and Science (AICES-2022), 12th–13th Feb 2022, online. Citation: Avinash Kumar et al 2022 J. Phys.: Conf. Ser. 2273 012022. DOI 10.1088/1742-6596/2273/1/012022


Author affiliations

1 Department Of Information Technology National Institute of Technology, Raipur


Artificial intelligence (AI) is changing modern life by helping people do their jobs more efficiently. Although the field is still in its early stages, it is already of great use. IBM Watson is an AI platform used globally by organizations, institutes, and corporations. In this paper we create a chatbot using IBM Watson Assistant that answers queries about diseases and hospitals. The paper also discusses IBM Watson in detail, covering its applications, how it works, and case studies of its use in health care, visual recognition, and at the software company Box.


Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence . Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

Introduction to "This is Watson"

In 2007, IBM Research took on the grand challenge of building a computer system that could compete with champions at the game of Jeopardy!™. In 2011, the open-domain question-answering (QA) system, dubbed Watson, beat the two highest-ranked players in a nationally televised two-game Jeopardy! match. This paper provides a brief history of the events and ideas that positioned our team to take on the Jeopardy! challenge, build Watson, and ultimately triumph. It describes both the nature of the QA challenge represented by Jeopardy! and our overarching technical approach. The main body of this paper provides a narrative of the DeepQA processing pipeline to introduce the articles in this special issue and put them in context of the overall system. Finally, this paper summarizes our main results, describing how the system, as a holistic combination of many diverse algorithmic techniques, performed at champion levels, and it briefly discusses the team's future research plans.


  • David A. Ferrucci

How IBM’s Watson Went From the Future of Health Care to Sold Off for Parts

Most likely, you’re familiar with Watson from the IBM computer system’s appearance on Jeopardy! in 2011, when it beat former champions Ken Jennings and Brad Rutter. Watson’s time on Jeopardy! was fun viewing, but it was also a very savvy public debut of a product that IBM wanted to sell: Watson Health.

Watson Health was supposed to change health care in a lot of important ways, by providing insight to oncologists about care for cancer patients, delivering insight to pharmaceutical companies about drug development, helping to match patients with clinical trials, and more. It sounded revolutionary, but it never really worked. Recently, Watson Health was, essentially, sold for parts: Francisco Partners, a private equity firm, bought some of Watson’s data and analytics products for what Bloomberg News said was more than $1 billion.

On Friday’s episode of What Next: TBD, I spoke with Casey Ross , technology correspondent for Stat News, who has been covering Watson Health for years, about how Watson went from being the future of health care to being sold for scraps. Our conversation has been edited and condensed for clarity.

Lizzie O’Leary: I look at the amount of money that went into pulling this together. Acquisition after acquisition. It was billions of dollars, and it sold for a billion in the end. Is there any way to read that as anything but a failure?

Casey Ross: Financially, certainly not. They spent way more money building this than they got back. The acquisitions alone cost them $5 billion. That it was sold so many years later, after so much effort (7,000 employees at one point), means this reads as a total failure: they needed to just cut their losses and move on.

Why did IBM want to get into the health data business? What problem did they think Watson would help solve?

There’s a tremendous amount of information that is collected every day on the care of hundreds of millions of people. However, there is currently no way to connect that information, to link it to an individual across all the domains in which they get care, and then to develop a holistic picture of who they are, of what their diseases are, of what the best treatments are, and how to ensure that they get the best care at the lowest possible cost. There is no connectivity right now that can do that at scale. The people in the technology sector look at it and say, “This has to be fixed, and we’re going to fix it.”

Google, Microsoft, a lot of very big companies are extremely interested in health care. What is so attractive for these big tech companies about health care?

It’s one of the biggest parts of our economy. It’s a $3 trillion business that has legacy technology infrastructure that should be embarrassing. Tech companies are drawn to audacious challenges like this, and to ones where they can make, if they’re successful, a ton of money.

That’s how things are today, but the same problems have been around since the advent of digitized data. In 2012, IBM closed a deal with Memorial Sloan Kettering, one of the preeminent cancer centers in the country, to train an AI to make treatment recommendations. What was the goal? What were they trying to do?

They were really trying to democratize the expertise of Memorial Sloan Kettering’s oncologists, to make that expertise available to patients all over the world and to develop this standardized engine for providing optimal treatment recommendations, customized to a patient, in front of a doctor, thousands of miles away. It was a beautiful notion. They were trying to say, “Well, let’s make it more objective. Let’s look at all of the data, and let’s tell every physician, for this patient in front of you, this is how they should be treated.”

So you get your biopsy results, and things don’t look good, but you’re not just getting the expertise or the biases of your particular oncologist. You’re getting the wealth of thousands of oncologists distilled into an algorithm?

Yes, you are getting all of that data, across so many different physicians, crunched down into a very digestible format and recommendation that could then lead to the best treatment for that patient.

Reading your reporting, it sounds like this was incredibly important to IBM. In 2015, Ginni Rometty, who was the CEO at the time, went on Charlie Rose . She said health care was “our moonshot.” How much of IBM’s hopes were hung on this thing?

The company made a huge bet that this could be the bridge to a different kind of future for IBM, which at the time was several years of quarterly revenue declines. They were trying to use Watson as a bridge to a different future where IBM wasn’t this old guard hardware company that everybody knew so well, but was operating on the cutting edge of artificial intelligence. Health care was the biggest, the buzziest use case. This was where they were going to really show the surpassing value of their technology.

To do that, IBM needed massive amounts of data on which to train Watson. It got that data through acquisitions, eventually spending some $5 billion buying a series of health data companies. What were those companies?

Truven, Phytel, Explorys, and Merge. Truven had the biggest insurance database in the nation, with 300 million covered lives; Explorys provided a clinical data set of actual electronic health records kept by health systems, representing about 50 million or so patients; Phytel added on top of that; and Merge had a huge imaging database. They had all this data, and the idea was: expose Watson to it, and it finds patterns that physicians and anyone else can’t possibly find when looking at that data, given all the variables in it.

Except that was not the reality. One of IBM’s high-profile partnerships with MD Anderson Cancer Center in Texas fell apart. A doctor involved said that there wasn’t enough data for the program to make good recommendations, and that Watson had trouble with the complexity of patient files. The partnership was later audited and shelved. What went wrong?

If you think about it, knowing what we know now or what we’ve learned through this, the notion that you’re going to take an artificial intelligence tool, expose it to data on patients who were cared for on the upper east side of Manhattan, and then use that information and the insights derived from it to treat patients in China, is ridiculous. You need to have representative data. The data from New York is just not going to generalize to different kinds of patients all the way across the world.

What was happening in a clinical setting? What was happening to patients?

Our window through the reporting was talking to physicians. We got concerns from them that the recommendations that it was giving were just not relevant. Maybe it would suggest a particular kind of treatment that wasn’t available in the locality in which it was making the recommendation, or the recommendation did not at all square with the treatment protocols that were in use at the local institution or, and more commonly so, especially in the U.S. and Europe, “you’re not telling me anything I don’t already know.” That was the big credibility gap for physicians. It was like, “Well duh. Yeah, I know that that’s the chemotherapy I should pursue. I know that this treatment follows that one.”

You got a hold of an internal IBM presentation from 2017 where a doctor at a hospital in Florida told the company this product was a piece of shit.

Seeing that written down in an internal document, which was circulated among IBM executives, was a shocking thing to see. It really underscored the extent of the gap between what IBM was saying in public and what was happening behind the scenes.

There were a lot of internal discussions, even a presentation, that indicated that the technology was not as far along as they’d hoped, that it wasn’t able to accomplish what they set out to accomplish in cancer care. There were probably a lot of people that believed, that truly did believe, that they would get there or that it was closer than maybe some people realized. I think the marketing got way ahead of the capabilities.

It’s very hard to listen to you and not think about Theranos , even though this is not a one-to-one parallel in any way. When you are trying to move by leaps and bounds with technology in the health care sector, it feels like a reminder that all things are not created equal, that making big leaps with people’s health is a much riskier proposition.

That underscores the central theme of this story: When you try to combine the bravado of the tech culture and the notion that you can achieve these huge audacious goals in a domain where you’re dealing with people’s lives and health and the most sacrosanct aspects of their existence and their bodies, you need to have evidence to back up that you can do what you say you can do.

Why did they continue on trying to rescue this product that they seemed to know internally was failing?

I think they had so much invested in it that it really was, for them, too big to fail. It had 7,000 employees. They’d invested so much time, energy, and marketing in the success of the product that they really needed it to succeed.

Instead, it failed. But Watson’s fate certainly doesn’t mean that AI in health care is going away. Just recently, Microsoft and a large group of hospitals announced a coalition to develop AI solutions in health care. If you had to pin down a moral to the story, is it that AI in health care isn’t ready for prime time, or that IBM did it wrong?

I think it’s both of those. This will be a case study for business schools for decades. When you look at what IBM did and the strategy mistakes, the tactical errors that they made in pursuing this product, they made a lot of unforced errors here. It’s also true that the generation of technology that they had was nowhere near ready to accomplish the things that they set out to accomplish and promised that they could accomplish. I don’t think that the failure of Watson means that artificial intelligence isn’t ready to make significant improvements and changes in health care. I think it means the way that they approached it is a cautionary tale that lays out how not to do it.

Does the failure of Watson Health make you worry that it’s going to shut down other avenues for innovation? Will such a spectacular belly flop impede progress?

I don’t think so. There were so many mistakes that were made, that were learned from, that, if anything, it will facilitate faster learning and better decision making by other parties that are now poised to disrupt health care and make the progress that IBM failed to achieve. There’s a saying that pioneers often end up with arrows in their backs, and that’s what happened here. They’re an example, a spectacular example, of wrongheaded decision making and missteps that didn’t have to happen. By learning from that, I think advancement and progress and true benefits will be faster coming.

Future Tense is a partnership of Slate , New America , and Arizona State University that examines emerging technologies, public policy, and society.


MIT News | Massachusetts Institute of Technology


IBM and MIT to pursue joint research in artificial intelligence, establish new MIT-IBM Watson AI Lab


IBM and MIT today announced that IBM plans to make a 10-year, $240 million investment to create the MIT–IBM Watson AI Lab in partnership with MIT. The lab will carry out fundamental artificial intelligence (AI) research and seek to propel scientific breakthroughs that unlock the potential of AI. The collaboration aims to advance AI hardware, software, and algorithms related to deep learning and other areas; increase AI’s impact on industries, such as health care and cybersecurity; and explore the economic and ethical implications of AI on society. IBM’s $240 million investment in the lab will support research by IBM and MIT scientists.

The new lab will be one of the largest long-term university-industry AI collaborations to date, mobilizing the talent of more than 100 AI scientists, professors, and students to pursue joint research at IBM's Research Lab in Cambridge, Massachusetts — co-located with the IBM Watson Health and IBM Security headquarters in Kendall Square — and on the neighboring MIT campus.

The lab will be co-chaired by Dario Gil, IBM Research VP of AI and IBM Q, and Anantha P. Chandrakasan, dean of MIT’s School of Engineering. (Read a related Q&A with Chandrakasan.) IBM and MIT plan to issue a call for proposals to MIT researchers and IBM scientists to submit their ideas for joint research to push the boundaries in AI science and technology in several areas, including:

  • AI algorithms: Developing advanced algorithms to expand capabilities in machine learning and reasoning. Researchers will create AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust, continuous learning. Researchers will invent new algorithms that can not only leverage big data when available, but also learn from limited data to augment human intelligence.
  • Physics of AI: Investigating new AI hardware materials, devices, and architectures that will support future analog computational approaches to AI model training and deployment, as well as the intersection of quantum computing and machine learning. The latter involves using AI to help characterize and improve quantum devices, and researching the use of quantum computing to optimize and speed up machine-learning algorithms and other AI applications.
  • Application of AI to industries: Given its location in IBM Watson Health and IBM Security headquarters in Kendall Square, a global hub of biomedical innovation, the lab will develop new applications of AI for professional use, including fields such as health care and cybersecurity. The collaboration will explore the use of AI in areas such as the security and privacy of medical data, personalization of health care, image analysis, and the optimum treatment paths for specific patients.
  • Advancing shared prosperity through AI : The MIT–IBM Watson AI Lab will explore how AI can deliver economic and societal benefits to a broader range of people, nations, and enterprises. The lab will study the economic implications of AI and investigate how AI can improve prosperity and help individuals achieve more in their lives.

In addition to IBM’s plan to produce innovations that advance the frontiers of AI, a distinct objective of the new lab is to encourage MIT faculty and students to launch companies that will focus on commercializing AI inventions and technologies that are developed at the lab. The lab’s scientists also will publish their work, contribute to the release of open source material, and foster an adherence to the ethical application of AI.

“The field of artificial intelligence has experienced incredible growth and progress over the past decade. Yet today’s AI systems, as remarkable as they are, will require new innovations to tackle increasingly difficult real-world problems to improve our work and lives,” says John Kelly III, IBM senior vice president, Cognitive Solutions and Research. “The extremely broad and deep technical capabilities and talent at MIT and IBM are unmatched, and will lead the field of AI for at least the next decade.”

“I am delighted by this new collaboration,” MIT President L. Rafael Reif says. “True breakthroughs are often the result of fresh thinking inspired by new kinds of research teams. The combined MIT and IBM talent dedicated to this new effort will bring formidable power to a field with staggering potential to advance knowledge and help solve important challenges.”

Both MIT and IBM have been pioneers in artificial intelligence research, and the new AI lab builds on a decades-long research relationship between the two. In 2016, IBM Research announced a multiyear collaboration with MIT’s Department of Brain and Cognitive Sciences to advance the scientific field of machine vision, a core aspect of artificial intelligence. The collaboration has brought together leading brain, cognitive, and computer scientists to conduct research in the field of unsupervised machine understanding of audio-visual streams of data, using insights from next-generation models of the brain to inform advances in machine vision. In addition, IBM and the Broad Institute of MIT and Harvard have established a five-year, $50 million research collaboration on AI and genomics.

MIT researchers were among those who helped coin and popularize the very phrase “artificial intelligence” in the 1950s. MIT pushed several major advances in the subsequent decades, from neural networks to data encryption to quantum computing to crowdsourcing. Marvin Minsky, a founder of the discipline, collaborated on building the first artificial neural network and he, along with Seymour Papert, advanced learning algorithms. Currently, the Computer Science and Artificial Intelligence Laboratory, the Media Lab, the Department of Brain and Cognitive Sciences, the Center for Brains, Minds and Machines, and the MIT Institute for Data, Systems, and Society serve as connected hubs for AI and related research at MIT.

For more than 20 years, IBM has explored the application of AI across many areas and industries. IBM researchers invented and built Watson, which is a cloud-based AI platform being used by businesses, developers, and universities to fight cancer, improve classroom learning, minimize pollution, enhance agriculture and oil and gas exploration, better manage financial investments, and much more. Today, IBM scientists across the globe are working on fundamental advances in AI algorithms, science and technology that will pave the way for the next generation of artificially intelligent systems.

For information about employment opportunities with IBM at the new AI Lab, please visit


Press Mentions

Forbes reporter Maribel Lopez writes about how researchers at the MIT-IBM Watson AI Lab are tackling a variety of AI challenges with real-world applications. Lopez notes that it’s great to see organizations like MIT and IBM coming together to “bridge the gap between science and practical AI solutions that can be used for both commercial and social good.”

Associated Press

IBM is joining forces with MIT to establish a new lab dedicated to fundamental AI research, reports the AP. The new lab will focus on, “advancing the hardware, software and algorithms used for artificial intelligence. It also will tackle some of the economic and ethical implications of intelligent machines and look at its commercial application.”

CNBC reporter Jordan Novet writes that MIT and IBM have established a new lab to pursue fundamental AI research. Novet notes that MIT, “was home to one of the first AI labs and continues to be well regarded as a place to do work in the sector.”

Fortune

Writing for Fortune , Barb Darrow highlights how IBM has committed $240 million to establish a new joint AI lab with MIT. Darrow explains that, “the resulting MIT–IBM Watson AI Lab will focus on a handful of key AI areas including the development of new 'deep learning' algorithms.”

IBM has invested $240 million to develop a new AI research lab with MIT, reports Jing Cao for Bloomberg News. “The MIT-IBM Watson AI Lab will fund projects in four broad areas, including creating better hardware to handle complex computations and figuring out applications of AI in specific industries,” Cao explains. 

Boston Globe

Boston Globe reporter Andy Rosen writes that MIT and IBM have established a new AI research lab.  “It’s amazing that we have a company that’s also interested in the fundamental research,” explains Anantha Chandrakasan, dean of the School of Engineering. “That’s very basic research that may not be in a product next year, but provides very important insights.”





IBM Extends Reach of Watsonx With Products and Partnerships


One of several new products unveiled at IBM's Think 2024 event was Concert, a genAI-powered automation tool designed to parse through an enterprise's applications looking for potential problems.

As excited as many companies may be about integrating generative AI into their organizations, many are discovering there are still lots of small stumbling blocks along the way. At this week’s Think event, IBM worked to smooth out several of these bumps via enhancements to its watsonx AI platform as well as a wide range of new partnerships with other big tech companies.

A year after IBM first unveiled watsonx, the company added several capabilities that extend the platform even deeper into the world of assisted programming tools, automation-focused digital assistants and more. Early implementations of GenAI have shown coding assistance to be a very effective application of the technology, so it makes sense for IBM to expand its offerings there. Plus, a number of its efforts help address another challenge that some long-time IBM customers are currently facing: modernizing apps for Z mainframes and some of IBM’s other older technologies.

The new Code Assistant for Z and Code Assistant for specialized domains leverage IBM’s newly updated assistant technology as well as its own open-source Granite code foundation models, which were also announced at Think. Both assistants are designed to address the tough, real-world issues facing organizations that no longer have employees who know how to program in COBOL and other older languages but still depend on some of these applications for critical functions within their companies. IBM touted that the Granite code models outperformed many other larger open-source models on key coding benchmarks.

IBM also had new tools for addressing the challenge of data ingestion, again using its new Granite models. A tuned 20 billion parameter version of the Granite base code model was optimized to generate SQL commands from natural language interactions, making the process of integrating data from older SQL databases into GenAI inferencing workloads and applications significantly easier.
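The text-to-SQL task that the tuned Granite model addresses can be pictured with a deliberately tiny stand-in: a couple of hand-written patterns in place of a learned model. The table and column names below are invented, and a real model generalizes far beyond fixed templates:

```python
# Toy natural-language-to-SQL mapper. A tuned LLM learns this mapping from
# data; hand-written regex patterns stand in for the model here, and the
# schema names are hypothetical.
import re

def nl_to_sql(question: str) -> str:
    """Translate a narrow set of English question shapes into SQL."""
    q = question.lower().strip("?")
    # "How many <table> are there?" -> COUNT query
    m = re.match(r"how many (\w+) are there", q)
    if m:
        return f"SELECT COUNT(*) FROM {m.group(1)};"
    # "List all <table> where <column> is <value>" -> filtered SELECT
    m = re.match(r"list all (\w+) where (\w+) is (\w+)", q)
    if m:
        table, col, val = m.groups()
        return f"SELECT * FROM {table} WHERE {col} = '{val}';"
    raise ValueError("question pattern not recognized")
```

For example, `nl_to_sql("How many customers are there?")` yields `SELECT COUNT(*) FROM customers;`. The point of a model like Granite is precisely to escape this brittleness: it handles paraphrases and unseen schemas that no fixed pattern list could cover.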

In addition to IBM’s own models, the company also announced partnerships with AWS, Meta, Adobe, Microsoft, Mistral, Salesforce, and SAP, among others, to integrate or co-develop new models for watsonx. While the specifics of the partnerships vary, all provide an easier way for customers who work with both IBM and the partner companies to use their tools together, another challenge that’s plagued some early GenAI implementations.


Of particular note, IBM’s partnership with AWS brings IBM’s watsonx.governance features and capabilities into Amazon’s SageMaker service for building and training AI and ML models. This is a great win for both companies as it allows IBM to extend its expertise in governance to a much broader set of customers and gives Amazon customers a significantly more robust way of managing the issues around AI model governance. Plus, it provides a great potential onramp for AWS customers to explore more of watsonx as all of it is now available on the AWS Marketplace in conjunction with this announcement.

One of the most intriguing technology-focused announcements from IBM was the debut of something called InstructLAB, which was developed in conjunction with IBM Research and Red Hat. Basically, InstructLAB provides an easier and more efficient manner of training and fine-tuning models through the use of synthetic data. In some ways the technology is similar to RAG (Retrieval Augmented Generation) in that it can help improve the quality and specificity of the output from a model without needing to do full model training. The key difference is that RAG achieves this by pulling in data from a local dataset as it performs each query to refine the results, while InstructLAB is designed to extend the capabilities and knowledge base of the model itself. Given how rapidly developments in GenAI are occurring, both RAG and InstructLAB could end up being little more than blips along the arc of GenAI evolution, but it’s still great to see IBM pushing these kinds of technologies forward.
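The RAG pattern contrasted above can be sketched in a few lines: retrieve the most relevant local document for a query, then prepend it to the prompt handed to a generative model. The corpus below is invented for illustration, and bag-of-words cosine similarity stands in for the embedding models and vector stores a production system would use:

```python
# Minimal retrieval-augmented generation (RAG) sketch: per-query retrieval
# from a local corpus, then prompt assembly. Corpus contents are hypothetical.
from collections import Counter
import math

DOCS = [
    "Watsonx.governance tracks model lineage and monitors AI for drift.",
    "IBM Concert scans enterprise applications for risk and compliance issues.",
    "Granite code models are tuned for COBOL modernization tasks.",
]

def _vector(text: str) -> Counter:
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """Return the corpus document most similar to the query."""
    qv = _vector(query)
    return max(DOCS, key=lambda d: _cosine(qv, _vector(d)))

def build_prompt(query: str) -> str:
    """Prepend the retrieved context to the question before generation."""
    context = retrieve(query)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"
```

The key design point, and the contrast with InstructLAB drawn above, is that retrieval happens at query time and leaves the model's weights untouched, whereas fine-tuning with synthetic data bakes new knowledge into the model itself.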

On the application development front, IBM also unveiled what it’s calling deployable architectures. Put simply, these are like templates for the application building process that integrate prebuilt infrastructure elements and cloud connections along with a number of best practices that IBM’s consulting group has learned over the last few years. IBM refers to deployable architectures as composable infrastructure, and what they offer is a complete hybrid cloud environment that can be easily and quickly spun up as organizations begin development efforts for a given project. More importantly, they’re expected to be significant time savers and early results with customers are proving this out.

The last big piece of news from Think was the debut of IBM Concert, which the company describes as “the ‘nerve center’ of an enterprise’s technology and operations.” Focused on automation and powered by GenAI, Concert is designed to parse through an organization’s complete set of applications and look for potential problems or security issues, suggest fixes or enhancements, and more. Concert also integrates the knowledge of best practices that IBM has compiled and leverages connections to CI/CD pipelines, cloud-based data stores and other application management applications. The initial version of Concert will be focused on application risk and compliance management related issues.

While IBM unveiled a wide variety of tools at Think, they are all ultimately intended to help organizations ease the process of integrating GenAI into their operations. That may not be as dramatic as the launch of watsonx at last year’s event, but all these tools represent steps along the path toward real-world GenAI deployments.

Disclosure: TECHnalysis Research is a tech industry market research and consulting firm and, like all companies in that field, works with many technology vendors as clients, some of whom may be listed in this article.

Bob O'Donnell


Yearb Med Inform, v.9(1); 2014

IBM’s Health Analytics and Clinical Decision Support

1 Jointly Health (formerly IBM Research)

2 College of Computing, Georgia Institute of Technology, Atlanta, Georgia (formerly IBM Research)

3 IBM Almaden Research Center, San Jose, CA, USA

4 Records of Health (formerly IBM Research), Haifa, Israel

5 IBM Haifa Research Lab, Haifa, Israel

6 IBM Watson Research Center, Yorktown Heights NY, USA

T. Syeda-Mahmood

7 IBM Watson Solutions Development, Rochester MN, USA

This survey explores the role of big data and health analytics developed by IBM in supporting the transformation of healthcare by augmenting evidence-based decision-making.

Some problems in healthcare and strategies for change are described. It is argued that change requires better decisions, which, in turn, require better use of the many kinds of healthcare information. Analytic resources that address each of the information challenges are described. Examples of the role of each of the resources are given.

There are powerful analytic tools that utilize the various kinds of big data in healthcare to help clinicians make more personalized, evidence-based decisions. Such resources can extract relevant information and provide insights that clinicians can use to make evidence-supported decisions. There are early suggestions that these resources have clinical value. As with all analytic tools, they are limited by the amount and quality of data.

Big data is an inevitable part of the future of healthcare. There is a compelling need to manage and use big data to make better decisions to support the transformation of healthcare to the personalized, evidence-supported model of the future. Cognitive computing resources are necessary to manage the challenges in employing big data in healthcare. Such tools have been and are being developed. The analytic resources themselves do not drive healthcare transformation, but support it.


There are many challenges to healthcare. Improving healthcare, both in terms of quality and cost including the reduction of waste, is a global imperative. There are estimates that 21%-47% of what is spent on healthcare in the United States is for interventions of no value [ 1 ]. The percentages may be different in other countries, but the problem is global. One concept that can help address current limitations is called personalized or precision healthcare. We know that standard treatments for many diseases are not effective in all patients. Some patients receive no benefit from, and are possibly harmed by, routine interventions. The inability to identify patients who need an alternate treatment accounts for some of the poor clinical and economic results in the current healthcare model. With personalized healthcare, we learn enough about a patient, and relevant healthcare information, to help make choices that are more likely to benefit that patient. For example, if we can predict which diabetic or cancer patient needs a different therapy, then we may improve outcomes and save money by not employing an ineffective or potentially dangerous treatment. A concomitant of personalized healthcare is the need to make evidence-supported decisions. We will only transform healthcare if we can effectively use all the information available to us to make better decisions. Although personalized healthcare is often discussed in the context of genomics, the idea is more than 40 years old and much broader than just genomics [ 2 ]. Using existing information, whether it is recorded in a medical record, a research journal, or a gene sequence, is part of personalized healthcare.

The volume of healthcare data available is huge, varied, challenging to use, and as a consequence, described as “Big Data”. The data may be unstructured, sometimes called free-text or natural language, as in journal articles, textbooks, guidelines or the narrative parts in electronic health records (EHRs). Additional unstructured data includes stored images, such as x-rays or echocardiograms. There is also structured data, such as numerical entries in EHRs, genomic sequences and streaming data, such as physiologic monitoring in an intensive care unit (ICU). There is so much data, in so many forms, that individuals are only able to use small amounts of the data. One of the most challenging characteristics of big data is variability. Big data is marked by ambiguity, conflict, and inconsistency. Conventional programmatic computing, where a computer is programmed to process a known data set, is not adequate for managing the volume and inconsistency in big data. As much as 80% of the world’s data may be uncertain by 2015 [ 3 ]. Big data requires cognitive computing, using data-centric, probabilistic approaches to data, where, after a fashion, the computer “thinks.” Based on human reasoning, cognitive computing identifies complex associations, draws inferences, and learns from experience [ 4 ]. It is designed to navigate complex, dynamic, uncertain environments [ 5 ]. IBM has developed an array of cognitive analytic tools to gain insight from all types of healthcare information. We divide these resources into two broad categories. The first is knowledge-driven decision support, designed to gather insight from existing vetted knowledge, such as journal articles, textbooks, guidelines, or protocols. The second is data-driven decision support, which looks for patterns in real world, existing data, predominantly structured or image data, but could include text such as the narrative part of EHRs.
The two forms of decision support can overlap and augment each other, and some resources will involve a fusion of the two forms, in the effort to provide meaningful insight for decisions at the point of care.

This paper describes the importance of big data in healthcare. The authors seek to present examples of cognitive computing resources developed by IBM that can be used to analyze and draw inferences from the different kinds of data to help achieve evidence-supported decisions. The tools described here address specific areas of the use of data for decision support. They are not all encompassing, but are important in that they address currently existing sources of information applicable to healthcare. As more and different kinds of information become available, new tools will be necessary to incorporate them into decision support. Decision support is only one component of clinical informatics, which is, in turn, one component of the transformation of healthcare. Issues such as healthcare financial coverage, access, workflow, political obstacles, and methods to encourage clinicians and patients to focus on better outcomes also need to be addressed. For example, brain-mapping techniques have the potential to improve diagnosis and management of behavioral and neurologic diseases [ 6 ]. Organized medicine in the US has recognized the importance of clinical informatics by creating the medical subspecialty of Clinical Informatics in 2013, under the guidance of the American Board of Preventive Medicine.

These tools are developed for use by clinicians, but ideally in concert with patients. A goal for the future of healthcare is sometimes described as the empowered, knowledgeable patient. It is thought that such a patient, as an active participant in the decision making and care planning process, will also benefit from decision support tools. The patient will, hopefully, become a better manager of his or her health.

Natural Language Processing Analytics for Unstructured Data

Keeping up with the vast amount of literature published each year is a major challenge for clinicians. In the year 2010, the National Library of Medicine in the United States catalogued 699,000 new articles [ 7 ]. How much important published information was never used for clinical decisions? Most physicians have fewer than three to five hours a week to read, and usually only read from two or three journals [ 8 ]. Before the internet, collecting articles about a subject was a time consuming challenge. A physician had to go to the library and spend hours using an array of indices and bibliographies to find a handful of articles to read to study a subject in any depth. Today the problem is reversed. On-line search engines can overwhelm a reader with thousands of links or articles based on keywords entered. The challenge now is not amassing references, but processing, filtering, and analyzing potentially thousands of sources to find the really helpful insights. Watson is a resource that processes those thousands of sources.

Watson was originally developed to prove that a computer could understand natural language, the language of communication, generate and evaluate hypotheses, and adapt and learn with interaction, outcomes and new information. It demonstrated that ability by successfully playing the television quiz game “Jeopardy!” in early 2011. It uses what its inventors call massively parallel probabilistic algorithms, designed to analyze and understand the English language [ 9 ]. “Jeopardy!” was intentionally chosen as the arena in which to demonstrate Watson’s skill. If Watson could understand the arcane language used for the “Jeopardy!” clues and create an appropriate response, then its ability would be clearly demonstrated. Now, Watson for Healthcare is being developed to make it easier for clinicians to use material such as journal articles combined with historical clinical knowledge to achieve evidence-supported decisions. Clinicians are challenged by the overwhelming amount of published healthcare information. Healthcare professionals might like to read and remember more of the available literature in order to make better decisions. Watson can read and analyze concepts in millions of pages of medical information in seconds, identify information that could be relevant to a decision facing a clinician, and offer options for the decision maker to consider. Thus, Watson will act as the physician’s assistant, giving the provider a recall of the literature that she could not achieve herself, as well as help in analyzing it.

Watson reads and understands concepts in English. Currently, Watson is learning to help oncologists consider therapies for cancer patients. It can improve its performance through machine learning, a process by which Watson teaches itself. Watson is provided information from the patient’s EHR. Through training by expert clinicians, it identifies the critical attributes, and can then review relevant literature, including care guidelines. Watson processes all this information and then provides a ranked list of possible therapy options for the oncologist to consider with her patient. Watson is also being taught to consider patient preferences in evaluating options. Watson is not prescriptive. It provides a list of evidence-based hypotheses for the decision maker to consider, not a dictated prescription. It serves as decision support, not as a decision maker.

Watson is still learning and the technology behind Watson continues to evolve at a rapid pace. Architecturally, Watson is a pluggable solution that can be easily expanded with new or updated algorithms, as they are needed. Just like humans use multiple learned techniques to observe the world and solve problems, Watson is a collection of overlapping reasoning algorithms that address specific portions of the pipeline used for problem understanding and problem-solving.

For example, a specific instance of Watson might have specialized algorithms for understanding the question (Natural Language Processing, query expansion with synonyms, dictionaries, ontologies, language translation, speech translation, spelling correction), making hypotheses (indexing a corpus of data, searching for relevant passages, concept annotations, passage expansion, passage filtering, passage scoring), answer selection and scoring (deep parsing, semantic matching, answer similarity, lexical matching, temporal reasoning, geospatial reasoning, negation, knowledge graphs), machine learning (logistic regression, Bayesian networks, similarity learning), and dialoging with the user (resolving missing and conflicting information, disambiguation, providing suggestions, providing supporting evidence). Some of these algorithms may be generic enough for all applications, and others may be optimized for a specific domain or sub-specialty area.
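The pluggable-pipeline idea above can be sketched as a set of independent scorers whose evidence a final ranker combines. The two scorers and their weights below are illustrative stand-ins, not Watson's actual algorithms.

```python
def lexical_match(question, candidate):
    """Fraction of question words that appear in the candidate passage."""
    q = set(question.lower().split())
    c = set(candidate["passage"].lower().split())
    return len(q & c) / max(len(q), 1)

def source_reliability(question, candidate):
    """Prior weight attached to the candidate's source type (made-up priors)."""
    priors = {"guideline": 1.0, "journal": 0.8, "forum": 0.2}
    return priors.get(candidate["source"], 0.5)

# Each scorer is pluggable: new or updated algorithms can be added to
# this list without touching the rest of the pipeline.
SCORERS = [(lexical_match, 0.7), (source_reliability, 0.3)]

def rank(question, candidates):
    """Weighted combination of all scorers, highest total score first."""
    def total(c):
        return sum(w * f(question, c) for f, w in SCORERS)
    return sorted(candidates, key=total, reverse=True)
```

In the real system the combination weights are themselves learned (e.g., by logistic regression over scorer outputs), which is what lets the pipeline improve with feedback.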

As Watson is used in more and more engagements, these algorithms improve and their scope broadens, which will allow Watson to be used in a large variety of situations and a broad range of industries. The future is indeed bright for cognitive computing.

Watson accepts “questions” in natural language, so you don’t have to rely on expressing your question in structured data. In some cases, such as the oncology treatment solution, under development, the question is implied: “What are the recommended treatment options for this patient?” The keywords are the facts of the case that are extracted from the patient’s medical records. The clinician using the tool has the option of reviewing these keywords and making last minute changes that may not be reflected in the case – for example, the patient may no longer experience nausea. Armed with this information, Watson can use its trained models to weigh all these facts against numerous treatment options specified by national guidelines, insights from medical experts, and other medical information, and rank those options appropriately. Watson also serves as a discovery tool, by “showing its work”. Watson can show a user the documents (including guidelines, articles, text books, and other knowledge sources) it used to arrive at its hypotheses as well as the key supporting evidence or refuting evidence that was used to rank these hypotheses. Being able to see these details goes a long way toward understanding the rationale used and thus the options presented. Watson also learns and improves through training and repetition from clinician selections and responses. Just as it improved its skill at “Jeopardy!” by getting feedback about the usefulness of the hypotheses, adjusting its algorithms and rating its sources of information; Watson is similarly improving its ability to identify relevant treatment options for certain types of cancer.

Data-Driven Decision Support

Healthcare systems generate and store huge amounts of data. There is valuable information hidden in that data, hidden in patterns that cannot be readily recognized by the human eye and brain. Analyzing existing data, sometimes described as secondary use or re-use of data, actually creates new evidence. What important insights could be gleaned from the EHRs of a large population of patients that could be used to make better decisions for individual patients? Detecting novel correlations changes from a serendipitous event to organized discovery. What patterns could a computer detect in the massive stream of physiologic data in an ICU that would allow clinicians to identify serious problems earlier, when treatment could be simpler and more likely to be successful?

Efforts to develop computerized applications for clinical decision support (CDS) started decades ago, building on rule-based expert systems [ 10 ]. These efforts have not been very successful, mainly due to the difficulty in formulating predefined rules that faithfully and completely describe all possible care processes [ 11 ]. This task has become even more complex because of the enormous amount of new health data (e.g., genomics, sensors, imaging), much of which is of unknown significance or sufficiently ambiguous that it cannot be incorporated into authoritative clinical practice guidelines [ 12 ]. We describe a spectrum of cognitive computing tools to overcome these difficulties.

Patient Similarity Analytics

One of the limitations of published healthcare studies is that they often address one specific condition. Learning about patients with multiple chronic problems in a real world context can be difficult. Using existing data about other patients that are very similar to the patient provides useful information that can be leveraged for making better decisions. With the tremendous growth of the adoption of EHRs, various sources of information are becoming available about patients. A key challenge is to identify effective secondary uses of EHR data to help improve patient outcomes without generating additional burdens on physicians. Patient similarity derives a relationship measure between a pair of patients based on their EHR data. It enables case-based retrieval of patients similar to an index patient, treatment comparison among cohorts of patients similar to the index patient, and cohort comparison for comparative effectiveness research.

Deriving meaningful patient similarity measures requires integrating physician input. We created a suite of approaches to encode physician input as supervised information to guide the development of the similarity measure to address the following questions:

  • How to adjust the similarity measure according to physician feedback?
  • How to interactively update the existing similarity measure efficiently based on new feedback?
  • How to combine different similarity measures from multiple physicians?

First, physician feedback is used in locally supervised metric learning (LSML) [ 13 ] to define a generalized Mahalanobis measure that adjusts the distance measure among patients consistent with the clinical context. We construct two sets of neighborhoods for each training patient based on an initial distance measure. In particular, the homogeneous neighborhood of the index patient is the set of retrieved patients that are close in distance measure to the index patient and are also considered similar by the physician; the heterogeneous neighborhood of the index patient is the set of retrieved patients that are close in distance measure to the index patient but are considered not similar by the physician. Given these two definitions, both homogeneous (containing true positives) and heterogeneous (containing false positives) neighborhoods are constructed for all patients in the training data. Then we formulate an optimization problem that maximizes the homogeneous neighborhoods while minimizing the heterogeneous neighborhoods.
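The neighborhood construction step can be illustrated with a simple weighted (diagonal Mahalanobis-style) distance. The real LSML method learns a full generalized Mahalanobis matrix [ 13 ]; the weights and toy patients below are assumptions made up for demonstration.

```python
def weighted_dist(x, y, w):
    """Diagonal Mahalanobis-style distance: sum_i w_i * (x_i - y_i)^2."""
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))

def neighborhoods(index, patients, similar_ids, w, k=3):
    """Split the k nearest patients to the index patient into a
    homogeneous set (physician labeled similar) and a heterogeneous
    set (physician labeled not similar)."""
    others = [p for p in patients if p["id"] != index["id"]]
    others.sort(key=lambda p: weighted_dist(index["x"], p["x"], w))
    near = others[:k]
    homo = [p["id"] for p in near if p["id"] in similar_ids]
    hetero = [p["id"] for p in near if p["id"] not in similar_ids]
    return homo, hetero
```

The learning step then adjusts the weights so homogeneous neighbors move closer and heterogeneous neighbors move apart; this sketch only shows how the two neighborhoods are formed from feedback.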

Second, the interactive metric learning (IML) method incorporates additional feedback, incrementally adjusting the underlying distance metric based on the latest supervision information [ 14 ]. IML is designed to scale linearly with the data set size based on matrix perturbation theory, which allows the derivation of a sound theoretical foundation. Our empirical results demonstrate that IML outperforms the baseline by three orders of magnitude in speed while obtaining comparable accuracy on several benchmark datasets.

Third, to combine multiple similarity measures (one from each physician), we first construct discriminative neighborhoods from each individual metric, then combine them into a single optimal distance metric. We formulate this problem as a quadratic optimization problem and propose an efficient strategy to find the optimal solution [ 15 ]. Besides creating a globally consistent metric, this approach provides an elegant way to share knowledge across multiple experts (physicians) without sharing the underlying data, thus preserving privacy. Through our experiments on real claims datasets, we have shown improvement in classification accuracy as we incorporate feedback from multiple physicians.
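A drastically simplified sketch of the privacy property described above: the paper solves a quadratic program [ 15 ], but even a plain element-wise average of each physician's learned (diagonal) feature weights shows how metrics can be merged by exchanging only the learned weights, never the underlying patient data. The averaging rule is an assumption for illustration, not the paper's method.

```python
def combine_metrics(weight_vectors):
    """Element-wise mean of per-physician feature weight vectors.
    Each input vector is one physician's learned diagonal metric."""
    n = len(weight_vectors)
    return [sum(ws) / n for ws in zip(*weight_vectors)]
```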

All three techniques address different aspects of operationalizing patient similarity in clinical applications. Locally supervised metric learning can be used to define the distance metric in batch mode, where large amounts of evidence are first obtained to form the training data. The training data should consist of clinical features of patients, such as diagnoses, medications, lab results, demographics and vitals, and physician feedback about whether a pair of patients is similar or not. For example, one simple type of feedback is a binary indicator for each retrieved patient, where 1 means the retrieved patient is similar to the index patient and 0 means (s)he is not similar. Then the supervised similarity metric can be learned over the training data using the LSML algorithm. Finally, the learned similarity can be used in various applications for retrieving a cohort of patients similar to a target patient. The other techniques address related challenges of using a supervised metric, such as efficiently updating the learned similarity metric with new evidence and combining multiple physicians’ opinions. Obtaining high quality training data is very important but often challenging, since it typically imposes overhead on users, who are busy physicians. These learning techniques have the potential to minimize the physician burden.

We have conducted a preliminary evaluation of all the proposed methods using historical claims data consisting of 200,000 patients over three years from a group of primary care practices. Heart failure diagnosis codes assigned by physicians are considered as the supervision information, while all other information (e.g., other diagnosis codes) is used as input features. The goal is to learn a similarity that clusters heart failure patients more closely, while pushing other patients far away from heart failure patients. Classification performance based on the target diagnosis is used as the evaluation metric. Our initial results show significant improvements over many baseline distance metrics in all three settings [ 13 , 14 , 15 ].

Medical Sieve

Another big data challenge in medicine is the effort that is required to review, interpret, and extract the maximum relevant information from across the wide variety of healthcare data and use it for decision making. Electronic patient data is distributed across many enterprise systems in hospitals, and obtaining a holistic perspective of a patient’s condition is difficult and time consuming, particularly for those specialists who already look at a lot of patient imaging studies, such as radiologists. Statistics show that eye fatigue is a common problem for radiologists as they visually examine large numbers of images per day. An emergency room radiologist may look at as many as 200 cases a day, and some of these imaging studies, particularly lower-body CT angiographies, can contain as many as 3,000 images per study. Due to the volume overload and the limited amount of clinical information available as part of imaging studies, diagnostic errors, particularly related to coincidental diagnoses, can occur. With radiologists being a scarce resource in many countries, it will be even more important to reduce the volume of data necessary for clinicians, especially since it may have to be sent over low bandwidth tele-radiology networks.

IBM Research is developing a new radiologist cognitive assistant called the Medical Sieve, an image-guided informatics system that filters the essential clinical information that physicians need to know about a patient for diagnosis and treatment planning. The system gathers clinical data about the patient from a variety of enterprise systems in hospitals, including EHR, pharmacy, labs, the Admission-Discharge-Transfer system, and radiology/cardiology picture archiving and communication systems (PACS), using HL7 and DICOM adapters. It then uses sophisticated medical text and image processing, pattern recognition, and machine learning techniques, guided by advanced clinical knowledge, to process clinical data about the patient and extract meaningful summaries for detecting anomalies. In doing so, it exhibits a deep understanding of diseases and their interpretation in multiple modalities (X-ray, ultrasound, CT, MRI, PET, clinical text) covering various radiology and cardiology specialties. Finally, it creates advanced summaries of imaging studies capturing the salient anomalies detected in various viewpoints.

Medical Sieve algorithms were evaluated for anomaly detection accuracy in many diagnostic imaging modalities, in specialties ranging from cardiac to breast, neuro and musculoskeletal imaging. Specifically, we evaluated the accuracy of discrimination between normal and abnormal left ventricular shapes in a recent publication, in which the left ventricle was automatically located in 4-chamber views and was fitted with a prolate spheroidal model [ 16 ]. The ellipsoidal model was used to represent a normal left ventricular shape, and deviations from the fit were used as features for discrimination. The method was tested on a dataset of 340 patients and 2,158 echocardiographic sequences depicting a variety of cardiac diseases, including aneurysms (89), dilated cardiomyopathy (76), hypertrophies (78), and normal LV size and function (448). Of these, 503 sequences were 4-chamber views, including about 138 sequences labeled as normal LV size and function in their corresponding reports. To discriminate between normal and abnormal LV, we used 40% of the normal and abnormal LV cases for training and the remainder for testing. A total of 25,020 feature vectors were generated from these sequences, as they were of variable length in heart cycles and averaged about 64 images per sequence. We experimented with different kernels for learning with support vector machines, and the best classification performance was obtained with radial basis function kernels. The class was decided at the level of the echocardiographic sequence by taking the majority vote from the classification of parametric features of individual images of the sequence within cardiac cycles. The results are summarized in Table 1 .
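The sequence-level decision rule described above can be sketched directly: each image in a sequence receives a per-frame classification, and the sequence label is the majority vote over those frames. The per-frame labels here are placeholders for the SVM outputs.

```python
from collections import Counter

def classify_sequence(frame_labels):
    """Majority vote over per-frame labels (e.g. 'normal'/'abnormal')
    to produce a single label for the whole echo sequence."""
    counts = Counter(frame_labels)
    return counts.most_common(1)[0][0]
```

Voting over ~64 frames per sequence makes the decision robust to occasional per-frame misclassifications within a cardiac cycle.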

Accuracy of discrimination between normal and abnormal left ventricular shapes.

The quality of summaries generated by Medical Sieve was also evaluated in the domain of coronary angiography studies. In this case, the system prepares an automatic summary of an angiography study by retaining those images depicting the coronary arteries with good visibility as key images for interpretation. The vessel visibility is derived using a measure based on the automatic detection of arterial tubular structures in images. We conducted a study to compare the performance of automatic key frame selection with key frames manually identified by clinicians. This study was conducted over a database of 210 video sequences from 70 patients, representing a total of 5,250 images. The system was able to reduce the data browsing load by 95% while still selecting the relevant keyframes within 1-2 frames of the ones chosen by clinicians. Comparison with summaries generated using Frangi filtering of angiography images showed superior performance by a margin of 26% [ 17 , 18 ].


The explosion of genomic information has created further complications for decision makers. Whole genome sequencing means new opportunities to personalize healthcare. However, it also confronts us with the need to process huge amounts of data and to better understand the role of DNA and its segments.

New technology has led to dramatic reductions in the cost of sequencing a genome, even allowing for the sequencing of entire organism genomes. Since 2008, the cost of sequencing per megabase has fallen by 5 orders of magnitude and the cost of full genome sequencing by 4 orders of magnitude [ 19 ]. Today, the National Center for Biotechnology Information nucleotide database provides a vast and growing collection of sequences from GenBank® (the NIH-annotated collection of all publicly available DNA sequences), RefSeq (a comprehensive, integrated, non-redundant, annotated set of reference sequences including genomic, transcript, and protein), Third Party Annotation (submitter-provided annotation for sequence data derived from GenBank), and the Protein Data Bank (a repository for three-dimensional structural data of large biological molecules, such as proteins and nucleic acids), among other sources, including approximately 4,000 complete bacterial genomes and 40,000 viruses [ 20 ]. Microbes compose the largest fraction of species on Earth. They are integral to the health of humans and the environment. Microbial biodiversity affects the health of ecosystems (e.g. marine estuaries), keystone species (e.g. honeybees) and individual human patients. The growing library of reference genetic data affords the opportunity to identify microorganisms from metagenomic samples of amino acid, DNA, and RNA sequences, by comparing primary sequence data to online reference sources. Today this can be accomplished with widely available software such as the Basic Local Alignment Search Tool (BLAST) [ 21 ]. Genome alignment with tools such as BLAST is an efficient approach to characterizing transcript sequences of previously known and sequenced organisms. However, the method is disadvantaged by its inability to account for phenotypic variations such as alternative splicing [ 22 ].
To account for the detection of new organisms as well as natural variations within different phenotypes, a range of complementary big data analytic tools will be required, including de novo assemblers. A number of assemblers are available today for cases where it is necessary to create a transcriptome without the aid of a reference genome [ 22 , 23 , 24 ].
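BLAST-style alignment begins by finding short exact "word" matches (seeds) between a query and a reference and then extending them into local alignments. This toy sketch shows only the seeding step, with made-up sequences; real BLAST adds neighborhood words, scoring matrices, and gapped extension.

```python
def seeds(query, reference, k=4):
    """Return (query_pos, ref_pos) pairs where a k-mer of the query
    matches the reference exactly, via a k-mer index of the reference."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    hits = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], []):
            hits.append((j, i))
    return hits
```

Indexing the reference once makes seed lookup per query k-mer effectively constant time, which is what lets alignment scale to databases of thousands of genomes.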

We see vast amounts of genomic data on microbes being generated, and a scientific community struggling to manage their volume, variety, and velocity and to master the knowledge encoded within. New algorithms, computing methods, and data repositories are needed to “connect the dots” between genomic readouts of microbial diversity and personal/environmental health. To spread the benefit of metagenomic data analysis across disciplines, algorithms and data need to be available as computing software and services. Scientists are in the beginning stages of developing laboratory and computational procedures to apply metagenomic DNA sequencing for the benefit of healthcare, science, government, and industries that rely on traditional biological testing methods (healthcare, agriculture, environmental management) [ 24 , 25 , 26 ]. We plan to contribute to unlocking the potential of environmental sequencing through scientific discoveries in large-scale data repository management, large-scale bioinformatics analysis, and correlation of genomic signals with traditional sensor networks and other signals (e.g., public health reports), as well as through cloud-based software and analytic services.

Evidence-based Case Structuring - Evicase

Once evidence is developed, we need to be able to incorporate it into the decision process. Current methods, such as guidelines or protocols, are beneficial but have limitations. Clinical practice guidelines can be used to formulate the predefined rules for the reasoning engine of a clinical decision support system. Clinical practice guidelines are typically consensus- and evidence-based guidance for treating groups of patients defined by some fixed clinical criteria [ 27 ]. These guidelines are developed by medical specialties and sub-specialties (e.g., oncology or breast oncology), and are mostly based on a critical mass of controlled clinical trials and other comparative effectiveness studies that demonstrated value for specific interventions [ 28 ]. However, the efficacy and side effects of a new treatment cannot necessarily be generalized to different settings in the real clinical environment, as the studies are based on group outcomes and may not apply directly to an individual patient. The trials often do not account for differences among patients with the same disease, often due to lack of data, e.g., the personalized genetic make-up of a patient [ 29 ]. Therefore, the next generation of CDS should utilize the latest biomedical discoveries [ 30 ] to alleviate the translational barriers from bench to bedside, while feeding back lessons learned by its users. Recent advancements follow this direction; for example, CancerLinQ (developed by the American Society of Clinical Oncology) is a “learning health system” designed to draw insight from the vast, untapped pool of data on “real world” patients [ 31 ]. A prototype based on 170,000 de-identified medical records of breast cancer patients, provided by oncology practices around the United States, already allows data mining and visualization at the point of care for personalized CDS.

In order to improve the implementation of clinical guidelines for specific individuals (care processes), we need to refine established knowledge through data-driven insights by combining rule-based reasoning with case-based reasoning. To address this need, we suggested an evidence-based case structuring framework to generate multitiered statistical structures we call Evicase [ 32 ]. An Evicase object integrates established biomedical evidence with insights gained from analysis of patient cases in operational information systems of healthcare providers. Established evidence is refined through machine learning analysis of patient data, resulting in various means for clinicians to retrospectively analyze care processes and to prospectively answer questions regarding an individual patient.

In our implementation, the Evicase is a three-tiered structured object:

  • Tier-one: Knowledge encapsulation provides the guideline view for the specific patient’s presentation. The system’s knowledge management module interacts with relevant external resources and encapsulates clinical guidelines and evidence. We developed a set of ontologies, rules, and diffusion processes to effectively anchor clinical domain knowledge into the Evicase and to generate tier-one.
  • Tier-two: Retrospective analysis incorporates insights that are generated from retrospectively analyzing the organization’s patient records and from monitoring and assessing its care processes along the established clinical guidelines and best practices. In particular, such algorithms may suggest rule-based patient-similarity metrics according to the guideline’s fixed criteria, as well as statistically-based patient-similarity measures that are refined and adapted to the care organization’s data. Such analysis enables applications such as guidelines adherence, outcomes assessment, and cost optimization.
  • Tier-three: Prospective analysis applies statistical analysis and machine learning algorithms to the patient records available at the care organization in order to reveal prospective clinical insights. Analyzing patients’ clinical data in the context of similar patients may provide prospective outcome assessment, which in turn can be used for treatment recommendations leading to improved patient outcome.

Designed as a stand-alone multi-tiered structure (combining knowledge and data), Evicase can be used for a range of decision support applications including guideline adherence monitoring and personalized prognostic predictions.
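As a rough illustration of the tier-three reasoning, a nearest-neighbour sketch can estimate outcome probabilities from the most similar historical patients. The features, records, and similarity measure below are hypothetical stand-ins for the guideline-derived and statistically learned similarity measures described above, not the actual Evicase implementation.

```python
# Illustrative sketch of Evicase tier-three reasoning: estimate outcome
# probabilities for a new patient from the outcomes of the k most similar
# historical patients. Features and records are hypothetical.
import math
from collections import Counter

def similarity(a, b):
    """Inverse Euclidean distance over normalized feature vectors."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + dist)

def outcome_probabilities(new_patient, records, k=3):
    """records: list of (feature_vector, outcome_label) pairs."""
    ranked = sorted(records, key=lambda r: similarity(new_patient, r[0]), reverse=True)
    neighbours = [outcome for _, outcome in ranked[:k]]
    return {outcome: n / k for outcome, n in Counter(neighbours).items()}

# Hypothetical records: (age/100, tumor grade/3, clear margins) -> outcome
records = [
    ((0.45, 0.66, 1.0), "remission"),
    ((0.50, 0.66, 1.0), "remission"),
    ((0.72, 1.00, 0.0), "recurrence"),
    ((0.68, 1.00, 0.0), "recurrence"),
    ((0.40, 0.33, 1.0), "remission"),
]
print(outcome_probabilities((0.48, 0.66, 1.0), records))  # → {'remission': 1.0}
```

A real tier-three analysis would learn the similarity measure from the care organization’s own data rather than fix it a priori, which is precisely what distinguishes the statistically-based metrics of tier-two from the rule-based ones.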

For example, an Evicase for analyzing the treatment of Soft Tissue Sarcoma patients has been developed in collaboration with a cancer center in Italy and consists of three tiers: (1) clinical practice guidelines used in that cancer center classified to standard, individualized, experimental, or deviated; (2) retrospective analyses of clinical records in the cancer center, resulting in patient groups based on similarity according to local guidelines as well as actual outcomes, using machine learning techniques; (3) probability prediction of each outcome for different possible treatments based on the historical outcomes observed among the group of the most similar patients [ 33 ].

The Evicase framework is designed to help physicians make informed decisions when literature-based knowledge is insufficient and textbook guidelines are imprecise. As such, Evicase objects could help clinicians increase effectiveness of treatments through the optimization of care processes for specific patient populations. It can also be exchanged with other providers, allowing comparative effectiveness research as well as bringing new business potential in the form of an Evicase open market.

Finally, Evicase might also be used to generate decision support aids for patients, which could provide clinical benefits as well as cost reduction for the individual patient.

Streaming Analytics

Big data characteristics include velocity and complexity. In some environments, data arrives quickly and in large amounts that either cannot be adequately stored or should be analyzed in near real-time because of the immediate nature of clinical decisions.

ICUs are data-rich environments, where multiple streams of physiological data from sophisticated patient monitoring systems and ancillary devices are collected and interpreted by clinicians. While the outputs of these devices aim to improve patient care by signaling early warnings of complications, they also create an information overload problem. A 50-bed ICU may generate a quarter of a terabyte of data per month. Despite containing a wealth of information, only a small subset of these data points is currently exploited for the delivery of care in modern ICUs. The rest is simply dropped after a few days. As a result, intensive care is often provided reactively, in response to adverse events buried in these large volumes of data and typically detected after the emergence of clinical symptoms or after the interpretation of a clinical test. An opportunity for an earlier, simpler intervention that could avoid a serious problem may be lost.

The management and analysis of these data points is a big data challenge that has the potential to make critical care much more proactive. There are two classes of such problems that we have addressed in our research. For many patients, complications are presaged by signs buried in patient data streams, but with well understood patterns. These complications include hospital acquired infections, as well as respiratory, cardiac, and neurological events. One notable example of an early warning pattern is the reduction of heart rate variability that is known to be associated with early stages of sepsis [ 34 , 35 ]. For other complications, the specific signature of early signs in physiological data streams is unknown and subject to research. In this case, mining large historical patient-related data sets could lead to the discovery of new early detection patterns. Our research has led us to address both of these classes of problems.

A) Real-Time Analysis for ICU Patient Data Streams

The detection of known patterns in patient monitoring data for the early detection of complications in ICUs is a real-time analytical problem, requiring systems able to analyze, in a timely fashion, structured and unstructured data points produced at high rates. Just as a gold miner sets up filters on a river to extract gold nuggets, big data analysts use a stream computing paradigm to design filtering analytics able to extract nuggets of information from flows of patient data. At IBM Research, we have developed the Online Healthcare Analytics (OHA) infrastructure, also known as Artemis [ 34 ], a programmable framework for real-time analysis of intensive care sensor data that leverages IBM InfoSphere Streams (Streams), a real-time high-performance stream analysis engine. Streams provides a programming and runtime environment in which analytic developers within medical institutions can develop and deploy real-time analytics on large flows of structured and unstructured data. OHA also leverages various time series, machine learning, and data mining technologies, in the form of analytical toolkits, to facilitate the authoring of complex real-time applications. OHA interfaces Streams with an open set of data collection systems (e.g., the Excel Medical Electronics BedMasterEX system and the CapsuleTech data collection system). Although these tools are designed to be intuitive, they still require some training and commitment to use effectively.
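The filtering analytics described above can be illustrated with a minimal sliding-window sketch that flags sustained drops in heart rate variability, the sepsis early-warning pattern mentioned earlier. The window size, threshold, and simulated values are illustrative and not clinically validated; this Python generator stands in for what would be a Streams application in OHA.

```python
# Sketch of a stream-computing filter in the spirit of OHA/Streams:
# consume a stream of RR intervals (ms), maintain a sliding window,
# and flag low heart rate variability (falling standard deviation of
# RR intervals). Thresholds and data are illustrative only.
from collections import deque
import statistics

def hrv_alerts(rr_stream, window=10, threshold_ms=15.0):
    """Yield (index, sdnn) whenever windowed variability falls below threshold."""
    buffer = deque(maxlen=window)
    for i, rr in enumerate(rr_stream):
        buffer.append(rr)
        if len(buffer) == window:
            sdnn = statistics.pstdev(buffer)
            if sdnn < threshold_ms:
                yield i, round(sdnn, 1)

# Simulated stream: normal variability, then a flattening pattern.
stream = [800, 760, 830, 790, 850, 770, 810, 840, 780, 820] + [800] * 10
alerts = list(hrv_alerts(stream))
print(alerts[-1])  # → (19, 0.0)
```

Because the filter holds only a bounded window rather than the full history, it can run continuously over data volumes that could never be stored in full, which is the essence of the stream computing paradigm.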

While the successful use of custom real-time analytic solutions built for the monitoring of specific conditions has been well documented in the literature [ 35 , 36 ], the OHA platform differs significantly from these systems in its programmability and agility. With OHA, analysts within medical institutions are able to compose and deploy an open set of analytics tailored to their goals. As they discover new real-time analytics that they would like to deploy, analysts using OHA do not have to rebuild custom solutions to bring these analytics to the bedside. Instead, they can simply deploy them on the OHA platform. The extensibility of the OHA programming model facilitates the inclusion of analytics written in several common languages, ranging from high-level languages such as R and Matlab to lower-level languages like Python, or even Java, C++, and C.

Different versions of the OHA system are currently in use in live ICU environments under research agreements, in several types of intensive care ranging from neonatal ICUs [ 34 , 37 ] to neurological ICUs [ 38 ]. Different real-time analytics have been deployed on OHA, including heart rate variability analytics aiming at modeling the autonomic nervous system response as a way to detect early signs of inflammatory responses [ 37 ], seizure detection analytics on electroencephalograms [ 39 ] and analytics monitoring the intra-cranial pressure auto-regulation in neuro-ICU settings [ 38 ].

B) Mining Patient Monitoring Data for the Discovery of Early Detection Patterns

The discovery of new interesting patterns in patient monitoring data is intrinsically an “at rest” data analysis problem, requiring systems able to analyze large historical data sets. We have been using an array of offline analytical platforms such as Weka, R, and SPSS, and big data platforms like Hadoop, for the mining of large volumes of data. We have created applications of Weka analytics to build models able to predict secondary complications in neuro-ICUs [ 40 ]. Patient similarity concepts learned on historical data may allow physicians to make clinical decisions leveraging experience gathered from similar patients observed in the past [ 41 ]. An in-silico research study using physiological sensor data streams from 1,500 ICU patients obtained from PhysioNet [ 42 ] shows that these similarity constructs may be used to forecast the trajectory of blood pressure streams and help predict adverse ICU events such as acute hypotensive episodes.
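The similarity-based forecasting idea can be sketched as a nearest-trajectory average: find the historical segments closest to the current window and average their continuations. The series, window sizes, and distance measure below are synthetic and illustrative; the published study worked with real physiological archives and more sophisticated similarity constructs.

```python
# Sketch of similarity-based forecasting for a physiological stream:
# locate the historical segments most similar to the current window
# (Euclidean distance) and average their continuations. Synthetic data.
import math

def forecast(history, current, horizon=3, k=2):
    """Average the next `horizon` values following the k segments of
    `history` most similar to `current`."""
    w = len(current)
    candidates = []
    for i in range(len(history) - w - horizon + 1):
        segment = history[i:i + w]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(segment, current)))
        candidates.append((dist, history[i + w:i + w + horizon]))
    candidates.sort(key=lambda c: c[0])
    nearest = [continuation for _, continuation in candidates[:k]]
    return [round(sum(vals) / k, 1) for vals in zip(*nearest)]

# Synthetic mean arterial pressure series with a recurring downward drift.
history = [90, 88, 85, 80, 74, 90, 89, 86, 81, 75, 91, 88, 85, 80, 73]
print(forecast(history, current=[90, 88, 85], horizon=2))  # → [80.0, 73.5]
```

The sketch recognizes that the current window resembles earlier downward drifts and extrapolates accordingly, which is how a nearest-trajectory model could flag an impending hypotensive episode before it fully develops.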

One of the obstacles to achieving a personalized, evidence-supported future for healthcare is the effective use of the myriad and voluminous data that surrounds us. We have to be able to acquire and analyze huge amounts of often conflicting historical and research data and turn it into actionable information delivered to the decision makers. However, analytic tools are not a panacea for the problems in healthcare. They offer nothing in isolation, only in the context of a commitment to change. Healthcare professionals, patient advocacy groups, policy analysts and economists have all described various paths and challenges for the desired future. Information technology cannot drive change. Healthcare stakeholders must desire and plan for the transformation of healthcare. Once the strategies for transformation are developed, obstacles can be identified. Then, and only then, can technology be an enabler by helping overcome the obstacles.

An additional limitation to the role of analytics is the availability and quality of information. For example, Watson cannot process or use information that isn’t published or available. Publication bias, the tendency to publish studies that are positive or are statistically significant, is a recognized phenomenon [ 43 , 44 ]. Watson is limited by publication bias just as a clinician would be, but it may be able to mitigate publication bias because of the large volume of articles it can review. All of the tools described depend to some extent on machine learning. Machine learning works well when there is enough training data to cover all of the features used within the machine learning model. The challenge in almost every case is that there is less training data than we would like, so we have to compensate with other clustering and conditioning techniques (based on subject matter expert knowledge) to get to the level of accuracy and precision required in the medical domain.

The variability or uncertainty that is inherent in big data represents another limitation. Published articles can be contradictory or flawed. Data in EHRs can be inconsistent or erroneous. The need to compensate for data limitations is one of the reasons that all these tools thrive on more data. More data gives them more opportunity to identify and compensate for the flaws. The necessity of managing such conflicted and inconsistent data is what mandates cognitive computing.

Not all decision support requires big data, but big data techniques allow us to incorporate more information when it is helpful. Big data has inherent limitations. The process of looking for patterns in big data will yield a large number of statistical associations. However, many of them will be inconsequential, with no discernible causal relationship to the outcome being studied. The number of meaningful relationships may be orders of magnitude smaller. Evaluation and feedback from domain experts can help address this problem by helping to identify the meaningful relationships [ 45 ]. The hype surrounding big data, which creates unachievable expectations, is a problem in itself [ 46 ].
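The scale of this problem is easy to demonstrate: with enough purely random features, a fixed significance threshold guarantees a crop of spurious associations. The patient and feature counts and the correlation threshold below are illustrative; the point is that roughly 5% of meaningless features will "associate" with the outcome by chance alone.

```python
# Sketch of why pattern mining over big data yields many meaningless
# associations: screen 1,000 purely random features against a random
# outcome and count how many cross a nominal significance threshold.
import random

random.seed(7)
n_patients, n_features = 200, 1000
outcome = [random.random() for _ in range(n_patients)]

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# |r| > 0.14 approximates the p < 0.05 critical value for n = 200.
spurious = sum(
    1
    for _ in range(n_features)
    if abs(pearson([random.random() for _ in range(n_patients)], outcome)) > 0.14
)
print(f"{spurious} of {n_features} random features 'associate' with the outcome")
```

None of these associations reflect any causal relationship, which is why expert evaluation and feedback remain essential filters on mined patterns.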

Clinical decision support is only valuable if it is used. There are reports that physicians tend to disregard or not use decision support systems, perhaps from a failure of metacognition, the willingness to assess one’s thought process and assumptions [ 47 ]. We can only expect widespread use of decision support tools if they provide clear value. The evidence that decision support systems have improved outcomes is, at this point, limited [ 48 , 49 ]. Any new techniques need to prove their value in the clinical world.

We have described an array of analytic and clinical decision support tools IBM has designed to help enable evidence-supported decisions. We have shown that computer resources have been or are being developed to use the different kinds of healthcare information, big data, more effectively. Decision support is one component of the broad-based effort necessary to transform healthcare to improve outcomes and control costs.

IBM previews watsonx Code Assistant for Enterprise Java Applications

The generative AI-based code assistant is aimed at accelerating the Java application lifecycle with capabilities such as code generation, code explanation, and test generation.

Anirban Ghoshal

Senior Writer, InfoWorld


IBM has previewed its upcoming watsonx Code Assistant for Enterprise Java Applications at its annual Think conference.

The generative AI-based code assistant is aimed at accelerating the Java application lifecycle with capabilities such as code generation, code explanation, and test generation, underpinned by IBM’s open source Granite family of large language models (LLMs).

The capabilities of the new code assistant, targeted at improving developer productivity, include navigating complex code structures with the help of generative AI to summarize an application’s key functions, services, and dependencies.

The assistant can also describe the changes needed to upgrade, modernize or enhance an application, with a detailed assessment of complexity and required development effort, the company said.

Additionally, the code assistant can be used to implement code and configuration changes while also documenting them, it said.

Enterprises can also use the assistant to import existing unit tests easily and use generative AI to create new tests that help maintain critical application functions.

Last August, IBM expanded the capabilities of its code assistant to include COBOL code translation into Java in order to help IBM Z systems customers modernize their applications.

IBM has a separate watsonx Code Assistant for generating content for its Red Hat Ansible Automation Platform. This is expected to enable developers to write Ansible Playbooks with AI-generated recommendations, the company said.


Anirban Ghoshal is a senior writer covering enterprise software, databases, cloud, and AI for InfoWorld.

Copyright © 2024 IDG Communications, Inc.

ibm watson research articles


  1. IBM Watson: a milestone in artificial intelligence

  2. IBM Watson's Startling Cancer Coup

  3. Meet Watson from IBM (Conoce a Watson de la empresa IBM)

  4. IBM Watson Powers Virtual Assistant For Arthritis Research UK

  5. IBM Watson Analytics Delivers Deep Insights Guiding Innovation

  6. Unlock the untapped with IBM Watson Discovery


  1. Practical-II IBM Watson

  2. IBM watsonx at the 2024 Masters


  1. Review Article IBM Watson: How Cognitive Computing Can Be Applied to Big Data Challenges in Life Sciences Research

    Basic science, clinical research, and clinical practice generate big data. From the basic science of genetics, proteomics, and metabolomics to clinical research and real-world studies, these data can be used to support the discovery of novel therapeutics. 1, 2 This article reviews a cognitive technology called IBM Watson and describes early pilot projects.

  2. What Ever Happened to IBM's Watson?

    Mr. Ferrucci first pitched the idea of Watson to his bosses at IBM's research labs in 2006. He thought building a computer to tackle a question-answer game could push science ahead in the A.I ...

  3. "A Tool, Not a Crutch": Patient Perspectives About IBM Watson for

    IBM Watson for Oncology trained by Memorial Sloan Kettering (WFO; MSK) is a clinical decision support tool developed in collaboration between IBM and MSK to harness the power of cognitive technology to provide clinicians with informed treatment recommendations for patients with cancer. 1 WFO has been trained by MSK clinicians, using their ...

  4. A meta-analysis of Watson for Oncology in clinical application

    We systematically searched articles about the clinical applications of Watson for Oncology in the databases and conducted meta-analysis using RevMan 5.3 software. A total of 9 studies were ...

  5. Concordance Study Between IBM Watson for Oncology and Real Clinical

    The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. ... Concordance study between IBM Watson for oncology and clinical practice for patients with cancer in China. Oncologist 24 812-819. [PMC free article] ...

  6. PDF "The Rise, Fall, and Resurrection of IBM Watson Health"

    IBM did not stop there either. After Watson won Jeopardy! in 2011, IBM engaged in multiple explorations of how to best apply Watson. 10 (Figure 1 illustrates critical events of Watson and the interest of the public towards Watson over the period 2010 to 2020). IBM soon turned to healthcare as the best place to plant its flag. Figure 1.

  7. Practising Value Innovation through Artificial Intelligence: The IBM

    In this article, we investigate how the adoption of IBM Watson affects the practices of value innovation. As Mele, Colurcio, and Russo-Spena (2014) address, value innovation is 'the development of new competencies or a new combination of existing competencies for the provision of new or increased benefits to one or more parties' (p. 204). According to this understanding, actors engaged ...

  8. IBM Watson

    IBM Watson IBM Research started working on the grand challenge of building a computer system that could compete with champions at the game of Jeopardy!. Just four years later in 2011, the open-domain question-answering system dubbed Watson beat the two highest ranked players in a nationally televised two-game Jeopardy! match.

  9. MIT-IBM Watson AI Lab Releases Groundbreaking Research on AI and the

    ARMONK, N.Y., Oct. 30, 2019 /PRNewswire/ -- IBM (NYSE: IBM) believes 100% of jobs will eventually change due to artificial intelligence, and new empirical research released today from the MIT-IBM Watson AI Lab reveals how. The research, The Future of Work: How New Technologies Are Transforming Tasks, used advanced machine learning techniques to analyze 170 million online job postings in the ...

  10. How IBM Watson Overpromised and Underdelivered on AI Health Care

    But we're doing it," says John E. Kelly III, IBM senior vice president for cognitive solutions and IBM research. Kelly has guided the Watson effort since the Jeopardy! days, and in late 2018 ...

  11. IBM Launches New Watson Capabilities to Help Businesses Build

    Watson has evolved from an IBM Research project, to experimentation, to a scaled, open set of products that run anywhere. With more than 40,000 client engagements, Watson is being applied by leading global brands across a variety of industries to transform how people work.

  12. A Survey on IBM Watson and Its Services

    [1] Cecil R. R. and Soares J. 2019 Pharmaceutical Supply ChainsMedicines Shortages (Cham: Springer) IBM Watson studio: a platform to transform data to intelligence 183-192 Google Scholar [2] Ferrucci D. A. 2012 Introduction to "this is watson" IBM Journal of Research and Development 56 1-1 Google Scholar [3] High R. 2012 The era of cognitive systems: An inside look at IBM Watson and how it ...

  13. Using Artificial Intelligence (Watson for Oncology) for Treatment

    IBM's Watson for Oncology (WFO, IBM Corporation, United States), ... There is limited research on WFO [9,11-13]; we were the first to report on unsupported cases, and the sample size of this lung cancer study was the largest among all lung cancer studies performed in China. We not only reported the consistency of the recommendations from WFO ...

  14. Artificial Intelligence

    With over 3,000 researchers across the globe, IBM Research has a long pedigree of turning fundamental research into world-altering technology. Learn more about the ways that we collaborate with businesses and organizations across the globe to help solve their most pressing needs faster. MIT-IBM Watson AI Lab; AI Hardware Center

  15. Publications

    This is our catalog of publications authored by IBM researchers, in collaboration with the global research community. We're currently adding our back catalog of more than 110,000 publications. It's an ever-growing body of work that shows why IBM is one of the most important contributors to modern computing. Filter by.

  16. Introduction to "This is Watson" for IBM J. Res. Dev

    Abstract. In 2007, IBM Research took on the grand challenge of building a computer system that could compete with champions at the game of Jeopardy!™. In 2011, the open-domain question-answering (QA) system, dubbed Watson, beat the two highest ranked players in a nationally televised two-game Jeopardy! match. This paper provides a brief ...

  17. How IBM's Watson went from the future of health care to sold off for parts

    To do that, IBM needed massive amounts of data on which to train Watson. It got that data through acquisitions, eventually spending some $5 billion buying a series of health data companies.

  18. "A Tool, Not a Crutch": Patient Perspectives About IBM Watson for

    PURPOSE: IBM Watson for Oncology trained by Memorial Sloan Kettering (WFO) is a clinical decision support tool designed to assist physicians in choosing therapies for patients with cancer. Although substantial technical and clinical expertise has guided the development of WFO, patients' perspectives of this technology have not been examined. To facilitate the optimal delivery and ...

  19. IBM and Pfizer to Accelerate Immuno-oncology Research with Watson for

    ARMONK, N.Y. and NEW YORK, Dec. 1, 2016 /PRNewswire/ -- IBM (NYSE: IBM) Watson Health and Pfizer Inc. (NYSE: PFE) today announced a collaboration that will utilize IBM Watson for Drug Discovery to help accelerate Pfizer's research in immuno-oncology, an approach to cancer treatment that uses the body's immune system to help fight cancer.

  20. IBM and MIT to pursue joint research in artificial intelligence

    IBM and MIT today announced that IBM plans to make a 10-year, $240 million investment to create the MIT-IBM Watson AI Lab in partnership with MIT. The lab will carry out fundamental artificial intelligence (AI) research and seek to propel scientific breakthroughs that unlock the potential of AI.

  21. IBM Watson

    IBM Watson is a computer system capable of answering questions posed in natural language. It was developed as a part of IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's founder and first CEO, industrialist Thomas J. Watson. The computer system was initially developed to answer questions on the popular quiz show Jeopardy!

  22. Introduction to "This is Watson"

    In 2007, IBM Research took on the grand challenge of building a computer system that could compete with champions at the game of Jeopardy!™. In 2011, the open-domain question-answering (QA) system, dubbed Watson, beat the two highest ranked players in a nationally televised two-game Jeopardy! match. This paper provides a brief history of the events and ideas that positioned our team to take ...

  23. IBM Extends Reach Of Watsonx With Products And Partnerships

    In addition to IBM's own models, the company also announced partnerships with AWS, Meta, Adobe, Microsoft, Mistral, Salesforce, and SAP among others to integrate and or co-develop new models for ...

  24. IBM's Health Analytics and Clinical Decision Support

    Watson was originally developed to prove that a computer could understand natural language, the language of communication, generate and evaluate hypotheses, and adapt and learn with interaction, outcomes and new information. ... IBM Research is developing a new radiologist cognitive assistant called the Medical Sieve, which is an image-guided ...

  25. IBM Expands watsonx Portfolio on AWS, Adds watsonx.governance to Help

    IBM Consulting is further expanding its generative AI expertise focused on AWS to help joint customers operationalize responsible AI. With the integration of watsonx.governance and Amazon ...

  26. IBM Expands watsonx Portfolio on AWS, Adds watsonx.governance to Help

    The companies plan to integrate IBM watsonx.governance and Amazon SageMaker-- a service to build, train, and deploy machine learning (ML) and generative AI models with fully managed infrastructure, tools, and workflows -- to help Amazon SageMaker and watsonx customers manage model risk and support compliance obligations in connection with recent regulatory requirements such as the EU AI Act.

  27. Think 2024 On Demand

    Think 2024 Keynote Watch now. Unlock the power of open AI. Welcome to the age of value creation with AI. But a technology is only as valuable as the ecosystem it enables. Enterprises seeking to incorporate generative AI into their workflows face challenges when it comes to inferencing costs, trustworthiness of AI, energy efficiency, portability ...

  28. What is Natural Language Processing? Definition and Examples

    Natural language processing (NLP) is a subset of artificial intelligence, computer science, and linguistics focused on making human communication, such as speech and text, comprehensible to computers. NLP is used in a wide variety of everyday products and services. Some of the most common ways NLP is used are through voice-activated digital ...

  29. IBM previews watsonx Code Assistant for Enterprise Java ...

    IBM has a separate watsonx Code Assistant for generating content for its Red Hat Ansible Automation Platform. This is expected to enable developers to write Ansible Playbooks with AI-generated ...

  30. Digital health: a case of mistaken identity

    So did IBM, with Dr. Watson ... The International Society for Pharmacoeconomics and Outcomes Research is a global nonprofit dedicated to advancing the understanding of health economic outcomes.