DOI: 10.1145/41526.41543

The background of INTERNIST I and QMR

Published: 01 December 1987

ABSTRACT

During my tenure as Chairman of the Department of Medicine at the University of Pittsburgh, 1955 to 1970, two points became clear in regard to diagnosis in internal medicine. The first was that the knowledge base in that field had become vastly too large for any single person to encompass it. The second point was that the busy practitioner, even though he knew the items of information pertinent to his patient's correct diagnosis, often did not consider the right answer, particularly if the diagnosis was an unusual disease.

I resigned the position of Chairman in 1970 intending to resume my position as Professor of Medicine. However, the University saw fit to offer me the appointment as University Professor (Medicine). The University of Pittsburgh follows the practice of Harvard University, established by President James Bryant Conant in the late 1930s, in which a University Professor is a professor at large and reports only to the president of the university. He has no department, no school and is not under administrative supervision by a dean or vice-president. Thus the position allows maximal academic freedom.

In this new position I felt strongly that I should conduct worthwhile research. It was almost fifteen years since I had worked in my chosen field of clinical investigation, namely splanchnic blood flow and metabolism, and I felt that research in that area had passed me by. Remembering the two points mentioned earlier — the excessive knowledge base of internal medicine and the problem of considering the correct diagnosis — I asked myself what could be done to correct these problems. It seemed that the computer with its huge memory could correct the first and I wondered if it could not help as well with the second.

At that point I knew no more about computers than the average layman so I sought advice. Dr. Gerhard Werner, our Chairman of Pharmacology, was working with computers in an attempt to map all of the neurological centers of the human brain stem with particular reference to their interconnections and functions. He was particularly concerned about the actions of pharmacological agents on this complex system. Working with him on this problem was Dr. Harry Pople, a computer scientist with special interest in “artificial intelligence”. The problem chosen was so complex and difficult that Werner and Pople were making little progress.

Gerhard listened patiently to my ideas and promptly stated that he thought the projects were feasible utilizing the computer. In regard to the diagnostic component of my ambition he strongly advised that “artificial intelligence” be used. Pople was brought into the discussion and was greatly interested, I believe because of the feasibility of the project and the recognition of its practical application to the practice of medicine.

The upshot was that Pople joined me in my project and Werner and Pople abandoned the work on the brain stem. Pople knew nothing about medicine and I knew nothing about computer science. Thus the first step in our collaboration was my analysis for Pople of the diagnostic process. I chose a goodly number of actual cases from clinical pathological conferences (CPCs) because they contained ample clinical data and because the correct diagnoses were known. At each small step of the way through the diagnostic process I was required to explain what the clinical information meant in context and my reasons for considering certain diagnoses. This gave Pople insight into the diagnostic process. After analyzing dozens of such cases I felt as though I had undergone a sort of “psychoanalysis”.

From this experience Pople wrote the first computer diagnostic programs seeking to emulate my diagnostic process. This has led certain “wags” to nickname our project “Jack in the box”. For this initial attempt Pople used the LISP computer language. We were granted access to the PROPHET PDP-10, a time-sharing mainframe maintained in Boston by the National Institutes of Health (NIH) but devoted particularly to pharmacological research. Thus we were interlopers.

The first name we applied to our project was DIALOG, for diagnostic logic, but this had to be dropped because the name was in conflict with a computer program already on the market and copyrighted. The next name chosen was INTERNIST, for obvious reasons. However, the American Society for Internal Medicine publishes a journal entitled “The Internist” and they objected to our use of INTERNIST, although there seems to be little relationship or conflict between a printed journal and a computer software program. Rather than fight the issue we simply added the Roman numeral one to our title, which then became INTERNIST-I and continues to this day.

Pople's initial effort was unsuccessful, however. He had diligently incorporated details regarding anatomy and much basic pathophysiology, I believe because in my initial CPC analyses I had brought into consideration such items of information so that Pople could understand how I got from A to B etc. The diagnostician in internal medicine knows, of course, much anatomy and pathophysiology but these are brought into consideration in only a minority of diagnostic problems. He knows, for example, that the liver is in the right upper quadrant and just beneath the right leaf of the diaphragm. In most diagnostic instances this information is “subconscious”.

Our first computer diagnostic program included too many such details and as a result was very slow and frequently got into analytical “loops” from which it could not extricate itself. We decided that we had to simplify the program, but by that juncture much of 1971 had passed.

The new program was INTERNIST-I and even today most of the basic structure devised in 1972 remains intact. INTERNIST-I is written in INTERLISP and has operated on the PDP-10 and the DEC 2060. It has also been adapted to the VAX 780. Certain younger people have contributed significantly to the program, particularly Dr. Zachary Moraitis and Dr. Randolph Miller. The latter interrupted his regular medical school education to spend the year 1974-75 as a fellow in our laboratory and since finishing his formal medical education in 1979 has been active as a full time faculty member of the team. Several Ph.D. candidates in computer science have also made significant contributions as have dozens of medical students during electives on the project.

INTERNIST-I is really quite a simple system as far as its operating system or inference engine is concerned. Three basic numbers are manipulated in the ranking of elicited disease hypotheses. The first of these is the importance (IMPORT) of each of the more than 4,100 manifestations of disease which are contained in the knowledge base. IMPORTS are a global representation of the clinical importance of a given finding graded from 1 to 5, the latter being maximal, focusing on how necessary it is to explain the manifestation regardless of the final diagnosis. Thus massive splenomegaly has an IMPORT of 5 whereas anorexia has an IMPORT of 1. Mathematical weights are assigned to IMPORT numbers on a non-linear scale.

The second basic number is the evoking strength (EVOKS), the numbers ranging from 0 to 5. The number answers the question: given a particular manifestation of disease, how strongly does one consider disease A versus all other diagnostic possibilities in a clinical situation? A zero indicates that a particular clinical manifestation is non-specific, i.e. so widely spread among diseases that the above question cannot be answered usefully. Again, anorexia is a good example of a non-specific manifestation. The EVOKS number 5, on the other hand, indicates that a manifestation is essentially pathognomonic for a particular disease.

The third basic number is the frequency (FREQ), which answers the question: given a particular disease, what is the frequency or incidence of occurrence of a particular clinical finding? FREQ numbers range from 1 to 5, 1 indicating that the finding is rare or unusual in the disease and 5 indicating that the finding is present in essentially all instances of the disease.

Each diagnosis which is evoked is ranked mathematically on the basis of support for it, both positive and negative. Like the IMPORT numbers, the values for EVOKS and FREQ numbers increase in a non-linear fashion. The establishment or conclusion of a diagnosis is not based on any absolute score, as in Bayesian systems, but on how much better the support for diagnosis A is than that for its nearest competitor. This difference is anchored to the value of an EVOKS of 5, a pathognomonic finding.
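
The ranking and conclusion rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the INTERLISP implementation: the weight tables, the `Disease` class, the `score` and `conclude` functions, and the two hypothetical diseases are all assumptions made for the example. The text states only that the three scales are non-linear, that support is both positive and negative, and that the required lead is anchored to the value of an EVOKS of 5.

```python
from dataclasses import dataclass, field

# Assumed non-linear weight tables. The article says only that the scales are
# non-linear, so these particular values are illustrative.
IMPORT_WEIGHT = {1: 2, 2: 6, 3: 10, 4: 20, 5: 40}
EVOKS_WEIGHT = {0: 1, 1: 4, 2: 10, 3: 20, 4: 40, 5: 80}
FREQ_WEIGHT = {1: 1, 2: 4, 3: 7, 4: 15, 5: 30}


@dataclass
class Disease:
    name: str
    # manifestation name -> (EVOKS 0..5, FREQ 1..5)
    profile: dict = field(default_factory=dict)


def score(disease, present, absent, imports):
    """Net support for one hypothesis (assumed scheme, see lead-in)."""
    total = 0
    for finding in present:
        if finding in disease.profile:
            evoks, _freq = disease.profile[finding]
            total += EVOKS_WEIGHT[evoks]  # observed finding the disease explains
        else:
            total -= IMPORT_WEIGHT[imports[finding]]  # observed but unexplained
    for finding in absent:
        if finding in disease.profile:
            _evoks, freq = disease.profile[finding]
            total -= FREQ_WEIGHT[freq]  # expected finding that is missing
    return total


def conclude(diseases, present, absent, imports):
    """Return the leader only if it beats its nearest competitor by at least
    the weight of one pathognomonic (EVOKS 5) finding; otherwise None."""
    scored = sorted(
        ((score(d, present, absent, imports), d) for d in diseases),
        key=lambda pair: pair[0],
        reverse=True,
    )
    if not scored:
        return None
    runner_up = scored[1][0] if len(scored) > 1 else 0
    lead = scored[0][0] - runner_up
    return scored[0][1] if lead >= EVOKS_WEIGHT[5] else None


# IMPORT values for these two findings come from the text; the disease
# profiles (EVOKS, FREQ pairs) are invented for illustration.
IMPORTS = {"massive splenomegaly": 5, "anorexia": 1}
disease_a = Disease("hypothetical disease A",
                    {"massive splenomegaly": (5, 4), "anorexia": (0, 3)})
disease_b = Disease("hypothetical disease B", {"anorexia": (0, 2)})
```

With both findings present, disease A leads by more than the EVOKS 5 weight and is concluded. With anorexia alone, the two hypotheses tie and nothing is concluded, which mirrors the point that a non-specific finding cannot separate competitors.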

When the list of evoked diagnoses is ranked mathematically on the basis of EVOKS, FREQ and IMPORT, the list is partitioned based upon the similarity of support for individual diagnoses. Thus a heart disease is compared with other heart diseases and not brain diseases since the patient may have a heart disorder and a brain disease concomitantly. Thus apples are compared with apples and not oranges.
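
One way to picture this partitioning is sketched below. The rule used is an assumption, since the text does not spell out the test: two hypotheses are treated as competitors when the observed findings explained by one are contained in those explained by the other, so that taking both together explains nothing more than one alone. The function name `competitors` and the placeholder disease and finding labels are hypothetical.

```python
def competitors(top, others, explained):
    """Return the hypotheses that compete with `top`.

    explained maps each hypothesis name to the set of observed findings it
    accounts for. Two hypotheses compete when one set contains the other;
    otherwise they may coexist and are judged in separate groups.
    (Assumed rule, used here only for illustration.)
    """
    group = [top]
    for name in others:
        a, b = explained[top], explained[name]
        if a >= b or b >= a:  # one set of explained findings contains the other
            group.append(name)
    return group


# Placeholder labels, not real knowledge-base entries.
explained = {
    "heart disease 1": {"cardiac finding 1", "cardiac finding 2"},
    "heart disease 2": {"cardiac finding 1"},
    "brain disease 1": {"neurologic finding 1"},
}
```

Here the two heart diseases fall into one group and the brain disease is left for a separate comparison, so a concomitant heart disorder and brain disease can both be concluded.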

When a diagnosis is concluded, the computer consults a list of interrelationships among diseases (LINKS) and bonuses are awarded, again in a non-linear fashion, for numbers ranging from 1 to 5, with 1 indicating a weak interrelationship and 5 a universal interrelationship. Thus multiple interrelated diagnoses are preferred over independent ones provided the support for the second and other diagnoses is adequate. Good clinicians use this same rule of thumb. LINKS are of various types: PCED (disease A precedes disease B, e.g. acute rheumatic fever precedes early rheumatic valvular disease); PDIS (disease A predisposes to disease B, e.g. AIDS predisposes to pneumocystis pneumonia); CAUS (disease A causes disease B, e.g. thrombophlebitis of the lower extremities may cause pulmonary embolism); and COIN (there is a statistical interrelationship between disease A and disease B but scientific medical information is not explicit on the relationship, e.g. Hashimoto's thyroiditis coincides with pernicious anemia, both so-called autoimmune diseases).
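
The bonus step can be sketched as follows. The link types and disease pairs are the examples given in the text; the strength assigned to each link, the `LINK_BONUS` weights, the function `apply_link_bonuses`, and the choice to apply a link in either direction are assumptions made for illustration.

```python
# Assumed non-linear bonus per LINK strength (1 = weak, 5 = universal).
LINK_BONUS = {1: 2, 2: 5, 3: 10, 4: 20, 5: 40}

# (disease A, link type, disease B, strength). Pairs and types follow the
# examples in the text; the strengths are invented.
LINKS = [
    ("acute rheumatic fever", "PCED", "early rheumatic valvular disease", 4),
    ("AIDS", "PDIS", "pneumocystis pneumonia", 4),
    ("thrombophlebitis of the lower extremities", "CAUS", "pulmonary embolism", 3),
    ("Hashimoto's thyroiditis", "COIN", "pernicious anemia", 2),
]


def apply_link_bonuses(scores, concluded, links=LINKS):
    """Return a copy of `scores` in which every remaining hypothesis linked to
    the concluded diagnosis has received its bonus."""
    adjusted = dict(scores)
    for a, _kind, b, strength in links:
        if a == concluded and b in adjusted:
            adjusted[b] += LINK_BONUS[strength]
        elif b == concluded and a in adjusted:
            adjusted[a] += LINK_BONUS[strength]
    return adjusted


remaining = {"pulmonary embolism": 30, "pneumocystis pneumonia": 35}
after = apply_link_bonuses(remaining, "thrombophlebitis of the lower extremities")
```

In this example pulmonary embolism overtakes the unrelated hypothesis once thrombophlebitis has been concluded, which is the preference for interrelated diagnoses that the text describes.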

The maximal number of correct diagnoses made in a single case analysis is, to my recollection, eleven.

In working with INTERNIST-I during the remainder of the 1970s several important points about the system were learned or appreciated.

The first and foremost of these is the importance of a complete and accurate knowledge base. Omissions from a disease profile can be particularly troublesome. If a manifestation of disease is not listed on a disease profile the computer can only conclude that that manifestation does not occur in the disease, and if a patient demonstrates the particular manifestation it counts against the diagnosis. Fortunately, repeated exercise of the diagnostic system brings to attention many inadvertent omissions. It is important to establish the EVOKS and FREQ numbers as accurately as possible. Continual updating of the knowledge base, including newly described diseases and new information about diseases previously profiled, is critical. Dr. Edward Feigenbaum recognized the importance of the accuracy and completeness of knowledge bases as the prime requisite of expert systems of any sort. He emphasized this point in his keynote address to MEDINFO-86 (1).

Standardized, clear and explicit nomenclature is required in expressing disease names and particularly in naming the thousands of individual manifestations of disease. Such rigidity can make the use of INTERNIST-I difficult for the uninitiated user. Therefore, in QMR more latitude and guidance are provided to the user. For example, the user of INTERNIST-I must enter ABDOMEN PAIN RIGHT UPPER QUADRANT exactly, whereas in QMR the user may enter PAI ABD RUQ and the system recognizes the term as above.
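
How QMR resolves such abbreviations is not described here. The sketch below shows one plausible scheme, assumed for illustration only: each token the user types must either be a prefix of some word in the term or spell the initials of consecutive words. The function names and every vocabulary entry other than ABDOMEN PAIN RIGHT UPPER QUADRANT are hypothetical.

```python
def token_matches(token, words):
    """True if `token` is a prefix of a word, or the initials of consecutive
    words, in the list `words`."""
    token = token.upper()
    if any(word.startswith(token) for word in words):
        return True
    n = len(token)
    for i in range(len(words) - n + 1):
        if "".join(word[0] for word in words[i:i + n]) == token:
            return True
    return False


def lookup(query, vocabulary):
    """Return every vocabulary term matched by all tokens of the query,
    regardless of the order in which the tokens were typed."""
    tokens = query.split()
    if not tokens:
        return []
    return [
        term for term in vocabulary
        if all(token_matches(token, term.split()) for token in tokens)
    ]


# Only the first term appears in the text; the others are made up.
VOCABULARY = [
    "ABDOMEN PAIN RIGHT UPPER QUADRANT",
    "ABDOMEN PAIN LEFT LOWER QUADRANT",
    "CHEST PAIN SUBSTERNAL",
]
```

Under these assumptions `lookup("PAI ABD RUQ", VOCABULARY)` returns only the right upper quadrant term, while a shorter query such as `PAI ABD` returns both abdominal terms and would call for further guidance to the user.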

The importance of “properties” attached to the great majority of clinical manifestations was solidly evident. Properties express conditions such as: if A is true, then B is automatically false (or true, as the case may be). The properties also allow credit to be awarded for or against B accordingly. Properties also provide order to the asking of questions in the interrogative mode. They also state prerequisites and unrequisites for various procedures. As examples, one generally does not perform a superficial lymph node biopsy unless lymph nodes are enlarged (prerequisite). Similarly, a percutaneous liver biopsy is inadvisable if the blood platelets are less than 50,000 (unrequisite).
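
The prerequisite and unrequisite examples can be written down as a small data structure. This is a sketch under assumptions: the `Procedure` class, the `may_ask_for` function, and the patient record keys are hypothetical names, and only the two clinical rules themselves come from the text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Procedure:
    name: str
    # Conditions that must hold before the procedure is considered.
    prerequisites: List[Callable[[dict], bool]] = field(default_factory=list)
    # Conditions that make the procedure inadvisable.
    unrequisites: List[Callable[[dict], bool]] = field(default_factory=list)


def may_ask_for(procedure, patient):
    """True when every prerequisite holds and no unrequisite holds."""
    return (
        all(check(patient) for check in procedure.prerequisites)
        and not any(check(patient) for check in procedure.unrequisites)
    )


node_biopsy = Procedure(
    "superficial lymph node biopsy",
    prerequisites=[lambda p: p["lymph_nodes_enlarged"]],
)
liver_biopsy = Procedure(
    "percutaneous liver biopsy",
    unrequisites=[lambda p: p["platelets"] < 50_000],
)

patient = {"lymph_nodes_enlarged": False, "platelets": 40_000}
```

For this patient neither procedure would be requested: the nodes are not enlarged, and the platelet count is below 50,000. In the interrogative mode a check of this kind would suppress the corresponding question.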

It became clear quite early in the utilization of INTERNIST-I that systemic or multisystem diseases had an advantage versus localized disorders in diagnosis. This is because systemic diseases have very long and more inclusive manifestation lists. It became necessary, therefore, to subdivide systemic diseases into various components when appropriate. Systemic lupus erythematosus provides a good example. Lupus nephritis must be compared in our system with other renal diseases and such comparison is allowed by our partitioning algorithm. Likewise, cerebral lupus must be differentiated from other central nervous system disorders. Furthermore, either renal lupus or cerebral lupus can occur at times without significant clinical evidence of other systemic involvement. In order to reassemble the components of a systemic disease we devised the systemic LINK (SYST) which expresses the interrelationship of each subcomponent to the parent systemic disease.
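
The reassembly step can be pictured with a simple mapping from each subcomponent to its parent disease. The lupus entries follow the example above; the `SYST` table layout and the `reassemble` function are assumptions made for this sketch.

```python
# Subcomponent -> parent systemic disease, following the lupus example.
SYST = {
    "lupus nephritis": "systemic lupus erythematosus",
    "cerebral lupus": "systemic lupus erythematosus",
}


def reassemble(concluded):
    """Group concluded diagnoses under their parent systemic disease.

    Diagnoses without a SYST link are left out of the result."""
    parents = {}
    for diagnosis in concluded:
        parent = SYST.get(diagnosis)
        if parent is not None:
            parents.setdefault(parent, []).append(diagnosis)
    return parents


grouped = reassemble(["lupus nephritis", "pulmonary embolism", "cerebral lupus"])
```

Each component is still compared only with other disorders of its own organ system, and the grouping is applied afterwards, once the components have been concluded.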

It became apparent quite early that expert systems like INTERNIST do not deal with the time axis of a disease well at all, and this seems to be generally true of expert systems in “artificial intelligence”. Certain parameters dealing with time can be expressed by devising particular manifestations, e.g. a blood transfusion preceding the development of acute hepatitis B by 2 to 6 months. But time remains a problem which is yet to be solved satisfactorily, including in QMR.

It has been clearly apparent over the years that both the knowledge base and the diagnostic consultant programs of both INTERNIST-I and QMR have considerable educational value. The disease profiles, the list of diseases in which a given clinical manifestation occurs (ordered by EVOKS and FREQ), and the interconnections among diseases (LINKS) provide a quick and ready means of acquiring at least orienting clinical information. Such has proved useful not only to medical students and residents but to clinical practitioners as well. In the interrogative mode of the diagnostic systems the student will frequently ask “Why was that question asked?” Either an instructor can provide insight, or ready consultation of the knowledge base by the student will provide a simple semi-quantitative reason for the question.

Lastly, let me state that working with INTERNIST-I and QMR over the years seems to have had real influence on my own diagnostic approaches and habits. Thus my original “psychoanalysis” when working with Pople has been reinforced.

References

  1. Feigenbaum EA. Autoknowledge: From File Servers to Knowledge Servers. In: MEDINFO-86, R. Salamon, B. Blum, M. Jørgensen (Eds.). Elsevier Science Publishers B.V. (North Holland), pp. XLIII-XLVI, 1986.
  2. Miller RA, Pople HE Jr, Myers JD. INTERNIST-I, An Experimental Computer-based Diagnostic Consultant for General Internal Medicine. N Engl J Med. 307:468-476, 1982.
  3. Masarie FE Jr, Miller RA, Myers JD. INTERNIST-I PROPERTIES: Representing Common Sense and Good Medical Practice in a Computerized Medical Knowledge Base. Comp Bio Res. 18:458-479, 1985.
  4. Miller RA, McNeil MA, Challinor S, Masarie FE Jr, Myers JD. Status Report: The INTERNIST-I/Quick Medical Reference Project. West J Med. 145:816-822, 1986.

Published in HMI '87: Proceedings of ACM conference on History of medical informatics, December 1987, 206 pages. ISBN: 0897912489. DOI: 10.1145/41526

Copyright © 1987 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States
