<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Recap of Early Work on Theory and Knowledge Refinement</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Raymond J. Mooney</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jude W. Shavlik</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Computer Science, University of Texas at Austin</institution>
          ,
          <addr-line>2317 Speedway, Stop D9500 Austin, Texas 78712-1757</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Computer Science, University of Wisconsin - Madison</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>1996</year>
      </pub-date>
      <abstract>
        <p>A variety of research on theory and knowledge refinement that integrated knowledge engineering and machine learning was conducted in the 1990's. This work developed a variety of techniques for taking engineered knowledge in the form of propositional or first-order logical rule bases and revising them to fit empirical data using symbolic, probabilistic, and/or neural-network learning methods. We review this work to provide historical context for expanding these techniques to integrate modern knowledge engineering and machine learning methods.</p>
      </abstract>
      <kwd-group>
        <kwd>Theory Refinement</kwd>
        <kwd>Knowledge Refinement</kwd>
        <kwd>Knowledge-Based Neural Networks</kwd>
        <kwd>Explainable AI</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Combining machine learning (ML) and knowledge engineering (KE) is not a new topic. In the
1990’s, there was a community of researchers (including the authors) who developed a variety
of techniques for taking human-engineered knowledge in the form of propositional or
first-order logical rule bases and revising them to fit empirical data using symbolic, probabilistic,
and/or neural-network learning methods. Although this work never achieved the substantial
lasting impact of some other research of this era, and may not be familiar to many current
researchers in machine learning and knowledge engineering, we believe it explored a range
of interesting algorithmic and experimental ideas and provides important historical context
for any new work on combining ML and KE. It also clearly demonstrated, through a range
of experimental evaluations in a number of domains, that combining human-engineered and
empirically induced knowledge could improve the accuracy of a final intelligent system.</p>
      <p>The primary goal of this community was to gain better accuracy than either (a) solely using
engineered knowledge for the task at hand in a non-learning manner (recall the 1990’s were
the tail end of the “expert systems” era) or (b) solely learning a system from labeled training
examples, where the only role of domain knowledge was choosing good ’features’ with which
to represent examples.</p>
      <p>Figure 1 illustrates this idea. The X axis is the amount of training data and the Y axis is the
system’s error rate on novel examples not used during training. The use of domain knowledge
provides an error reduction, especially when the number of training examples is small. The
cross-over points in the figure show where learning approaches start to exceed non-learning
ones, and are indicative of the central role of machine learning in today’s AI. In Figure 1, the
curve for the non-learning approach is flat since it ignores training examples (though
presumably humans did use a few examples to create and represent the domain knowledge). The
knowledge-refinement approach starts at a higher error rate to reflect the fact that it
may use a more limited knowledge representation than the non-learning
approach.</p>
      <p>This paper briefly reviews this early work, covering methods that primarily employed
logical, probabilistic, and neural-network methods. We believe many of the ideas in this work
could be updated and modernized to develop new, effective methods for combining ML and
KE. Therefore, we hope that reviewing this prior work serves as a valuable resource for current
researchers interested in this area.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Logical Theory and Knowledge Refinement</title>
      <p>A number of systems have integrated KE and ML by using learning methods to revise a
human-engineered logical knowledge base (KB) in order to make it fit empirical data. Most of this
work employed a rule-based KB, either in propositional logic or in the form of first-order Horn
clauses (i.e. Prolog programs). Engineered knowledge was refined by removing conditions from
rules to generalize them, adding learned conditions to specialize them, removing rules, and/or
learning new rules from constructed subsets of data.</p>
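      <p>The generalization and specialization operators described above can be illustrated with a minimal sketch. The representation (a rule body as a set of proposition names) and the toy bird domain are assumptions made purely for illustration; they are not taken from any of the systems reviewed here.</p>
      <preformat>
```python
# Hypothetical representation: a conjunctive rule body is a frozenset of
# proposition names, and an example is the set of propositions true for it.

def fires(antecedents, facts):
    """A conjunctive rule fires when every antecedent is true in the example."""
    return antecedents.issubset(facts)

def generalize(antecedents, condition):
    """Remove one condition, so the rule covers more examples."""
    return antecedents - {condition}

def specialize(antecedents, condition):
    """Add one condition, so the rule covers fewer examples."""
    return antecedents | {condition}

# Engineered (imperfect) rule: bird :- has_wings, lays_eggs, can_fly.
rule = frozenset({"has_wings", "lays_eggs", "can_fly"})

penguin = {"has_wings", "lays_eggs", "has_feathers"}   # positive example
dragonfly = {"has_wings", "lays_eggs", "can_fly"}      # negative example

# The original rule misses the penguin and wrongly covers the dragonfly.
print(fires(rule, penguin), fires(rule, dragonfly))

# Generalize to cover the missed positive, then specialize to exclude
# the covered negative.
generalized = generalize(rule, "can_fly")
specialized = specialize(generalized, "has_feathers")
print(fires(specialized, penguin), fires(specialized, dragonfly))
```
      </preformat>
      <p>In a real refinement system the condition to drop or add would be chosen by a learning method guided by the training data; here both choices are hard-coded to keep the sketch short.</p>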
      <p>
        Early work on this thread was by Ginsberg et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which was followed up by a system
called RTLS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. RTLS flattened a propositional rule base into disjunctive normal form (DNF),
revised this DNF to fit labeled training data using learning methods, and then translated the
changes back to the multi-level rules. EITHER [3, 4] was a more comprehensive revision
system for propositional rule bases that combined deductive, abductive, and inductive reasoning.
It used logical abduction to identify “holes” in a theory and used inductive rule learning
methods to repair them. NEITHER [5, 6] was a follow-up to EITHER that focused on revising KBs
containing “soft matching” M-of-N rules, each of which is satisfied as long as at least M of its N
antecedents are true. Other systems that refined propositional theories are DUCTOR [7] and the
work of Feldman et al. [8].
      </p>
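      <p>The semantics of an M-of-N rule can be made concrete with a short sketch. The function name and the symptom names below are hypothetical, chosen only for illustration.</p>
      <preformat>
```python
def m_of_n(m, antecedents, facts):
    """Return True when at least m of the listed antecedents are true in facts."""
    satisfied = sum(1 for a in antecedents if a in facts)
    return satisfied >= m

antecedents = ["fever", "cough", "fatigue", "headache"]

# A 2-of-4 rule tolerates missing evidence.
print(m_of_n(2, antecedents, {"fever", "headache"}))  # True
print(m_of_n(2, antecedents, {"cough"}))              # False

# Ordinary conjunction and disjunction are the special cases N-of-N and 1-of-N.
print(m_of_n(4, antecedents, set(antecedents)))       # True
print(m_of_n(1, antecedents, {"fatigue"}))            # True
```
      </preformat>
      <p>Seen this way, changing the threshold M is one more way to generalize or specialize a rule, alongside adding or removing antecedents.</p>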
      <p>A more challenging problem is revising first-order Horn-clause logical theories that include
relations, variables, and quantifiers. Work in this area was tightly connected to early work in
Inductive Logic Programming (ILP) [9]. MIS (Model Inference System) [10] was an early system
that tried to debug Prolog programs by interactively querying a human oracle. FOCL (First
Order Combined Learner) and its derivatives [11, 12] used a first-order theory to bias inductive
learning, but required user interaction to determine where to actually make theory revisions.
FORTE (First Order Revision of Theories from Examples) [13, 14] was a fully automated system
for revising relational KBs and was also used to automatically debug simple Prolog programs
developed by students learning logic programming. Other ILP systems that incorporated or
revised background knowledge are MLSMART [15], GOLEM [16], GRENDEL [17], and Rx [18].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Probabilistic Knowledge Refinement</title>
      <p>Logical domain theories in AI have long been criticized for their inability to handle
uncertainty in reasoning, which is critical in most real-world applications. Adding certainty factors
to rules was an early approach to dealing with uncertainty in knowledge-based systems [19].
RAPTURE (Revising Approximate Probabilistic Theories Using Repositories of Examples) [20]
was a theory refinement system that was designed to revise certainty-factor rule bases. It
adapted backpropagation methods designed for neural networks [21] to automatically revise
the certainty-factor parameters through gradient descent. It also used machine learning
methods adapted from decision-tree learning [22] to add features and revise the structure of the
rule base. Fu [23] also used backpropagation to revise certainty factors, but his approach was
unable to revise the rule-base structure.</p>
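      <p>To see why gradient descent applies to certainty factors at all, consider the following simplified sketch. It assumes only positive certainty factors, combines the contributions of several rules for the same conclusion with the probabilistic sum used for positive evidence in MYCIN-style systems, and takes one gradient step on a squared-error loss. The function names, learning rate, and numbers are assumptions for illustration; this is not a reconstruction of the actual RAPTURE procedure.</p>
      <preformat>
```python
from math import prod

def combine(cfs, evidence):
    """Probabilistic sum of positive contributions: 1 - prod(1 - cf * e)."""
    return 1.0 - prod(1.0 - c * e for c, e in zip(cfs, evidence))

def gradient_step(cfs, evidence, target, lr=0.5):
    """One gradient-descent step on 0.5 * (belief - target)**2 w.r.t. each cf."""
    error = combine(cfs, evidence) - target
    updated = []
    for j, (c, e) in enumerate(zip(cfs, evidence)):
        # d(belief)/d(cf_j) = e_j * product over the other rules of (1 - cf_i * e_i)
        others = prod(1.0 - ci * ei
                      for i, (ci, ei) in enumerate(zip(cfs, evidence)) if i != j)
        grad = error * e * others
        # Keep each certainty factor inside [0, 1].
        updated.append(min(1.0, max(0.0, c - lr * grad)))
    return updated

cfs = [0.6, 0.5]        # certainty factors of two rules with the same conclusion
evidence = [1.0, 1.0]   # both rule bodies are fully believed
before = combine(cfs, evidence)
after = combine(gradient_step(cfs, evidence, target=1.0), evidence)
print(before, after)    # the combined belief moves toward the target
```
      </preformat>
      <p>Because the combination function is differentiable in each certainty factor, the error at the conclusion can be propagated back to the individual rules, which is the property that backpropagation-style revision relies on.</p>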
      <p>Ad hoc methods like certainty factors were criticized for not adhering to the well-founded
principles of probability theory and Bayesian reasoning. Consequently, techniques based more
firmly in probability theory, such as Bayesian networks [24], came to dominate
knowledge-based systems that supported uncertain reasoning. BANNER [25, 26] was a knowledge
refinement system designed to revise manually-engineered Bayesian networks to fit empirical
data. Like RAPTURE, it used a variant of backpropagation to adjust the conditional
probability parameters of the Bayes net to fit labeled training data for a classification task. Then, as
needed, it altered the structure of the network using learning techniques to add new dependency
edges as well as new hidden variables. It focused on networks that used noisy-or and noisy-and
nodes that are probabilistic variants of these logical operators. This allowed it to map an initial
purely-logical theory to a Bayes-net and then refine it to fit empirical data. There was also
other work on revising Bayes nets [27, 28], but it was unable to add new hidden variables.</p>
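      <p>The link between a logical OR and its probabilistic variant can be sketched as follows. This is the standard textbook noisy-or model with an optional leak term; the function name and the parameter values are assumptions for illustration and are not taken from BANNER.</p>
      <preformat>
```python
def noisy_or(parent_states, link_probs, leak=0.0):
    """P(child is true) under a noisy-or model.

    Each active parent independently fails to turn the child on with
    probability (1 - p); the child is off only if every active parent
    fails and the leak does not fire.
    """
    fail = 1.0 - leak
    for active, p in zip(parent_states, link_probs):
        if active:
            fail *= 1.0 - p
    return 1.0 - fail

# With all link probabilities set to 1 the node behaves like a logical OR,
# which is how a purely logical rule can serve as the starting point.
print(noisy_or([True, False], [1.0, 1.0]))   # 1.0
print(noisy_or([False, False], [1.0, 1.0]))  # 0.0

# After refinement the parameters may move away from 1 (roughly 0.9 here).
print(noisy_or([True, True], [0.8, 0.5]))
```
      </preformat>
      <p>A noisy-or node needs one parameter per parent rather than a full conditional probability table, which keeps the number of parameters to fit from data small.</p>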
    </sec>
    <sec id="sec-4">
      <title>4. Knowledge-Based Neural Networks</title>
      <p>Starting in the late 1980’s neural networks had a rebirth after their near demise in the 1960’s,
due to the ability to train networks with ’hidden units’ [21] lying between the input and output
units. Towell and Shavlik [29] recognized the analogy between the dependency graph of a
rule set (i.e., a graph where the outputs from some rules serve as the inputs to others) and
a neural network. Their KBANN (Knowledge-Based Artificial Neural Networks) algorithm
mapped propositional rule sets into neural networks, setting weights so that initially the neural
network produced outputs near 1 when the rule set returned true and near 0 when the rule set
returned false. Figure 2 illustrates the correspondences. An early test on a gene-finding testbed
led to a halving of the error rate [30].</p>
      <p>A disjunctive rule set representing some domain theory is on the left, drawn using the
common AND/OR notation. On the right is a corresponding neural network. There are a few aspects of
this figure worth noting.</p>
      <p>1. Not all the facts about the domain at hand may be referenced by the rule set (these are the open
red circles on the bottom), but an important role for them might be discovered during training.
2. Some rule preconditions might be missing, as illustrated by the dashed lines in the neural
network; initially these links are given weights near zero, but backpropagation might increase them
if doing so helps reduce error. Similarly, some rule antecedents might be pushed toward zero by
backpropagation, essentially removing them (backpropagation also converts the Boolean algebra
of rule sets into weighted sums that are input to the non-linear sigmoid function).
3. The rule set might be missing some rules, illustrated by the leftmost (purple) hidden unit in the
figure, so it can be beneficial to include some initially zero-weighted hidden units [31, 32].
4. A complex rule set can lead to a deep neural network, deeper than the traditional one-hidden-layer
network of the mid-1980’s and early 1990’s. A KBANN-followup paper by Towell and Shavlik
[33] specifically addressed the use of symbolic knowledge to deal with the challenges of training
deep neural networks.</p>
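      <p>The mapping from a conjunctive rule to a sigmoid unit can be sketched as follows. The link weight of 4, the bias formula, and all names are assumptions chosen for illustration in the spirit of KBANN; the sketch is not a reproduction of the published algorithm. Features that the rule does not mention receive a weight of exactly zero here for simplicity, standing in for the near-zero initial weights mentioned in the notes above.</p>
      <preformat>
```python
import math

W = 4.0  # assumed weight for antecedents that appear in the rule

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def unit_for_conjunction(features, positive, negated=()):
    """Weights and bias so the unit is active only when the rule body holds."""
    weights = {f: 0.0 for f in features}   # unmentioned features start at zero
    for f in positive:
        weights[f] = W
    for f in negated:
        weights[f] = -W
    # The net input exceeds zero only if all positive antecedents are on
    # and no negated antecedent is on.
    bias = -(len(positive) - 0.5) * W
    return weights, bias

def activation(weights, bias, example):
    net = bias + sum(w * example[f] for f, w in weights.items())
    return sigmoid(net)

# Rule: h :- a, b, not c.  Feature d is not referenced by the rule.
features = ["a", "b", "c", "d"]
weights, bias = unit_for_conjunction(features, positive=["a", "b"], negated=["c"])

satisfied = {"a": 1, "b": 1, "c": 0, "d": 1}
missing_b = {"a": 1, "b": 0, "c": 0, "d": 1}
blocked_by_c = {"a": 1, "b": 1, "c": 1, "d": 0}

print(activation(weights, bias, satisfied))     # about 0.88
print(activation(weights, bias, missing_b))     # about 0.12
print(activation(weights, bias, blocked_by_c))  # about 0.12
```
      </preformat>
      <p>Because every weight, including the zero weight on feature d, is an ordinary trainable parameter, subsequent backpropagation can strengthen, weaken, or add dependencies without any special machinery. Larger values of W push the initial activations closer to 0 and 1.</p>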
      <p>Because neural networks learn in an incremental manner (i.e., one batch of examples at a time), it is
possible to consider adding more domain rules in the midst of a long training run [34] (e.g., in the middle
of Figure 1’s X axis). For example, observing the mistakes made by a robotic reinforcement learner might
cause a human teacher to devise some new rules. (This ability to accept rules after learning has begun
means one should not think of theory refinement as only using prior knowledge.)</p>
      <p>Since backpropagation changes the simple logical semantics of propositional rule sets into less
intuitive weighted sums, some early researchers [35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47] investigated
the task of rule extraction, where one converts a trained neural network into a more human-readable
representation, such as a set of rules or a small decision tree. These approaches are generally also
applicable to neural networks trained without the use of domain knowledge, and some even can be applied to
alternate complex learned representations, such as a forest of decision trees (e.g., [45]). The task of rule
extraction closely relates to the current extensive interest in explainable AI, especially in the context of
deep neural networks.</p>
      <p>Additional early work on refining and/or exploiting symbolic knowledge by neural networks includes
Gallant [35], Fu [48], Shavlik and Towell [49], Berenji [50], Frasconi et al. [51], Omlin and Giles [52],
Roscheisen et al. [53], Mahoney and Mooney [20], Tresp et al. [54], and Thrun and Mitchell [55] (these
citations are sorted by publication year). See Shavlik [56] for a review written in 1992.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Application Areas</title>
      <p>Theory/knowledge refinement has been applied to a variety of application areas demonstrating that
combining human-engineered knowledge and machine learning could develop more accurate intelligent
systems than using either approach alone.</p>
      <p>Some classic domains in AI and machine learning such as soybean disease diagnosis [57] and
human infectious disease diagnosis as performed by the famous MYCIN expert-system [58] were studied.
Both EITHER [4] and RAPTURE [20] demonstrated improved performance on soybean diagnosis, and
RAPTURE also demonstrated improved performance on MYCIN data.</p>
      <p>Another interesting application of logical theory refinement involved improving student modelling
for intelligent tutoring systems using a system called ASSERT [59, 60]. Using a KB encoding correct
knowledge needed to perform a task and examples of a student’s behavior for this task, ASSERT modeled
student errors by generating refinements to the correct knowledge base sufficient to account for the
student’s behavior. ASSERT was evaluated using 100 students tested on a classification task covering
concepts from an introductory course on C++ programming. Students who received feedback based on
student models generated by ASSERT performed significantly better on a post test than students who
received just basic instruction.</p>
      <p>Applications of knowledge-based neural networks include gene finding [30, 61], protein folding [62],
language learning [52, 63], robot training [34], non-linear control [50, 64], manufacturing [53],
computer vision [65], and information extraction [66].</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This paper has reviewed work from the 1990’s on combining knowledge-engineering and machine
learning to revise KBs to fit empirical data. This earlier work used a variety of knowledge representation
formalisms as well as a range of logical, probabilistic, and neural-network learning methods. It was
also evaluated on a range of applications, experimentally demonstrating its ability to achieve improved
performance by effectively combining KE and ML. We believe many of the ideas embodied in this early
work could be updated to utilize the latest developments in KE and ML, and hope they provide
inspiration and guidance in continuing work on combining KE and ML to improve the capabilities and
performance of AI systems.</p>
      <p>[3] D. Ourston, R. Mooney, Changing the rules: A comprehensive approach to theory refinement, in:
Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA,
1990, pp. 815–820.
[4] D. Ourston, R. J. Mooney, Theory refinement combining analytical and empirical methods,
Artificial Intelligence 66 (1994) 311–344.
[5] P. T. Baffes, R. J. Mooney, Symbolic revision of theories with M-of-N rules, in: Proceedings of the
Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France,
1993, pp. 1135–1140.
[6] P. T. Baffes, R. J. Mooney, Extending theory refinement to M-of-N rules, Informatica 17 (1993)
387–397.
[7] T. Cain, The DUCTOR: A theory revision system for propositional domains, in: Proceedings of
the Eighth International Workshop on Machine Learning, Evanston, IL, 1991, pp. 485–489.
[8] R. Feldman, A. M. Segre, M. Koppel, Incremental refinement of approximate domain theories, in:
Proceedings of the Eighth International Workshop on Machine Learning, Evanston, IL, 1991, pp.
500–504.
[9] N. Lavrač, S. Džeroski, Inductive Logic Programming: Techniques and Applications, Ellis
Horwood, 1994.
[10] E. Y. Shapiro, Algorithmic Program Debugging, MIT Press, Cambridge, MA, 1983.
[11] M. J. Pazzani, C. Brunk, Detecting and correcting errors in rule-based expert systems: An
integration of empirical and explanation-based learning, in: Proceedings of the 5th Knowledge
Acquisition for Knowledge-Based Systems Workshop, Banff, Canada, 1990.
[12] M. J. Pazzani, D. F. Kibler, The utility of background knowledge in inductive learning, Machine
Learning 9 (1992) 57–94.
[13] B. L. Richards, R. J. Mooney, First-order theory revision, in: Proceedings of the Eighth International
Workshop on Machine Learning, Evanston, IL, 1991, pp. 447–451.
[14] B. L. Richards, R. J. Mooney, Automated refinement of first-order Horn-clause domain theories,
Machine Learning 19 (1995) 95–131.
[15] F. Bergadano, A. Giordana, A knowledge intensive approach to concept induction, in: Proceedings
of the Fifth International Conference on Machine Learning (ICML-88), Ann Arbor, MI, 1988, pp.
305–317.
[16] S. Muggleton, C. Feng, Efficient induction of logic programs, in: Proceedings of the First
Conference on Algorithmic Learning Theory, Ohmsha, Tokyo, Japan, 1990.
[17] W. W. Cohen, Compiling prior knowledge into an explicit bias, in: Proceedings of the Ninth
International Conference on Machine Learning (ICML-92), Aberdeen, Scotland, 1992, pp. 102–110.
[18] S. Tangkitvanich, M. Shimura, Refining a relational theory with multiple faults in the concept
and subconcepts, in: Proceedings of the Ninth International Conference on Machine Learning
(ICML-92), Aberdeen, Scotland, 1992, pp. 436–444.
[19] E. H. Shortliffe, B. G. Buchanan, A model of inexact reasoning in medicine, Mathematical
Biosciences 23 (1975) 351–379.
[20] J. J. Mahoney, R. J. Mooney, Combining connectionist and symbolic learning to refine
certainty-factor rule-bases, Connection Science 5 (1993) 339–364.
[21] D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning internal representations by error
propagation, in: D. E. Rumelhart, J. L. McClelland (Eds.), Parallel Distributed Processing, Vol. I, MIT Press,
Cambridge, MA, 1986, pp. 318–362.
[22] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, CA, 1993.
[23] L.-M. Fu, Integration of neural heuristics into knowledge-based inference, Connection Science 1
(1989) 325–339.
[24] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan
Kaufmann, San Mateo, CA, 1988.
[25] S. Ramachandran, R. J. Mooney, Revising Bayesian network parameters using backpropagation,
in: International Conference on Neural Networks, Washington D.C., USA, 1996, pp. 82–87.
[26] S. Ramachandran, R. J. Mooney, Theory refinement for Bayesian networks with hidden variables,
in: Proceedings of the Fifteenth International Conference on Machine Learning (ICML-98),
Madison, WI, 1998, pp. 454–462.
[27] W. Buntine, Theory refinement on Bayesian networks, in: Proceedings of the Seventh Conference
on Uncertainty in Artificial Intelligence (UAI-91), 1991.
[28] W. Lam, F. Bacchus, Using causal information and local measure to learn Bayesian networks, in:
Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI-93), 1993, pp.
243–250.
[29] G. Towell, J. Shavlik, Knowledge-based artificial neural networks, Artificial Intelligence 70 (1994)
119–165.
[30] G. Towell, J. Shavlik, M. Noordewier, Refinement of approximate domain theories by
knowledge-based neural networks, in: Proceedings of the Eighth National Conference on Artificial Intelligence
(AAAI-90), Boston, MA, 1990, pp. 861–866.
[31] D. Opitz, J. Shavlik, Heuristically expanding knowledge-based neural networks, in: Proceedings
of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery,
France, 1993, pp. 1360–1365.
[32] D. Opitz, J. Shavlik, Dynamically adding symbolically meaningful nodes to knowledge-based
neural networks, Knowledge-Based Systems 8 (1995) 301–311.
[33] G. Towell, J. Shavlik, Using symbolic learning to improve knowledge-based neural networks, in:
Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, CA,
1992, pp. 177–182.
[34] R. Maclin, J. Shavlik, Creating advice-taking reinforcement learners, Machine Learning 22 (1996)
251–281.
[35] S. I. Gallant, Connectionist expert systems, Commun. ACM 31 (1988) 152–169.
[36] L. Fu, Rule learning by searching on adapted nets, in: Proceedings of the Ninth National
Conference on Artificial Intelligence (AAAI-91), Anaheim, CA, 1991, pp. 590–595.
[37] Y. Hayashi, A neural expert system with automated extraction of fuzzy if-then rules, in: Advances
in Neural Information Processing Systems 3, Morgan Kaufmann, San Mateo, CA, 1991, pp. 578–584.
[38] C. McMillan, M. C. Mozer, P. Smolensky, Rule induction through integrated symbolic and
subsymbolic processing, in: Advances in Neural Information Processing Systems 4, Morgan Kaufmann,
San Mateo, CA, 1992, pp. 969–976.
[39] I. Sethi, J. Yoo, C. Brickman, Extraction of diagnostic rules using neural networks, in: Proceedings
of the Sixth Annual 1993 IEEE Symposium Computer-Based Medical Systems, 1993, pp. 217–222.
[40] S. Thrun, Extracting Provably Correct Rules from Artificial Neural Networks, Technical Report,
University of Bonn, 1993.
[41] G. Towell, J. Shavlik, The extraction of refined rules from knowledge-based neural networks,
Machine Learning 13 (1993) 71–101.
[42] J. Alexander, M. Mozer, Template-based algorithms for connectionist rule extraction, in: Advances
in Neural Information Processing Systems 7, 1994.
[43] R. Setiono, H. Liu, Understanding neural networks via rule extraction, in: Proceedings of the
Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), 1995.
[44] C. Omlin, C. Giles, Rule revision with recurrent neural networks, IEEE Transactions on Knowledge
and Data Engineering 8 (1996) 183–188.
[45] M. Craven, J. Shavlik, Extracting tree-structured representations of trained networks, in: Advances
in Neural Information Processing Systems 8, MIT Press, Denver, CO, 1996, pp. 24–30.
[46] R. Andrews, J. Diederich, A. B. Tickle, Survey and critique of techniques for extracting rules from
trained artificial neural networks, Knowledge-Based Systems 8 (1995) 373–389.
[47] M. Craven, J. Shavlik, Rule Extraction: Where Do We Go from Here?, Technical Report Machine
Learning Research Group Working Paper 99-1, Department of Computer Sciences, University of
Wisconsin, 1999.
[48] L. Fu, Integration of neural heuristics into knowledge-based inference, Connection Science 1
(1989) 325–340.
[49] J. Shavlik, G. Towell, Combining explanation-based and neural learning: An algorithm and
empirical results, Connection Science 1 (1989) 233–255.
[50] H. Berenji, Refinement of approximate reasoning-based controllers by reinforcement learning,
in: Proceedings of the Eighth International Workshop on Machine Learning, Morgan Kaufmann,
Evanston, IL, 1991, pp. 475–479.
[51] P. Frasconi, M. Gori, M. Maggini, G. Soda, An unified approach for integrating explicit knowledge
and learning by example in recurrent networks, in: International Joint Conference on Neural
Networks (IJCNN-91), 1991, pp. 811–816.
[52] C. Omlin, C. Giles, Training second-order recurrent neural networks using hints, in: Proceedings
of the Ninth International Conference on Machine Learning (ICML-92), Aberdeen, Scotland, 1992,
pp. 361–366.
[53] M. Roscheisen, R. Hofmann, V. Tresp, Neural control for rolling mills: Incorporating domain
theories to overcome data deficiency, in: Advances in Neural Information Processing Systems 4,
volume 4, Morgan Kaufmann, San Mateo, CA, 1992, pp. 659–666.
[54] V. Tresp, J. Hollatz, S. Ahmad, Network structuring and training using rule-based knowledge, in:
Advances in Neural Information Processing Systems 5, Morgan Kaufmann, 1992, pp. 871–878.
[55] S. Thrun, T. Mitchell, Integrating inductive neural network learning and explanation-based
learning, in: Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence
(IJCAI-93), Chambery, France, 1993, pp. 930–936.
[56] J. Shavlik, A framework for combining symbolic and neural learning, Machine Learning 14 (1994)
321–331.
[57] R. S. Michalski, R. L. Chilausky, Learning by being told and learning from examples: An
experimental comparison of the two methods of knowledge acquisition in the context of developing an
expert system for soybean disease diagnosis, Journal of Policy Analysis and Information Systems
4 (1980) 126–161.
[58] B. G. Buchanan, E. Shortliffe, Rule-Based Expert Systems: The MYCIN Experiments of the Stanford
Heuristic Programming Project, Addison-Wesley Publishing Co., Reading, MA, 1984.
[59] P. T. Baffes, R. J. Mooney, A novel application of theory refinement to student modeling, in:
Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), Portland,
OR, 1996, pp. 403–408.
[60] P. T. Baffes, R. J. Mooney, Refinement-based student modeling and automated bug library
construction, Journal of Artificial Intelligence in Education 7 (1996) 75–116.
[61] M. Noordewier, G. Towell, J. Shavlik, Training knowledge-based neural networks to recognize
genes in DNA sequences, in: R. Lippmann, J. Moody, D. Touretzky (Eds.), Advances in Neural
Information Processing Systems 3, volume 3, Morgan Kaufmann, Denver, CO, 1991, pp. 530–536.
[62] R. Maclin, J. Shavlik, Using knowledge-based neural networks to improve algorithms: Refining
the Chou-Fasman algorithm for protein folding, Machine Learning 11 (1993) 195–215.
[63] C. Giles, C. Miller, D. Chen, H. Chen, G. Sun, Y. Lee, Learning and extracting finite state automata
with second-order recurrent neural networks, Neural Computation 4 (1992) 393–405.
[64] G. Scott, J. Shavlik, W. Ray, Refining PID controllers using neural networks, in: J. Moody, S.
Hanson, R. Lippmann (Eds.), Advances in Neural Information Processing Systems 5, volume 4, Morgan
Kaufmann, Denver, CO, 1992, pp. 555–562.
[65] C. Wu, Knowledge-based artificial neural network and the application of it in understanding
remotely sensed images, in: X. Shen, J. Liu (Eds.), Neural Network and Distributed Processing,
volume 4555, International Society for Optics and Photonics, SPIE, 2001, pp. 160–164.
[66] T. Eliassi-Rad, J. Shavlik, A theory-refinement approach to information extraction, in: Proceedings
of 18th International Conference on Machine Learning (ICML-2001), Williamstown, MA, 2001.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ginsberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Weiss</surname>
          </string-name>
          , P. Politakis,
          <article-title>Automatic knowledge based refinement for classification systems</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>35</volume>
          (
          <year>1988</year>
          )
          <fpage>197</fpage>
          -
          <lpage>226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ginsberg</surname>
          </string-name>
          ,
          <article-title>Theory reduction, theory revision, and retranslation</article-title>
          ,
          <source>in: Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90)</source>
          , Boston, MA,
          <year>1990</year>
          , pp.
          <fpage>777</fpage>
          -
          <lpage>782</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>