NaturalJava: A Pioneering Natural Language Interface for Java Programming

Developed by David Price and colleagues in 2000, NaturalJava demonstrated that natural language processing (NLP) could be used to program computers. Working from typed sentences or from speech converted to text, NaturalJava let users author Java code by describing the desired constructs in plain English rather than in Java syntax. While limited in scope, it was an early step toward more accessible programming.

At its core, NaturalJava provided a “syntactic sugar” layer on top of Java’s syntax: users could articulate statements like “Create a for loop that iterates 10 times” rather than writing the loop out formally. The system architecture contained three main components: the Sundance natural language parser, information extraction techniques to identify Java constructs within sentences, and a code generator to produce Java syntax trees. For example, if a user said “Make an integer called x that is set to 5,” Sundance would parse the sentence, the extraction step would recognize a request to declare and initialize an integer variable, and the code generator would output the corresponding “int x = 5;” in Java.
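The flow from sentence to code can be illustrated with a minimal sketch. It is not NaturalJava's implementation: a single regular expression stands in for both the Sundance parser and the information extraction step, and the generator emits source text rather than a syntax tree. The names ToyTranslator, Frame, extract, generate, and translate are hypothetical, as is the small set of recognized phrasings and types.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy translator; an illustration only, not NaturalJava's design.
final class ToyTranslator {

    // Intermediate result: the slots of a variable declaration request.
    static final class Frame {
        final String type;   // Java type keyword, e.g. "int"
        final String name;   // variable name as spoken
        final String value;  // initial value, or null if none was given

        Frame(String type, String name, String value) {
            this.type = type;
            this.name = name;
            this.value = value;
        }
    }

    // English type words mapped to Java keywords (assumed, illustrative subset).
    private static final Map<String, String> TYPES =
            Map.of("integer", "int", "boolean", "boolean", "double", "double");

    // Matches sentences such as "Make an integer called x that is set to 5".
    private static final Pattern DECLARATION = Pattern.compile(
            "(?i)^(?:make|create|declare) an? (\\w+) (?:called|named) (\\w+)"
            + "(?: that is set to (.+?))?\\.?$");

    // Parsing and extraction collapsed into one step: fill the frame's slots.
    static Optional<Frame> extract(String sentence) {
        Matcher m = DECLARATION.matcher(sentence.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        String javaType = TYPES.get(m.group(1).toLowerCase(Locale.ROOT));
        if (javaType == null) {
            return Optional.empty();
        }
        return Optional.of(new Frame(javaType, m.group(2), m.group(3)));
    }

    // Code generation: render the frame as a Java statement.
    // The value is copied verbatim; no type checking is attempted.
    static String generate(Frame f) {
        String declaration = f.type + " " + f.name;
        return f.value == null ? declaration + ";" : declaration + " = " + f.value + ";";
    }

    static Optional<String> translate(String sentence) {
        return extract(sentence).map(ToyTranslator::generate);
    }

    public static void main(String[] args) {
        // prints: int x = 5;
        System.out.println(
                translate("Make an integer called x that is set to 5").orElse("(not understood)"));
    }
}
```

Keeping the frame separate from the generator mirrors the division of labor described above: the language side decides what was requested, and the code side decides how to express it in Java. A sentence outside the single supported pattern yields an empty result instead of a guess.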

This bridging of the gap between English and Java afforded end users a simpler programming experience closer to human language. Rather than needing to directly input standard Java code, users could describe their intent for the program’s logic at a higher level while NaturalJava handled the translation under the hood. As creator David Price noted, “Programmers express application concepts to NaturalJava using vocabulary drawn primarily from the application domain rather than the programming domain.”

However, NaturalJava was limited chiefly to authoring basic code rather than navigating or editing existing programs. It also supported only a small subset of Java’s expansive functionality. Partly due to these constraints, there has been little ongoing development on NaturalJava, and its capabilities have become outdated in recent years. Still, it laid vital groundwork for the long-standing ambition of accessible programming via intuitive interfaces.

Modern systems aim to capitalize on advances in NLP to parse more complex language input. But the core principles of NaturalJava live on in continuing explorations of natural language for cleaner, more understandable software development. Just as icons and graphical interfaces revolutionized personal computing, linguistic programming promises to minimize syntactic formalities so users can simply request the logic they need. If innovations someday allow users to describe problems in plain terms rather than forcing them into computer-speak, we will owe thanks to pioneering platforms like NaturalJava for showing that this was possible.

See also:

NLPA (Natural Language Program Analysis) | SLP (Spoken Language Programming) | VoiceCode

  • Bajwa, I. S., Lee, M., & Bordbar, B. (2012). Semantic Analysis of Software Constraints.
  • Begel, A. (2004). Spoken language support for software development. In Visual Languages and Human Centric Computing, 2004 (VL/HCC 2004). IEEE Symposium on (pp. 301-308). IEEE.
  • Begel, A., & Graham, S. L. (2005). Spoken programs. In Visual Languages and Human-Centric Computing, 2005 IEEE Symposium on (pp. 99-106). IEEE.
  • Bui, C. K. (2013). An evolutional domain oriented approach to problem solving based on web service composition (Doctoral dissertation, University of Iowa).
  • Chong, S., & Pucella, R. (2004). A framework for creating natural language user interfaces for action-based applications. arXiv preprint cs/0412065.
  • Cozzie, A., & King, S. (2012). Macho: Writing programs with natural language and examples. University of Illinois at Urbana-Champaign.
  • Désilets, A., Fox, D. C., & Norton, S. (2006, April). VoiceCode: an innovative speech interface for programming-by-voice. In CHI’06 Extended Abstracts on Human Factors in Computing Systems (pp. 323-328). ACM.
  • Gordon, B. M. (2013). Improving spoken programming through language design and the incorporation of dynamic context (Doctoral dissertation, University of New Mexico).
  • Gordon, B. M., & Luger, G. F. (2012, March). English for spoken programming. In Soft Computing and Intelligent Systems (SCIS) and 13th International Symposium on Advanced Intelligent Systems (ISIS), 2012 Joint 6th International Conference on (pp. 533-538). IEEE.
  • Kaneko, N., & Onisawa, T. (2004, October). End-user programming by linguistic expression employing interaction and paraphrasing. In Proceedings of SCIS & ISIS (pp. 1508-1513).
  • Knöll, R., & Mezini, M. (2006, October). Pegasus: first steps toward a naturalistic programming language. In Companion to the 21st ACM SIGPLAN symposium on Object-oriented programming systems, languages, and applications (pp. 694-696). ACM.
  • Knöll, R., Gasiunas, V., & Mezini, M. (2011, October). Naturalistic types. In Proceedings of the 10th ACM international conference on Generative programming and component engineering (pp. 51-60). ACM.
  • Landhäußer, M., Hey, T., & Tichy, W. F. (2014, May). Deriving time lines from texts. In Proceedings of the 3rd International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (pp. 50-53). ACM.
  • Laukaitis, A., Vasilecas, O., & Gediminas, V. (2007). Natural language based heavy personal assistant architecture for information retrieval and presentation. In Human Interface and the Management of Information. Interacting in Information Environments (pp. 161-170). Springer, Berlin, Heidelberg.
  • Le, V., Gulwani, S., & Su, Z. (2013, June). Smartsynth: synthesizing smartphone automation scripts from natural language. In Proceedings of the 11th annual international conference on Mobile systems, applications, and services (pp. 193-206). ACM.
  • Linhalis, F., & Moreira, D. D. A. (2006). Ontology-based application server to the execution of imperative natural language requests. In Flexible Query Answering Systems (pp. 228-237). Springer, Berlin, Heidelberg.
  • Linhalis, F., Fortes, R. P. D. M., Pereira, D. D. S., & Medeiros, C. B. (2010). OntoMap: an ontology-based architecture to perform the semantic mapping between an interlingua and software components. Knowledge and Information Systems, 22(3), 319-345.
  • Little, G., & Miller, R. C. (2006, April). Translating keyword commands into executable code. In Proceedings of the 19th annual ACM symposium on User interface software and technology (pp. 265-274). ACM.
  • Masuoka, C. (n.d.). Java Programming Using Voice Input.
  • Pappu, A., Tenneti, T., & Tiwary, U. (n.d.). COFFEE: COmpiler Framework For Executables in English.
  • Patel, R., & Patel, M. (n.d.). Hands free JAVA (Through Speech Recognition).
  • Price, D., Riloff, E., Zachary, J., & Harvey, B. (2000, January). NaturalJava: a natural language interface for programming in Java. In Proceedings of the 5th international conference on Intelligent user interfaces (pp. 207-211). ACM.
  • Su, K. (2007). Continuous execution: improving user feedback in the development cycle (Doctoral dissertation, Massachusetts Institute of Technology).
  • Sugimoto, T., Ito, N., & Iwashita, S. (2007). A proposal of a language-based context-sensitive programming system. Journal of Advanced Computational Intelligence and Intelligent Informatics, 11(2), 140-150.
  • Thummalapenta, S., Sinha, S., Singhania, N., & Chandra, S. (2012, June). Automating test automation. In 2012 34th International Conference on Software Engineering (ICSE) (pp. 881-891). IEEE.
  • Vadas, D., & Curran, J. R. (2005, December). Programming with unrestricted natural language. In Proceedings of the Australasian Language Technology Workshop (pp. 191-199).
  • Vijyapurpu, C. S. (2012). Java API-Aware Code Generation Engine: A Prototype (Doctoral dissertation, Utah State University).