Software library construction from an IR perspective
- 1 September 1991
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGIR Forum
- Vol. 25 (2), 8-18
- https://doi.org/10.1145/122665.122667
Abstract
The two basic requirements for achieving software reuse are: (1) to provide a sufficient number of components over a spectrum of domains that can be reused as is (black-box reuse) or easily adapted (white-box reuse), and (2) to organize components such that code close to the users' needs is easy to locate. Many attempts have been made at addressing the first issue (e.g., UNIX); however, as far as the second requirement is concerned, very few library systems are available or attractive enough to make actual reuse faster than rewriting from scratch.
With the increasing size of natural-language documentation in modern software component collections, there is a growing interest in applying IR techniques to the construction of software libraries. For instance, recent tools such as INFOEXPLORER for the IBM RS/6000 series, or ANSWERBOOK for Sun Sparc workstations, provide standard IR techniques for searching on-line documentation. However, due to the nature of software documentation and of reuse requirements, some specific IR techniques can be devised to significantly enhance retrieval effectiveness.
In this paper, we identify the necessary requirements to be met by the software collection in order to apply an IR approach, and we describe the specificity of reuse as compared to other applications. As an example, we describe GURU, an information storage and retrieval system for reuse. This system is only briefly described, as it has already been presented elsewhere; we concentrate instead on showing how GURU satisfies the specific needs of reuse. GURU is currently used at the IBM T.J. Watson Research Center by a growing pool of users. We also provide an experimental test collection with relevance judgments for the AIX 3 command set to be used as a starting ground for evaluating retrieval effectiveness in reuse applications. Finally, we compare GURU's indexing scheme to two other schemes using this test collection.