• Interpreted source code—for example, ASP, JSP, JavaScript, and
Ruby
• Build scripts—for example, make and Ant build files
• Data and configuration files—for example, ASCII, XML, XSD,
and DTD
Michael Toomim, Andrew Begel, and Susan L. Graham [8] noted that
“recent studies estimate that the Linux kernel (as of 2002) is 15%-25%
duplicated,” [9] and “the Sun Java JDK is 21%-29% duplicated.” [10]
Code duplication is a real-life problem, even for popular software
packages used throughout the industry. [11]
Duplicated code causes these problems:
• Increased maintenance costs due to discovering, reporting,
analyzing, and fixing bugs multiple times
• Uncertainty about the existence of other bugs (duplicate code
that hasn't been found yet)
• Increased testing costs for the additional code written
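The problems listed above can be seen in a small, purely hypothetical Java sketch. The class and method names are invented for illustration and do not come from any real project. The same bounds check was pasted into two places, a bug was fixed in only one copy, and a shared method shows how a single fix would cover every caller.

```java
public class DuplicationExample {

    // Copy 1: the off-by-one bug was reported and fixed here ("<=" became "<").
    public static boolean isValidIndexInReport(int index, int size) {
        return index >= 0 && index < size;
    }

    // Copy 2: pasted before the fix, so it still accepts index == size.
    public static boolean isValidIndexInExport(int index, int size) {
        return index >= 0 && index <= size;
    }

    // After removing the duplication: one method to fix and to test.
    public static boolean isValidIndex(int index, int size) {
        return index >= 0 && index < size;
    }

    public static void main(String[] args) {
        int size = 3;
        System.out.println("report copy accepts 3: " + isValidIndexInReport(3, size));
        System.out.println("export copy accepts 3: " + isValidIndexInExport(3, size));
        System.out.println("shared method accepts 3: " + isValidIndex(3, size));
    }
}
```

Until the second copy is found, nobody knows whether the bug still exists elsewhere, and each copy needs its own tests.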
Using PMD-CPD
Several tools are available for finding duplicate code. PMD offers a
Copy/Paste Detector (CPD) for C/C++, Java, PHP, and Ruby. The tool
works fairly well, is simple to set up and use, and can write its
results as XML, CSV, or plain (ASCII) text. Listing 7-6 demonstrates
using the CPD task with Ant.
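To give a rough feel for what a copy/paste detector looks for, here is a minimal Java sketch that reports repeated windows of lines. It is an illustration under simplifying assumptions, not CPD's actual algorithm. It compares trimmed lines rather than tokens, and the class name NaiveDuplicateFinder and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Naive line-window duplicate finder; an illustration only, not how CPD works. */
public class NaiveDuplicateFinder {

    /** Trims each line and drops blank ones so indentation and spacing are ignored. */
    static List<String> normalize(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    /**
     * Returns one {firstStart, repeatStart} pair for every window of `window`
     * normalized lines that repeats an earlier window. Indexes are 0-based
     * positions in the normalized (blank-free) line list.
     */
    public static List<int[]> find(List<String> lines, int window) {
        List<String> norm = normalize(lines);
        Map<String, Integer> firstSeen = new HashMap<>();
        List<int[]> matches = new ArrayList<>();
        for (int i = 0; i + window <= norm.size(); i++) {
            String key = String.join("\n", norm.subList(i, i + window));
            Integer earlier = firstSeen.putIfAbsent(key, i);
            if (earlier != null) {
                matches.add(new int[] {earlier, i});
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> source = List.of(
            "int total = 0;",
            "for (int x : items) {",
            "    total += x;",
            "}",
            "log(total);",
            "int total = 0;",
            "for (int x : items) {",
            "    total += x;",
            "}");
        for (int[] m : find(source, 4)) {
            System.out.println("window at " + m[0] + " repeats at " + m[1]);
        }
    }
}
```

The window size plays a role loosely similar to a minimum-match threshold: a larger value reports only longer duplicated runs, at the cost of missing short ones.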
8. See “Managing Duplicated Code with Linked Editing,” at
http://harmonia.cs.berkeley.edu/papers/toomim-linked-editing.pdf.
9. As referenced in the article “Analyzing cloning evolution in the Linux kernel,”
by G. Antoniol, M. Di Penta, E. Merlo, and U. Villano, in the Journal of
Information and Software Technology 44(13):755-765, 2002.
10. As referenced in “CCFinder: A multilinguistic token-based code clone
detection system for large scale source code,” by T. Kamiya, S. Kusumoto, and
K. Inoue, in IEEE Transactions on Software Engineering, 28(6):654-670, 2002.
11. See “Managing Duplicated Code with Linked Editing,” at
http://harmonia.cs.berkeley.edu/papers/toomim-linked-editing.pdf.