SE (M) 2018-2019 Assessed Exercise 1 (10% of final grade)
This assessment involves writing JUnit test cases to carry out black-box testing of a Java application. The application is a simple web crawler, and you will be asked to debug it by examining its output. A web crawler is a piece of software that visits webpages, saves or prints the content of each page, and follows the outbound links from that page to visit more webpages. A crawler that works without errors will store all of the text and HTML content correctly and visit all outgoing links on each webpage.

Web crawler application
For this assessment, you will be provided with an Eclipse project that includes a Java application which implements a web crawler. The functionality of the web crawler is demonstrated in two classes, MyCrawlerController and MyCrawler. If you run the MyCrawlerController class, you will see the output of the crawler printed in the console.
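The crawling behaviour described above (read a page, then follow its outgoing links) can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not the provided application: TinyCrawler, its methods and the in-memory map of pages are hypothetical names, and the regular expression only handles simple double-quoted href attributes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a toy crawler over an in-memory map of page name to HTML. */
public class TinyCrawler {

    // Simplified pattern for href="..." attributes; real HTML needs a proper parser.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    /** Returns the targets of all href attributes in the given HTML, in document order. */
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    /** Visits the seed and every page reachable from it, breadth first, each page once. */
    public static Set<String> crawl(Map<String, String> web, String seed) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(seed);
        while (!frontier.isEmpty()) {
            String page = frontier.poll();
            // Skip links that point outside the toy web and pages already visited.
            if (!web.containsKey(page) || !visited.add(page)) {
                continue;
            }
            frontier.addAll(extractLinks(web.get(page)));
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, String> web = new HashMap<>();
        web.put("a.html", "<a href=\"b.html\">B</a> <a href=\"c.html\">C</a>");
        web.put("b.html", "<a href=\"a.html\">back</a>");
        web.put("c.html", "<p>no links</p>");
        System.out.println(crawl(web, "a.html")); // prints [a.html, b.html, c.html]
    }
}
```

A correct crawler in this sense reports every reachable page exactly once; a missing page or a missing link in such output is the kind of observable deviation a black-box test can check for.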
Web content to crawl
For this assessment, the application will crawl a subset of Wikipedia, available here: http://www.dcs.gla.ac.uk/~bjorn/sem20172018/ae2public/Machine_learning.html
We recommend that you visit the webpages to get an impression of the pages you are trying to crawl.

Documentation
All of the API documentation for the application is available in the doc folder. Aside from a general understanding of the program, you will need to pay particular attention to the Crawler and Parser packages (and the additional support packages/classes needed for the actual test methods).
Testing task
This application has been injected with at least 4 defects, each observable as a deviation from the expected behaviour documented in the provided Javadoc. All errors can be inferred from the output of the MyCrawlerController class by comparing it with the content of the actual online web pages. It is your task to discover 4 defects and demonstrate their presence using JUnit test cases.
Note: The errors can all be determined by inspecting the console output, the web pages to be crawled and the example code provided. They are not related to trivial extra/missing white space or line breaks.

Submission
Your submission will be a single Java class, entitled MyTest. You must submit the source file of this class (along with any custom test data you may have used as a zip file). You will find a template for this class in the Eclipse project.
In the comment above each test method, you must concisely describe, in Javadoc format:
1) what aspect of the code you are covering (incl. class name) and why,
2) what inputs/outputs you are testing and why,
3) what error you think this demonstrates.
For each point you may write at most 50 words. Note that you do not need to describe how you would fix the defects! A complete assessment will include 4 well-documented JUnit test cases that can be executed without errors and demonstrate the presence of 4 different defects.
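The three-point documentation requirement above can be laid out as follows. This is only a sketch of the comment structure: stripTags is a hypothetical stand-in and not a class from the provided project, and the method is written in plain Java so that it compiles without the JUnit library. In MyTest, the method would instead carry JUnit's @Test annotation and use a JUnit assertion.

```java
/** Illustrative only: shows the three-point Javadoc layout above a test method. */
public class MyTestSketch {

    /** Hypothetical stand-in for a method under test; not part of the provided project. */
    public static String stripTags(String html) {
        // Remove anything that looks like a tag, then trim surrounding whitespace.
        return html.replaceAll("<[^>]*>", "").trim();
    }

    /**
     * 1) Aspect covered: text extraction in the (hypothetical) stripTags helper,
     *    because the crawler must store page text correctly.
     * 2) Inputs/outputs: a short HTML fragment with nested tags; the expected
     *    output is the visible text only, as seen on the web page.
     * 3) Error demonstrated: text inside nested tags is lost or altered.
     */
    public void testTextInsideNestedTagsIsKept() {
        // Assumption: with JUnit this check would be a single assertEquals call.
        String expected = "Machine learning";
        String actual = stripTags("<h1><i>Machine</i> learning</h1>");
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        new MyTestSketch().testTextInsideNestedTagsIsKept();
        System.out.println("check passed");
    }
}
```

Each numbered point stays well under the 50-word limit, names the code under test, and ties the expected value to what is visible on the crawled page.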
Submission will be via Moodle on 18/2 2018 at 16:30.
Additional information
A bug-free, production version of the web crawler is available here: https://github.com/yasserg/crawler4j. It is not necessary to consult the bug-free version to solve the exercise, but it may help in locating the potential classes to test. Note that access changes have been made to some classes in the bug-prone version to make them easier to test.
Marking scheme
Points will be given for finding errors and suitably documenting your testing process in the comments above each test method (as Javadoc comments). If you are unable to discover all errors, you can still get partial points for a well-documented description of the bug and a potential test strategy.
Section                          Points (max)
Test Method 1   Error Found      15
                Documentation    10
Test Method 2   Error Found      15
                Documentation    10
Test Method 3   Error Found      15
                Documentation    10
Test Method 4   Error Found      15
                Documentation    10
Total                            100