Scout: a Web extraction tool
Student: Anthony L. Borchers 11/1998, now at Lexmark International
Purpose: Apply data extraction techniques to World Wide Web documents
Method: A general-purpose WWW robot engine with an extension mechanism for attaching data-extraction procedures at runtime. These procedures may then apply domain- or format-specific extraction methods.
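One way such an extension mechanism could look in Java is sketched below. This is a minimal illustration, not Scout's actual design: the names Extractor, LinkExtractor, and RobotEngine are hypothetical, and the engine is handed already-fetched document text so the example needs no network access. It assumes extraction procedures are attached by class name through reflection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical plug-in contract; illustrative only, not Scout's actual API.
interface Extractor {
    // Returns the items this extractor finds in one fetched document.
    List<String> extract(String url, String content);
}

// Example format-specific extractor: collects href targets from HTML anchors.
class LinkExtractor implements Extractor {
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]*)\"");

    @Override
    public List<String> extract(String url, String content) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(content);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }
}

// Hypothetical engine: keeps a list of extractors that can grow at runtime.
class RobotEngine {
    private final List<Extractor> extractors = new ArrayList<>();

    // Attach an extractor by class name, loading it through reflection.
    void attach(String className) {
        try {
            Class<?> cls = Class.forName(className);
            extractors.add((Extractor) cls.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException ex) {
            throw new IllegalArgumentException("Cannot load extractor: " + className, ex);
        }
    }

    // Pass one already-fetched document to every attached extractor.
    List<String> process(String url, String content) {
        List<String> results = new ArrayList<>();
        for (Extractor e : extractors) {
            results.addAll(e.extract(url, content));
        }
        return results;
    }
}
```

In this sketch, a caller would create a RobotEngine, call attach with the name of an extractor class, and then call process for each retrieved page. Because the engine depends only on the Extractor interface, new domain- or format-specific procedures can be added without changing the engine itself.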
What the student learned:
Managing a large software project over a long period of time.
Multithreaded programming in Java, including synchronization methods. (This part took considerable cleverness.)
Details of the HyperText Transfer Protocol (HTTP).
Technical writing skills in preparing the write-up and packaging the resulting tools.