I simply added a main(String[]) method to URLFilters (org.apache.nutch.net.URLFilters) :*)
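public static void main(String[] args) throws IOException, MalformedPatternException, Exception {
    // Read candidate URLs from standard input, one per line.
    BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
    String line;
    while ((line = in.readLine()) != null) {
        // filter() returns the URL if it is accepted, or null if it is rejected.
        String out = URLFilters.filter(line);
        if (out != null) {
            // Accepted: echo the filtered URL with a leading "+".
            System.out.print("+");
            System.out.println(out);
        } else {
            // Rejected: echo the original input with a leading "-".
            System.out.print("-");
            System.out.println(line);
        }
    }
}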
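This makes testing URL filter rules much more convenient :) Type one URL per line on standard input; an accepted URL is echoed back with a leading +, a rejected one with a leading -: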
$ ./bin/nutch org.apache.nutch.net.URLFilters
......
http://www.lhelper.org/blog/
using fitler 0:org.apache.nutch.net.RegexURLFilter
+http://www.lhelper.org/blog/
http://www.lhelper.org/blog/?q=lhelper
using fitler 0:org.apache.nutch.net.RegexURLFilter
-http://www.lhelper.org/blog/?q=lhelper
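
To make the +/- result above easier to reason about, here is a rough standalone sketch of how a rule list in the style of regex-urlfilter.txt could be evaluated. It is not the RegexURLFilter source: the class UrlFilterSketch and its methods are hypothetical names, it uses java.util.regex rather than whatever regex library Nutch itself uses, and the two rules ("-[?*!@=]" then "+.") are only my assumption about what the rule file in this run looked like. The assumed semantics are: the first rule whose pattern is found anywhere in the URL decides, and a URL that matches no rule is rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-based URL filter; not Nutch code.
public class UrlFilterSketch {

    // One rule: '+' accepts, '-' rejects, followed by a regular expression.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, String regex) {
            this.accept = accept;
            this.pattern = Pattern.compile(regex);
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Each argument is one line of an assumed rule file.
    public UrlFilterSketch(String... lines) {
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', line.substring(1)));
        }
    }

    // Returns the URL if accepted, or null if rejected (same contract as in main above).
    public String filter(String url) {
        for (Rule rule : rules) {
            // Assumption: a rule applies if its pattern occurs anywhere in the URL.
            if (rule.pattern.matcher(url).find()) {
                return rule.accept ? url : null;
            }
        }
        return null; // assumption: no matching rule means the URL is rejected
    }

    // Formats the result the same way as the test harness: "+url" or "-url".
    public String label(String url) {
        return (filter(url) != null ? "+" : "-") + url;
    }

    public static void main(String[] args) {
        // Assumed rules: reject URLs that look like queries, accept everything else.
        UrlFilterSketch filter = new UrlFilterSketch("-[?*!@=]", "+.");
        System.out.println(filter.label("http://www.lhelper.org/blog/"));
        System.out.println(filter.label("http://www.lhelper.org/blog/?q=lhelper"));
    }
}
```

Under those assumed rules the second URL is rejected because it contains ? and =, which would match the transcript above; swap in the lines from your own rule file to see how they interact.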