Steps for downloading the full dataset from CiteSeerX:
- Download and extract the "Demo" from http://www.oclc.org/research/software/oai/harvester.htm
- Go to the directory of the extracted files and run the following command to download the full CiteSeerX dataset to the file "citeseerx_alldata.xml" (the ";" classpath separator is for Windows; use ":" on Linux or macOS)
java -classpath .;oaiharvester.jar;xerces.jar org.acme.oai.OAIReaderRawDump http://citeseerx.ist.psu.edu/oai2 -o citeseerx_alldata.xml
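For readers who would rather not depend on the harvester jar, the same download can be sketched directly against the OAI-PMH protocol. This is a minimal sketch, not the harvester's own implementation. It assumes the endpoint answers standard `ListRecords` requests with the `oai_dc` metadata prefix and pages its results with `resumptionToken`, as the OAI-PMH specification describes. The function names (`build_url`, `next_token`, `fetch`, `harvest`) are hypothetical and chosen only for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(base_url, token=None, metadata_prefix="oai_dc"):
    """Build a ListRecords URL; follow-up requests carry only the token."""
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def next_token(page_xml):
    """Return the resumptionToken of a response page, or None on the last page."""
    root = ET.fromstring(page_xml)
    elem = root.find(f".//{OAI_NS}resumptionToken")
    if elem is None or not (elem.text or "").strip():
        return None
    return elem.text.strip()


def fetch(url):
    """Download one response page as bytes."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.read()


def harvest(base_url, out_path, fetch_page=fetch):
    """Append every ListRecords page to out_path and return the page count."""
    pages = 0
    token = None
    with open(out_path, "wb") as out:
        while True:
            page = fetch_page(build_url(base_url, token))
            out.write(page)
            out.write(b"\n")
            pages += 1
            token = next_token(page)
            if token is None:
                return pages
```

Calling `harvest("http://citeseerx.ist.psu.edu/oai2", "citeseerx_alldata.xml")` would request pages until the server stops returning a token. The output is a plain concatenation of response pages, so it is not a single well-formed XML document and would need to be split per page before parsing. The sketch has no retry or rate-limit handling, which a harvest of this size would likely need.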
1 comment:
I've tried this method, but I can only ever download a file of around 30 MB at most, no matter what parameters I use.