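The previous post introduced dig and whois, which can be used to look up a domain name and to trace domain resolution problems. This post covers command-line tools for accessing the resources behind those domain names over HTTP and FTP.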
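wget is a very popular single-process HTTP/FTP download client on Linux that supports resuming interrupted downloads. A Windows build is also available; the latest version at the time of writing is 1.11.4:
http://users.ugent.be/~bpuype/cgi-bin/fetch.pl?dl=wget/wget.exe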
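wget can mirror a web site by automatically parsing the URLs in each page and downloading them. It also supports wildcards for downloading FTP files in batches.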
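As a rough sketch of the two features just mentioned, the commands below mirror a site and fetch FTP files by wildcard. The addresses are placeholders invented for illustration, not hosts from this post, and the script only prints the commands so they can be reviewed first. The options used (-m, -k, -p, -np) all appear in the wget --help output.

```shell
# Placeholder addresses (assumptions for illustration only).
SITE_URL="http://example.com/docs/"
FTP_GLOB="ftp://ftp.example.com/pub/*.zip"

# Mirror a site:
#   -m  = --mirror           (recursive, with timestamping, unlimited depth)
#   -k  = --convert-links    (rewrite links to point at the local copies)
#   -p  = --page-requisites  (also fetch images etc. needed to show a page)
#   -np = --no-parent        (do not climb above the starting directory)
MIRROR_CMD="wget -m -k -p -np $SITE_URL"

# Batch-download FTP files with a wildcard. The URL is quoted so that
# wget, not the shell, expands the asterisk.
FTP_CMD="wget \"$FTP_GLOB\""

# Print the commands for review; run them once the addresses are real.
echo "$MIRROR_CMD"
echo "$FTP_CMD"
```

FTP wildcard matching is on by default in wget; the --no-glob option turns it off.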
Using wget to download the latest version of itself:
d:\>wget http://users.ugent.be/~bpuype/wget/wget.exe
--2009-10-29 11:04:40--  http://users.ugent.be/~bpuype/wget/wget.exe
Resolving users.ugent.be... 157.193.40.15
Connecting to users.ugent.be|157.193.40.15|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 401408 (392K) [application/x-msdos-program]
Saving to: `wget.exe'

100%[================================================>] 401,408     71.7K/s   in 6.5s

2009-10-29 11:04:48 (60.7 KB/s) - `wget.exe' saved [401408/401408]
wget has many options:
d:\>wget --help
GNU Wget 1.11.4, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version           display the version of Wget and exit.
  -h,  --help              print this help.
  -b,  --background        go to background after startup.
  -e,  --execute=COMMAND   execute a `.wgetrc'-style command.

Logging and input file:
  -o,  --output-file=FILE    log messages to FILE.
  -a,  --append-output=FILE  append messages to FILE.
  -d,  --debug               print lots of debugging information.
  -q,  --quiet               quiet (no output).
  -v,  --verbose             be verbose (this is the default).
  -nv, --no-verbose          turn off verboseness, without being quiet.
  -i,  --input-file=FILE     download URLs found in FILE.
  -F,  --force-html          treat input file as HTML.
  -B,  --base=URL            prepends URL to relative links in -F -i file.

Download:
  -t,  --tries=NUMBER            set number of retries to NUMBER (0 unlimits).
       --retry-connrefused       retry even if connection is refused.
  -O,  --output-document=FILE    write documents to FILE.
  -nc, --no-clobber              skip downloads that would download to existing files.
  -c,  --continue                resume getting a partially-downloaded file.
       --progress=TYPE           select progress gauge type.
  -N,  --timestamping            don't re-retrieve files unless newer than local.
  -S,  --server-response         print server response.
       --spider                  don't download anything.
  -T,  --timeout=SECONDS         set all timeout values to SECONDS.
       --dns-timeout=SECS        set the DNS lookup timeout to SECS.
       --connect-timeout=SECS    set the connect timeout to SECS.
       --read-timeout=SECS       set the read timeout to SECS.
  -w,  --wait=SECONDS            wait SECONDS between retrievals.
       --waitretry=SECONDS       wait 1..SECONDS between retries of a retrieval.
       --random-wait             wait from 0...2*WAIT secs between retrievals.
       --no-proxy                explicitly turn off proxy.
  -Q,  --quota=NUMBER            set retrieval quota to NUMBER.
       --bind-address=ADDRESS    bind to ADDRESS (hostname or IP) on local host.
       --limit-rate=RATE         limit download rate to RATE.
       --no-dns-cache            disable caching DNS lookups.
       --restrict-file-names=OS  restrict chars in file names to ones OS allows.
       --ignore-case             ignore case when matching files/directories.
       --user=USER               set both ftp and http user to USER.
       --password=PASS           set both ftp and http password to PASS.

Directories:
  -nd, --no-directories           don't create directories.
  -x,  --force-directories        force creation of directories.
  -nH, --no-host-directories      don't create host directories.
       --protocol-directories     use protocol name in directories.
  -P,  --directory-prefix=PREFIX  save files to PREFIX/...
       --cut-dirs=NUMBER          ignore NUMBER remote directory components.

HTTP options:
       --http-user=USER        set http user to USER.
       --http-password=PASS    set http password to PASS.
       --no-cache              disallow server-cached data.
  -E,  --html-extension        save HTML documents with `.html' extension.
       --ignore-length         ignore `Content-Length' header field.
       --header=STRING         insert STRING among the headers.
       --max-redirect          maximum redirections allowed per page.
       --proxy-user=USER       set USER as proxy username.
       --proxy-password=PASS   set PASS as proxy password.
       --referer=URL           include `Referer: URL' header in HTTP request.
       --save-headers          save the HTTP headers to file.
  -U,  --user-agent=AGENT      identify as AGENT instead of Wget/VERSION.
       --no-http-keep-alive    disable HTTP keep-alive (persistent connections).
       --no-cookies            don't use cookies.
       --load-cookies=FILE     load cookies from FILE before session.
       --save-cookies=FILE     save cookies to FILE after session.
       --keep-session-cookies  load and save session (non-permanent) cookies.
       --post-data=STRING      use the POST method; send STRING as the data.
       --post-file=FILE        use the POST method; send contents of FILE.
       --content-disposition   honor the Content-Disposition header when
                               choosing local file names (EXPERIMENTAL).
       --auth-no-challenge     Send Basic HTTP authentication information
                               without first waiting for the server's
                               challenge.

HTTPS (SSL/TLS) options:
       --secure-protocol=PR     choose secure protocol, one of auto, SSLv2, SSLv3, and TLSv1.
       --no-check-certificate   don't validate the server's certificate.
       --certificate=FILE       client certificate file.
       --certificate-type=TYPE  client certificate type, PEM or DER.
       --private-key=FILE       private key file.
       --private-key-type=TYPE  private key type, PEM or DER.
       --ca-certificate=FILE    file with the bundle of CA's.
       --ca-directory=DIR       directory where hash list of CA's is stored.
       --random-file=FILE       file with random data for seeding the SSL PRNG.
       --egd-file=FILE          file naming the EGD socket with random data.

FTP options:
       --ftp-user=USER         set ftp user to USER.
       --ftp-password=PASS     set ftp password to PASS.
       --no-remove-listing     don't remove `.listing' files.
       --no-glob               turn off FTP file name globbing.
       --no-passive-ftp        disable the "passive" transfer mode.
       --retr-symlinks         when recursing, get linked-to files (not dir).
       --preserve-permissions  preserve remote file permissions.

Recursive download:
  -r,  --recursive          specify recursive download.
  -l,  --level=NUMBER       maximum recursion depth (inf or 0 for infinite).
       --delete-after       delete files locally after downloading them.
  -k,  --convert-links      make links in downloaded HTML point to local files.
  -K,  --backup-converted   before converting file X, back up as X.orig.
  -m,  --mirror             shortcut for -N -r -l inf --no-remove-listing.
  -p,  --page-requisites    get all images, etc. needed to display HTML page.
       --strict-comments    turn on strict (SGML) handling of HTML comments.

Recursive accept/reject:
  -A,  --accept=LIST               comma-separated list of accepted extensions.
  -R,  --reject=LIST               comma-separated list of rejected extensions.
  -D,  --domains=LIST              comma-separated list of accepted domains.
       --exclude-domains=LIST      comma-separated list of rejected domains.
       --follow-ftp                follow FTP links from HTML documents.
       --follow-tags=LIST          comma-separated list of followed HTML tags.
       --ignore-tags=LIST          comma-separated list of ignored HTML tags.
  -H,  --span-hosts                go to foreign hosts when recursive.
  -L,  --relative                  follow relative links only.
  -I,  --include-directories=LIST  list of allowed directories.
  -X,  --exclude-directories=LIST  list of excluded directories.
  -np, --no-parent                 don't ascend to the parent directory.

Mail bug reports and suggestions to .
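Of the options listed above, the ones used most often in daily work are probably those for resuming and throttling a download. The sketch below combines a few of them. The URL, retry count, rate, and log file name are made-up values chosen only for illustration, and the command is printed rather than run.

```shell
# Placeholder URL (an assumption; substitute the file you actually want).
FILE_URL="http://example.com/big-file.iso"

# -c                 resume a partially downloaded file instead of restarting
# -t 5               retry up to 5 times
# --limit-rate=200k  cap the download rate at roughly 200 KB/s
# -o wget.log        write log messages to wget.log instead of the console
RESUME_CMD="wget -c -t 5 --limit-rate=200k -o wget.log $FILE_URL"

# Print the command for review before running it.
echo "$RESUME_CMD"
```

If the transfer is interrupted, running the same command again in the same directory lets -c pick up where the partial file ends, provided the server supports it.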
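elinks is a variant of links (a text-mode browser) that adds a number of enhancements. It is probably the best text browser available today. Its development is the most active among the current text browsers, and new versions appear regularly. Its website is:
At the time of writing, the stable release is 0.11.7. Version 0.11.6 can be downloaded from here:
After opening the archive, simply extract elinks.exe and the accompanying DLL files. So far I have not found a way to make it handle Chinese.
Its options are as follows: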
d:\>elinks --help
ELinks 0.11.6 (built on Apr 5 2009 12:34:38)

Usage: elinks [OPTION]... [URL]...

Options:
  -anonymous [0|1]     Restrict to anonymous mode
  -auto-submit [0|1]   Autosubmit first form
  -base-session        Clone internal session with given ID
  -config-dir          Name of directory with configuration file
  -config-dump         Print default configuration file to stdout
  -config-file         Name of configuration file
  -config-help         Print help for configuration options
  -default-mime-type   MIME type assumed for unknown document types
  -default-keys [0|1]  Ignore user-defined keybindings
  -dump [0|1]          Print formatted versions of given URLs to stdout
  -dump-charset        Codepage to use with -dump
  -dump-width          Width of document formatted with -dump
  -eval                Evaluate configuration file directive
  -force-html          Interpret documents of unknown types as HTML
  -?, -h, -help        Print usage help and exit
  -localhost [0|1]     Only permit local connections
  -long-help           Print detailed usage help and exit
  -lookup              Look up specified host
  -no-connect [0|1]    Run as separate instance
  -no-home [0|1]       Disable use of files in ~/.elinks
  -no-numbering        Disable link numbering in dump output
  -no-references       Disable printing of link references in dump output
  -remote              Control an already running ELinks
  -session-ring        Connect to session ring with given ID
  -source [0|1]        Print the source of given URLs to stdout
  -touch-files [0|1]   Touch files in ~/.elinks when running with -no-connect/-session-ring
  -verbose             Verbose level
  -version             Print version information and exit
Simply type the following on the command line:
elinks URL
and you can browse the site in text mode.
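Besides interactive browsing, the -dump and -source options from the help output make elinks usable in scripts. The following is a minimal sketch, assuming a placeholder URL and an arbitrary width of 100 columns; neither value comes from this post. The commands are only printed here so they can be checked before use.

```shell
# Placeholder URL (an assumption for illustration).
PAGE_URL="http://example.com/"

# -dump 1          print the formatted page to stdout instead of opening the UI
# -dump-width 100  format the dumped text 100 columns wide
DUMP_CMD="elinks -dump 1 -dump-width 100 $PAGE_URL"

# -source 1        print the raw page source to stdout
SOURCE_CMD="elinks -source 1 $PAGE_URL"

# Print the commands for review; when running them, redirect stdout
# to a file (for example "> page.txt") to save the result.
echo "$DUMP_CMD"
echo "$SOURCE_CMD"
```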