Tuesday, January 8, 2013

Netcat as a file downloader


     The netcat utility is a multi-purpose tool for managing and manipulating TCP/IP traffic.  In this article, we will see how netcat can be used as a file downloader.  This comes in handy when utilities like wget, fetch or curl are not installed on your machine.

     Netcat ( "nc" is the name of the binary ) can establish a TCP connection to any server/port combination and send or receive data over that connection.  To use it as a downloader, our strategy will be:
  • Establish a connection to the HTTP port of the server.
  • Send an HTTP request containing the download link over the established connection.
  • Redirect the HTTP response to a file ( which will be the downloaded file ).
     Let us try downloading the Apache httpd package from the URL  http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz
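
     Since nc needs the host and port while the request needs the location of the file, it can help to split the URL first.  The following is only a sketch: split_url is a hypothetical helper name, and it assumes a plain http:// URL without credentials or a query string.

```shell
# split_url URL
# Prints "host port path" for a plain http:// URL.
# Hypothetical helper for illustration; it is not part of netcat.
split_url() (
    url=${1#http://}                # drop the scheme
    hostport=${url%%/*}             # everything before the first slash
    case $url in
        */*) path=/${url#*/} ;;     # everything after the first slash
        *)   path=/ ;;              # bare host: ask for the root
    esac
    host=${hostport%%:*}
    case $hostport in
        *:*) port=${hostport##*:} ;;
        *)   port=80 ;;             # default HTTP port
    esac
    printf '%s %s %s\n' "$host" "$port" "$path"
)
```

     For example, split_url http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz prints the host, port 80 and the path /mirror/httpd/httpd-2.4.3.tar.gz on one line.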

     First, let us establish a TCP connection to port 80 of the server apache.techartifact.com.  The command for this is:

/bin/nc apache.techartifact.com 80

     Second, let us construct an HTTP request. This can be done in two ways: using HTTP version 1.0 or HTTP version 1.1.

     A generic HTTP request format consists of:
  • A request line ( which in turn contains the request method - "GET" for a download, the request URI - the absolute or relative download URL, and the protocol version - HTTP/1.0 or HTTP/1.1 )
  • Zero or more lines of HTTP headers ( each HTTP header is a single line containing a header name and a header value separated by a colon and a space )
  • An empty line
  • Message body ( empty for a GET request )
     Each of these lines is terminated by a Carriage Return ( \r ) followed by a Line Feed ( \n ).
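
     The format above can be sketched as a small shell function.  This is an illustration only: http_request is a made-up name, and it uses printf rather than echo -e because printf handles "\r\n" the same way in every shell and does not append an extra newline.

```shell
# http_request METHOD URI VERSION [HEADER...]
# Prints a request with CRLF line endings and the terminating empty line.
# http_request is an illustrative name, not a standard tool.
http_request() (
    method=$1 uri=$2 version=$3
    shift 3
    printf '%s %s %s\r\n' "$method" "$uri" "$version"   # request line
    for header in "$@"; do
        printf '%s\r\n' "$header"                        # one header per line
    done
    printf '\r\n'                                        # empty line ends the headers
)
```

     Piping the output through od -c, for example http_request GET /index.html HTTP/1.1 'Host: example.com' | od -c, shows the \r \n pairs explicitly ( /index.html and example.com are placeholders ).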

     Although an HTTP request can have many parts, a bare-minimum request needs only the following:
  • A request line ( for both HTTP/1.0 and HTTP/1.1 )
  • A Host header ( only for HTTP/1.1 , Format is - Host: web.server.name )
  • A blank line
     Each of these is terminated by a CR and an LF ( "\r" & "\n" ).
  • HTTP/1.0 request for our download URL is:  GET http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz HTTP/1.0\r\n\r\n
  • HTTP/1.1 request for our download URL is:  GET http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz HTTP/1.1\r\nHost: apache.techartifact.com\r\n\r\n
     When we send this request, the response from the server ( if everything goes well and the download starts ) begins with the status line "HTTP/1.1 200 OK", followed by several header lines, then a blank line ( which, read line by line, contains only "\r" ), and finally the response data ( which is the actual file to be downloaded ).  So while saving the response to a file we should strip off the HTTP header part ( all lines from "HTTP/1.1 200 OK" up to and including the "\r" line ).  This can be achieved with a simple sed command.
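
     The header stripping described above can be tried offline on a canned response before involving nc at all.  The two helpers below are a sketch with hypothetical names.  Note one difference from the approach in this article: the sed here deletes everything up to the first blank line whatever the status is, so the status code is checked separately.  It also builds the carriage return with printf, on the assumption that not every sed variant understands "\r" inside a pattern.

```shell
# Print the numeric status code from the first line of a response on stdin.
# http_status is a hypothetical helper name.
http_status() {
    awk 'NR == 1 { print $2; exit }'
}

# Drop everything up to and including the first blank ( CR-only ) line.
# strip_http_headers is a hypothetical helper name.
strip_http_headers() {
    cr=$(printf '\r')                 # literal carriage return
    LC_ALL=C sed "1,/^${cr}\$/d"      # C locale so binary data is treated as plain bytes
}
```

     A quick check: printf 'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\n' | strip_http_headers should leave only the body line.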

     To learn more about HTTP visit this link

     Let us try downloading the file with HTTP/1.0:

safeer@penguinepower:/tmp$ echo -e "GET http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz HTTP/1.0\r\n\r\n"|nc apache.techartifact.com 80|sed '/^HTTP\/1.. 200 OK\r$/,/^\r$/d' > httpd-2.4.3-with-http-1.0.tar.gz

     Now with HTTP/1.1

safeer@penguinepower:/tmp$ echo -e "GET http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz HTTP/1.1\r\nHost: apache.techartifact.com\r\n\r\n"|nc apache.techartifact.com 80 | sed '/^HTTP\/1.. 200 OK\r$/,/^\r$/d' > httpd-2.4.3-with-http-1.1.tar.gz
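
     One thing to keep in mind with HTTP/1.1 is that connections are persistent by default, so nc may sit idle after the file has arrived until the server times the connection out.  Sending a "Connection: close" header asks the server to close the connection once the response is complete.  The function below is a sketch of that variant; nc_get_11 is a hypothetical name, port 80 is assumed, and it uses a relative path in the request line instead of the full URL.

```shell
# nc_get_11 HOST PATH OUTFILE
# Hypothetical helper: HTTP/1.1 download over nc with "Connection: close".
nc_get_11() {
    cr=$(printf '\r')                                   # literal carriage return
    printf 'GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$2" "$1" |
        nc "$1" 80 |
        LC_ALL=C sed "1,/^${cr}\$/d" > "$3"             # strip the response headers
}
```

     Usage would look like: nc_get_11 apache.techartifact.com /mirror/httpd/httpd-2.4.3.tar.gz httpd-2.4.3.tar.gz.  If the server answers with "Transfer-Encoding: chunked", stripping the headers alone is not enough, because the body then contains chunk-size lines; an HTTP/1.0 request avoids that case.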

     Let us also download the file directly with the wget utility:

safeer@penguinepower:/tmp$ wget -q http://apache.techartifact.com/mirror/httpd/httpd-2.4.3.tar.gz -O httpd-2.4.3-with-wget.tar.gz
     Now compare all the files downloaded to ensure they are all the same.

safeer@penguinepower:/tmp$ du -bs httpd-2.4.3-with-*
6137268 httpd-2.4.3-with-http-1.0.tar.gz
6137268 httpd-2.4.3-with-http-1.1.tar.gz
6137268 httpd-2.4.3-with-wget.tar.gz

safeer@penguinepower:/tmp$ md5sum httpd-2.4.3-with-*
538dccd22dd18466fff3ec7948495417  httpd-2.4.3-with-http-1.0.tar.gz
538dccd22dd18466fff3ec7948495417  httpd-2.4.3-with-http-1.1.tar.gz
538dccd22dd18466fff3ec7948495417  httpd-2.4.3-with-wget.tar.gz


Let us ensure the integrity of the downloaded files by comparing their MD5 checksum with the value published on the Apache website:

safeer@penguinepower:/tmp$ curl -s http://www.apache.org/dist/httpd/httpd-2.4.3.tar.gz.md5
538dccd22dd18466fff3ec7948495417 *httpd-2.4.3.tar.gz

Everything looks good now.

Note: This command can download only from servers on which the file is actually located ( on the port and at the location given in the URL ).  I haven't tested the case where the download URL redirects you to the actual location of the file ( with an HTTP 302 response ), for example when the file sits behind a proxy.  That situation will need some more logic.
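
As a starting point for that extra logic, the sketch below reads a raw response and prints the value of the Location header when the status is 301 or 302.  It is untested against real redirecting servers; redirect_location is a hypothetical name, and following the redirect ( connecting to the new host and repeating the request ) is left out.

```shell
# redirect_location
# Reads a raw HTTP response on stdin. If the status is 301 or 302,
# prints the Location header value; otherwise prints nothing.
# Hypothetical helper for illustration.
redirect_location() {
    awk '
        NR == 1 { if ($2 != "301" && $2 != "302") exit }   # not a redirect
        { sub(/\r$/, "") }                                   # drop the trailing CR
        $0 == "" { exit }                                    # end of headers
        tolower($1) == "location:" { print $2; exit }
    '
}
```

The response would have to be saved with its headers intact ( that is, without the sed step ) and then fed to the function, for example: redirect_location < response.raw, where response.raw is a placeholder file name.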





2 comments:

  1. Thank you! Safeer, this is helpful article to download from indirect links. ;)

  2. Great post! Thanks for sharing!
