WGET and the OPTIONS Indexes directive - CodingForum


WGET and the OPTIONS Indexes directive


  • WGET and the OPTIONS Indexes directive

    So I just discovered wget and how powerful this tool potentially is. I would like to know how to safeguard against it, if that is at all possible. I am not really sure how it works; I just figured it out, and I am able to recursively download from a couple of my domains. I haven't tested it on my PHP code, just images, so I don't know what the server will actually send for a PHP file: the PHP source, or the HTML that the script outputs. Since it works over HTTP, I think it will just send the HTML markup, but I am not sure.

    Will denying Indexes with the Options directive safeguard against wget or do I have to do some more advanced configuration? Help here is appreciated.

  • #2
    Originally posted by JamesOxford
    Will denying Indexes with the Options directive safeguard against wget or do I have to do some more advanced configuration? Help here is appreciated.
    In general, unless you have an explicit need to list the files, you should disable indexing. Spiders can still crawl your pages to retrieve the images and files you use on them (wget can do this), but if you disable the indexes they can't get a list of everything in your folders and follow it recursively. They also can't see the source of your PHP files, because those are parsed by the server when they are requested. An exception would be if you named something .phps or gave it an extension that is not handled by Apache (like .phpbak, for example).

    To disable indexes for your site, put this in an .htaccess file in the document root:
    Code:
    Options -Indexes
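
    As a rough sketch of closing the extension loophole mentioned above, assuming Apache 2.4 and a host that allows AuthConfig overrides in .htaccess, something like this would refuse to serve those leftover files. The extension list is only an example, and on Apache 2.2 the equivalent would use Order/Deny rather than Require:
    Code:
    # Refuse requests for source or backup copies that would otherwise be sent as plain text
    <FilesMatch "\.(phps|phpbak)$">
        Require all denied
    </FilesMatch>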



    • #3
      Again, thanks for your help. If I disable indexes in an .htaccess file in the root directory, would I be able to override that in a sub-directory? There are a couple of places where indexes are convenient.

      In directories where I did want an index, would denying spiders in a robots.txt file and setting a valid-user requirement with basic authentication be sufficient to stop recursive downloads of the entire folder?



      • #4
        Originally posted by JamesOxford
        If I disable indexes in an .htaccess file in the root directory, would I be able to override that in a sub-directory?
        Yep.
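        For example, assuming the server's AllowOverride setting permits Options in .htaccess files, a sub-directory can turn listings back on with its own .htaccess (the directory name in the comment is hypothetical):
        Code:
        # /downloads/.htaccess: re-enable listings for this directory and everything below it
        Options +Indexes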
        Originally posted by JamesOxford
        In directories where I did want an index, would denying spiders in a robots.txt file and setting a valid-user requirement with basic authentication be sufficient to stop recursive downloads of the entire folder?
        No, not really. robots.txt is more of a suggestion and only well-behaved spiders will follow it. You may just end up making it easier for people to find the directories you don't want indexed... so they can index them. That is, if you are worried about bad robots to begin with.



        • #5
          At this point it is more of a hypothetical than a true concern. The basic authentication won't stop them? Won't they get a 404 redirect instead of a 200 OK if they tried to access the directory without authenticating?



          • #6
            Originally posted by JamesOxford
            The basic authentication won't stop them? Won't they get a 404 redirect instead of a 200 OK if they tried to access the directory without authenticating?
            Whoops, I didn't see that you were adding authentication. That should be sufficient to block recursive indexing. They will get a 401 Unauthorized if they can't authenticate.
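            A minimal sketch of what that directory's .htaccess might look like, assuming AllowOverride permits both AuthConfig and Options. The realm name and password-file path are placeholders, and the password file is best kept outside the document root:
            Code:
            # Keep the listing, but only for users who authenticate
            Options +Indexes
            AuthType Basic
            AuthName "Private files"
            AuthUserFile /path/to/.htpasswd
            Require valid-user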



            • #7
              404 redirect instead of a 200 OK if they tried to access the directory without authenticating?
              I meant 403 .

              Thanks again for all your help.

              BTW, how do I add the user I am quoting when I wrap text in {QUOTE}?



              • #8
                Originally posted by JamesOxford
                BTW, how do I add the user I am quoting when I wrap text in {QUOTE}?
                The easiest way is to hit the quote button at the bottom of the post, but you can use this format too:
                [QUOTE=JamesOxford]some text[/QUOTE]

