Urlprobe forgotten content


    TLDR: I made forks of gahttp and urlprobe to enable setting the host header for requests.

    I’ve made a pull request for gahttp, and if that comes through I’ll do one for urlprobe too. While that’s baking, you can get the Host-header-enhanced urlprobe here.

    What, you ask, made me do this?

    Well, I found an old sitemap.xml on a host I was bug hunting on. It had thousands of URLs pointing to a domain that, according to the DNS entries, the host I was looking at shouldn’t serve any more.

    I had a hunch that all that content was still served by the host I was looking at, even though the DNS records nowadays point somewhere else.

    So, having a long list of URLs to check, I thought urlprobe would do the trick. But I can’t just replace the host part of the URLs with an IP address; I have to set the Host header too. That way my probably orphaned target will be tricked into believing it’s still the caretaker of my in-scope domain.

    Since urlprobe couldn’t set the Host header, I started out by forking it to make the changes I wanted. I immediately found that the best place for the actual code change was in the gahttp library by tomnomnom, so off I went to do another fork.

    And here we are: if you have some old forgotten content you want to probe, use the newly added -h flag on urlprobe.

    It replaces the host part of all your source URLs with the given host and puts the host part from the original URLs into the Host header of your GET requests.
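
    To make the rewrite concrete, here is a minimal sketch of the same idea using only Go’s standard net/http package. It is not the code from my gahttp or urlprobe forks; the function name rewriteRequest, the example URL and the IP address are made up for illustration. The one part that is standard Go behaviour is that setting the Host field on an http.Request overrides the Host header that gets sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// rewriteRequest is a hypothetical helper (not part of gahttp or urlprobe).
// It builds a GET request that connects to targetHost but carries the host
// from the original URL in the Host header.
func rewriteRequest(rawURL, targetHost string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}

	// Remember the host the URL originally pointed at.
	originalHost := u.Host

	// Point the URL at the host we actually want to connect to.
	u.Host = targetHost

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}

	// req.Host overrides the Host header sent on the wire.
	req.Host = originalHost
	return req, nil
}

func main() {
	// Example values only: a made-up URL and a documentation-range IP.
	req, err := rewriteRequest("http://old.example.com/forgotten/page.html", "203.0.113.10")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("connect to:", req.URL.String())
	fmt.Println("Host header:", req.Host)
}
```

    Passing the resulting request to http.DefaultClient.Do would send it to the replaced host. Note that for https URLs the TLS handshake is still done against the replaced host, so certificate checks may well fail; handling that is outside this sketch.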