Obtain a single website URL from a bunch of HTTP packets?


I'm a newbie in network programming, so please forgive any mistakes.

I'm writing a simple sniffer that should detect only the URLs of websites requested by the user. I'm using pcap.net, and I'm able to capture HTTP packets (with a TCP port 80 filter) and retrieve data from them. What I can't do is get the single URI of the request that caused many HTTP packets to arrive.

For example: 1. A user requests www.website.com from a browser. 2. Many HTTP responses arrive, one of which is the text/html response for www.website.com. 3. www.website.com embeds resources hosted elsewhere, so many other packets arrive from other hosts.

Is there a way to ignore the packets that belong to those resources? Do I have to do some kind of TCP session reconstruction? I've been googling for 2 days but couldn't find anything useful, so please help.


There is 1 best solution below


HTTP responses from other hosts can usually be identified because they come from different IP addresses than the one the original request was sent to.

You can match HTTP requests and responses even without full TCP reconstruction, just by looking at the source and destination IP addresses and TCP ports of each packet.
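As a rough illustration of that matching, here is a minimal Python sketch (not pcap.net code). The `Packet` record and the helpers `flow_key`, `request_url` and `match_responses` are hypothetical names made up for this example; a real sniffer would fill the fields from the IP and TCP headers it has already parsed. It assumes one outstanding request per connection and that each request's headers fit in a single packet.

```python
from collections import namedtuple

# Hypothetical minimal packet record; a real sniffer would fill these
# fields from the IP and TCP headers of each captured packet.
Packet = namedtuple("Packet", "src_ip src_port dst_ip dst_port payload")


def flow_key(pkt):
    """Return a direction-independent key for the TCP connection."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    return (a, b) if a <= b else (b, a)


def request_url(payload):
    """Return http://host/path if the payload starts an HTTP request, else None."""
    head = payload.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    # A request line looks like "GET /path HTTP/1.1".
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
    return "http://" + host + parts[1]


def match_responses(packets):
    """Pair each response status line with the URL requested on the same flow."""
    pending = {}   # flow key -> URL of the last request seen on that flow
    matches = []
    for pkt in packets:
        key = flow_key(pkt)
        url = request_url(pkt.payload)
        if url is not None:
            pending[key] = url
        elif pkt.payload.startswith(b"HTTP/") and key in pending:
            status = pkt.payload.split(b"\r\n", 1)[0].decode("latin-1")
            matches.append((pending.pop(key), status))
    return matches
```

Because the key is built from both endpoints, a response travelling server-to-client maps to the same entry as the request that went client-to-server, so responses from other hosts never get attributed to the original request.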

However, if multiple HTTP requests are sent over the same TCP session, you will need to do TCP reconstruction to separate the different requests and responses.
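To give an idea of what that reconstruction involves, here is a simplified Python sketch for one direction of a connection. The function names `reassemble` and `split_requests` are hypothetical, and the code assumes no sequence-number wraparound, no lost segments, and requests without bodies (plain GETs); real reassembly has to handle all of those.

```python
def reassemble(segments):
    """Rebuild one direction of a TCP stream from (seq, payload) pairs.

    Simplifying assumptions: no sequence-number wraparound, no gaps,
    and retransmissions only repeat data that was already seen.
    """
    stream = b""
    next_seq = None
    for seq, data in sorted(segments, key=lambda s: s[0]):
        if next_seq is None:
            next_seq = seq
        if seq + len(data) <= next_seq:
            continue                       # pure retransmission, skip it
        overlap = max(0, next_seq - seq)   # bytes we already have
        stream += data[overlap:]
        next_seq = seq + len(data)
    return stream


def split_requests(stream):
    """Return the request line of each body-less HTTP request in the stream."""
    request_lines = []
    for block in stream.split(b"\r\n\r\n"):
        if block:
            request_lines.append(block.split(b"\r\n", 1)[0].decode("latin-1"))
    return request_lines
```

Once the client-to-server bytes are back in order, each request on the connection can be read off in sequence, and the responses in the other direction can be paired with them in the same order.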