The HTTP metric offers a synthetic, high-level notion of a page: a set of HTTP documents fetched by the same user and combined by their browser into a single object. Reconstructing pages from the actual packets involves an unusually high number of operations and thus deserves a detailed description.
HTTP Specific Glossary
Although you do not need them to use Cisco Provider Connectivity Assurance Sensors (previously SkyLIGHT PVX), the following definitions are necessary to understand the rest of this description:
- HTTP message: as defined by RFC, it is an HTTP header optionally followed by a body. Sniffing gives us some of the headers, the relevant timestamps, sizes, and so on. We may not see everything, but the beginning of the header is mandatory in order to recognize an HTTP message.
- HTTP query: HTTP message with a command (GET, POST, HEAD, etc.) and a URL.
- HTTP response: HTTP message with a response code (sometimes called status code or status)
- HTTP hit or transaction: HTTP query with optionally its associated HTTP response (note: a response with no associated query is ignored for this metric).
- User: the HTTP client software (a browser or any other client) that sent the query under consideration. It is identified by its IP address and user-agent field.
- Page: set of transactions that are supposed to be perceived as a single query implying a single delay for the user. Notice how subjective this definition is. The intent is to include, in a single page, all the hits required for a typical browser to display enough content for the typical user to think their query is fulfilled. For websites or browsers that delay the download of content until it becomes visible, or for websites that display intermediary content, the only objective is to behave in a way that’s understandable.
- Root (of a page): the transaction that triggers other transactions for the same page, either directly or indirectly. We’d like it to be the first chronologically but that’s not necessarily the case due to mirroring.
From packets to HTTP messages
The sniffer receives fragments of HTTP messages. It starts to reconstruct a new HTTP message as soon as it receives the start of a header. Some fragments of the message may be missing, though, in which case it may be incapable of:
- Associating a body fragment to the proper HTTP message, thus leading to erroneous payloads and dubious chronology;
- Saving part of the content in HTTP save files (without notice);
- Reporting the timestamp of message end.
From individual messages to transactions
HTTP offers no better way to associate a response with its corresponding query than to rely on ordering: the first response on the socket goes with the first query, and so on.
So, for every socket, the sniffer stores all queries not already paired with a response. Notice that on a socket, a proxy may mix queries of different users, and that two interconnected proxies may even mix queries to distinct servers.
Notice also how damaging a single dropped packet may be if it hides a query or a full response to the sniffer, since all pairing following this gap will be questionable.
Also, servers may not respond, leading to a timeout of the pending queries (which will be inserted in the database without any response).
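The ordering-based pairing described above can be sketched as a per-socket FIFO queue. This is only an illustration under stated assumptions, not the sensor's actual code: the class `SocketPairer`, its method names, and the 30-second default timeout are all hypothetical.

```python
from collections import defaultdict, deque


class SocketPairer:
    """Pairs HTTP responses with queries by arrival order on each socket.

    Hypothetical sketch: names and the default timeout are assumptions.
    """

    def __init__(self, timeout=30.0):
        self.timeout = timeout             # seconds before a pending query expires
        self.pending = defaultdict(deque)  # socket id -> queue of (query, timestamp)

    def on_query(self, socket_id, query, ts):
        # Store every query that is not yet paired with a response.
        self.pending[socket_id].append((query, ts))

    def on_response(self, socket_id, response):
        queue = self.pending.get(socket_id)
        if not queue:
            return None  # a response with no associated query is ignored
        query, _ = queue.popleft()  # first response goes with first pending query
        return (query, response)

    def expire(self, now):
        """Return timed-out queries as transactions without a response."""
        expired = []
        for queue in self.pending.values():
            while queue and now - queue[0][1] > self.timeout:
                query, _ = queue.popleft()
                expired.append((query, None))
        return expired
```

With this scheme, if the sniffer misses the query for `/a` but sees the query for `/b`, the response to `/a` is wrongly paired with `/b`, and every later pairing on that socket is shifted in the same way.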
From transactions to pages
Since all transactions of a page are necessarily emitted by the same user, every transaction is associated with its user, in chronological order (from here on, time and the HTTP “Referer” header, called the referrer field below, are our two best tools). Because a page routinely involves transactions on several sockets, and because different sockets are reassembled by different TCP parsers that deliver segments at different paces, the HTTP metric may reconstruct a transaction A before a transaction B even if B happened, and was received by the probe, before A (for instance, if the reassembly of A’s socket was delayed by a missing frame). When this happens, the referrer relation between A and B may not be honored.
We do not wait for the pairing with a response to attach a query to the page it belongs to. When we attach a new query to a client, we look for the referrer of this transaction within the ones that are already attached to this client (in case the referrer field is absent, we use the same kind of referrer cache as found in KSniffer). If the referred page is itself attached to another page, two behaviors are possible:
- We detach it, thus turning the referrer into the root of a new page, or
- We follow the chain of attachment and attach the new transaction to the parent page.
Note that the first behavior is possible only when the content-type of the referred page does not prevent it (i.e., is not typically reserved to non-root transactions, such as image, CSS, and other typically embedded content).
You can choose between these two behaviors with the http-detach-referred parameter.
The second behavior (keep referred transactions attached) is better when iframes are involved, but it is believed that the first (and default) one generally leads to better results. Other than iframes, the only observed case where a referred transaction was obviously not a page root was an AJAX request continuously POSTing to the same URL as its referrer, each request thus detaching its predecessor.
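The two attachment behaviors can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the `Transaction` class, the `attach` function, its `detach_referred` argument (standing in for the http-detach-referred parameter), and the short list of embedded content types. The referrer cache used when the referrer field is absent is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed examples of content types reserved to non-root transactions.
EMBEDDED_PREFIXES = ("image/", "text/css")


@dataclass(eq=False)
class Transaction:
    url: str
    referrer: Optional[str] = None
    content_type: Optional[str] = None      # unknown until the response arrives
    parent: Optional["Transaction"] = None  # None means "this is a page root"

    def root(self):
        # Follow the chain of attachment up to the page root.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def may_be_root(tx):
    return not (tx.content_type or "").startswith(EMBEDDED_PREFIXES)


def attach(new_tx, client_txs, detach_referred=True):
    """Attach new_tx to a page among this client's transactions; return its root."""
    referred = next(
        (t for t in reversed(client_txs) if t.url == new_tx.referrer), None
    )
    if referred is not None:
        if referred.parent is not None and detach_referred and may_be_root(referred):
            referred.parent = None  # first behavior: the referrer becomes a new root
        new_tx.parent = referred    # root() then resolves the owning page
    client_txs.append(new_tx)
    return new_tx.root()
```

With `detach_referred=True`, an image referring to an iframe document turns that document into its own page; with `detach_referred=False`, the image ends up in the page that contains the iframe.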
If or when we eventually receive the response of a transaction (and, hopefully, its content-type), we revise our judgment on the attachment. If the transaction seems to have not been triggered by AJAX, and its content-type is indicative of a standalone document (PDF, PS or HTML with status 200), then we detach it (turning it into a root). Otherwise, if the content-type is not indicative of a typically embedded content (image, CSS, etc.), we check the delay between the page root and this transaction and if found greater than a parameter (http-page-construction-max-delay), then it is detached as well.
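The revision step performed when the response arrives can be read as a small decision function. The sketch below is an interpretation, not the product's implementation: the function name, the MIME type lists, and the reading that status 200 is required for all standalone types are assumptions, and `max_delay` stands in for http-page-construction-max-delay.

```python
# Assumed MIME types for "standalone" and "typically embedded" content.
STANDALONE_TYPES = ("text/html", "application/pdf", "application/postscript")
EMBEDDED_PREFIXES = ("image/", "text/css")


def should_detach(content_type, status, is_ajax, delay_from_root, max_delay):
    """Decide whether an attached transaction should become a page root.

    delay_from_root and max_delay are expressed in seconds.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    ct = (content_type or "").split(";")[0].strip().lower()

    # Standalone document not triggered by AJAX: detach it.
    if not is_ajax and status == 200 and ct in STANDALONE_TYPES:
        return True

    # Otherwise, non-embedded content arriving too long after the root: detach it.
    if not ct.startswith(EMBEDDED_PREFIXES) and delay_from_root > max_delay:
        return True

    return False
```

For instance, an image received a minute after the root stays attached, whereas a JSON response received after the same delay is detached when `max_delay` is 10 seconds.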
To speed up information retrieval, some global per-page values are precomputed in the sniffer: every transaction attached to a page contributes to the page values, provided it was received less than http-page-contribution-max-delay seconds after the root. All of these transactions contribute to the page load time.
To be able to dump a root transaction with all of these counters, we must, of course, delay the dump of roots as late as possible, thus raising memory requirements.
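As an illustration of how such per-page counters might be accumulated, the sketch below computes a page load time and a hit count. It assumes that the page load time runs from the start of the root to the end of the last contributing transaction, which is one plausible reading rather than a documented formula. The function name is hypothetical, and `max_delay` stands in for http-page-contribution-max-delay.

```python
def page_counters(root, attached, max_delay):
    """Return (page_load_time, hit_count) for one page.

    root:      (start_ts, end_ts) of the root transaction, in seconds
    attached:  list of (start_ts, end_ts) for the other attached transactions
    max_delay: contribution window after the root start, in seconds
    """
    root_start, page_end = root
    hits = 1  # the root itself always counts
    for start_ts, end_ts in attached:
        # Only transactions received soon enough after the root contribute.
        if start_ts - root_start < max_delay:
            hits += 1
            page_end = max(page_end, end_ts)
    return page_end - root_start, hits
```

Because any late transaction inside the window can still extend the result, the counters are final only once the window has elapsed, which is why the root has to be kept in memory until then.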
Protections
To limit memory and CPU usage, the sniffer implements these protections:
- Page reconstruction is only active for some IP addresses and TCP ports (client or server). See the HTTP flag in the zone and application definition. A transaction that neither comes from nor goes to one of these IP addresses will not be attached to a root transaction. It will be inserted in the database but excluded from the page list.
- The total number of simultaneously tracked and remembered HTTP transactions is limited by http-max-tracked (unlimited by default). New transactions above this will be ignored (with catastrophic consequences to transaction pairing).
- The total number of simultaneously tracked and remembered HTTP transactions for which we want page reconstruction is limited by http-max-tracked-for-reconstruction (unlimited by default).
- Max size of HTTP save file is limited by http-max-content-size (50k by default).
- The memory dedicated to the referrer cache is limited by http-referrer-mem.
Limitations
Page load time is the most interesting metric, yet we have seen that many conditions must be met to accurately reconstruct pages.
- The process is very sensitive to missing TCP fragments (retransmitted fragments cause no problem, but fragments that are not mirrored to the probe do);
- The bigger the proxies, the less reliable client isolation will be;
- Some heuristics regarding AJAX, content types and timing do not necessarily match your sites;
- Some clients may successfully hide the referrer (or worse, we may guess the wrong referrer);
- HTTP analysis may consume more resources than what’s available (or configured);
- Any small inaccuracy in HTTP message reassembly or in transaction pairing will lead to highly inaccurate page load times.
© 2024 Cisco and/or its affiliates. All rights reserved.