September 19, 2013

Is it possible to use a web accelerator to make websites faster?

Web accelerators were once an important tool, but they have fallen out of fashion. Why? There's no doubt the web today could use a performance boost. I grind my teeth every time I wait for some blog post to display. Could accelerators help here?

The trouble with all the old accelerators is that they were designed to solve the last-mile problem. Fortunately, the last mile is no longer the bottleneck. Downlink speeds of 30 Mbps and above are common in cities. The bottleneck has shifted into application servers like WordPress or phpBB, and the old accelerators cannot help there.

The last mile is still a bit of a problem in mobile networks and in rural areas. Browsing through the offerings of existing web accelerators, it seems that this is exactly the market they are aiming for. All the old web accelerators have transformed into mobile Internet accelerators. They are solving problems of the past though: mobile Internet is getting faster and more ubiquitous every year.

One could avoid visiting the slow servers, but that's only possible with frequently visited sites that have plenty of alternatives. Unfortunately it doesn't help with the majority of pages that I visit through search. These pages, mostly small blogs, are usually visited only once, and they host unique content that cannot be found elsewhere.

Yelling at webmasters to make their sites faster isn't going to help. They have no incentive to do so, and individually optimizing every single site on the Internet is insanely expensive. It will never happen. I have to find some way to make these sites faster without changing them and without looking for alternatives.

A network of proxies could virtually eliminate round-trip times: one proxy close to the client, one near the server. Transmissions within the proxy network are highly optimized and virtually latency-free, and the latency between a proxy and its client or server endpoint is negligible. Unfortunately this kind of accelerator doesn't exist. It's just an idea in my head. And that's despite the fact that such a network could be built pretty cheaply with just a few dozen servers around the world.
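
Since this accelerator exists only in my head, the best I can offer is a sketch of what one edge node might look like. The minimal Python sketch below assumes a hypothetical peer proxy at far-proxy.example.net; the trick is a pool of pre-established long-haul connections, so a fresh client connection never waits for a cross-continent TCP handshake.

```python
import queue
import socket
import threading
import time

# Hypothetical peer: the far-end proxy sitting close to the origin server.
PEER_HOST, PEER_PORT = "far-proxy.example.net", 9000
LISTEN_PORT = 8000
POOL_SIZE = 4

pool = queue.Queue()  # pre-established connections to the peer proxy

def refill_pool():
    """Keep a few long-haul connections warm at all times."""
    while True:
        if pool.qsize() < POOL_SIZE:
            pool.put(socket.create_connection((PEER_HOST, PEER_PORT)))
        else:
            time.sleep(0.1)

def pipe(src, dst):
    """Copy bytes one way until the source side closes."""
    while data := src.recv(65536):
        dst.sendall(data)
    dst.close()

def handle(client):
    # The long-haul handshake was already paid by refill_pool(),
    # so the client only ever sees the short local hop.
    upstream = pool.get()
    threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()
    pipe(client, upstream)

if __name__ == "__main__":
    threading.Thread(target=refill_pool, daemon=True).start()
    with socket.create_server(("", LISTEN_PORT)) as server:
        while True:
            conn, _ = server.accept()
            threading.Thread(target=handle, args=(conn,), daemon=True).start()
```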

While the latencies caused by the TCP handshake and the initial congestion window do slow down sites substantially, they are still quite small compared to processing time in application servers. The only way to make these application servers appear faster is to fetch content from them ahead of time.
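
To get a feel for the proportions, here's a back-of-envelope comparison. All the numbers are illustrative assumptions, not measurements:

```python
import math

rtt = 0.100             # assume a 100 ms round trip to a distant server
page_size = 100 * 1024  # assume a 100 KB page
initcwnd = 10 * 1460    # ~14.6 KB initial congestion window (10 segments)
server_time = 1.5       # assume a slow blog engine takes 1.5 s to render

handshake = rtt  # the TCP three-way handshake costs one round trip
# Slow start roughly doubles the window every RTT, so after n round
# trips about initcwnd * (2**n - 1) bytes have been delivered.
transfer = math.ceil(math.log2(page_size / initcwnd + 1)) * rtt

print(f"handshake + transfer: {handshake + transfer:.1f} s")
print(f"server processing:    {server_time:.1f} s")
```

Half a second lost to the network versus a second and a half inside the server: shaving round trips helps, but it doesn't touch the dominant cost.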

There are client-side prefetch solutions, but they aren't very intelligent. The problem is that it isn't really possible to reliably predict a user's decisions, and prefetching everything is unreasonably expensive. Not to mention that the little time a prefetch mechanism gets before I click on a search result is not enough to overcome delays in application servers.
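
For illustration, the naive client-side approach boils down to something like the sketch below (the function and its parameters are made up): fire off speculative fetches for the likely next pages and keep whatever finishes before the click. Anything slower than the short prefetch window is simply lost.

```python
import concurrent.futures
import urllib.request

def prefetch(urls, window=2.0):
    """Speculatively fetch candidate pages; keep whatever finishes in time."""
    cache = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
        futures = {pool.submit(urllib.request.urlopen, url, timeout=window): url
                   for url in urls}
        try:
            for fut in concurrent.futures.as_completed(futures, timeout=window):
                try:
                    cache[futures[fut]] = fut.result().read()
                except OSError:
                    pass  # broken page: nothing to keep
        except concurrent.futures.TimeoutError:
            pass  # pages slower than the prefetch window miss out
    return cache
```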

A web-wide caching service could do the job. It would prefetch every single blog post on the Internet and store it in a vast global cache. Clients would then get the cached copy, perhaps with a subsequent refresh in case the cache is found to be stale after the fact. Since most of the content is still simple text, recent improvements in hardware performance make this kind of cache increasingly feasible. Unfortunately nobody has tried to build it yet.
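
Here's a minimal sketch of the serve-stale-then-refresh idea, with a placeholder fetch_from_origin() standing in for the slow application server and a plain dict standing in for what would really be a distributed store:

```python
import threading
import time
import urllib.request

TTL = 3600   # one hour; an arbitrary guess, since most pages give no caching hints
cache = {}   # url -> (fetch_time, body)

def fetch_from_origin(url):
    """The slow part: ask the actual application server for the page."""
    return urllib.request.urlopen(url, timeout=10).read()

def refresh(url):
    cache[url] = (time.time(), fetch_from_origin(url))

def get(url):
    entry = cache.get(url)
    if entry is None:
        refresh(url)   # cold miss: someone pays the full price once
        return cache[url][1]
    fetched, body = entry
    if time.time() - fetched > TTL:
        # Serve the stale copy immediately and refresh in the background,
        # so the next visitor gets a fresh page without waiting.
        threading.Thread(target=refresh, args=(url,), daemon=True).start()
    return body
```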

Sure, there are issues with such a global cache. Most pages don't provide caching hints. There is a lot of personalized, constantly changing, and encrypted content. There are tons of CSS, JS, and other files that must be downloaded before a page can be displayed. There are many spam pages, database pages, and other low-value content. And the web popularity curve has a long tail with very many pages that get only a couple of visits per year. Yet I believe clever algorithms can be built to get around all these problems. The cache would still have to be very large. I keep wondering how big it would have to be to achieve at least a 50% hit rate.
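
A toy estimate is possible if one assumes page popularity follows a Zipf law, which is a common model for web traffic; the exponent, page count, and page size below are pure guesses on my part:

```python
# For Zipf exponent s < 1, sum(k**-s for k in 1..n) is approximately
# n**(1-s) / (1-s), so the top m of N pages cover a fraction
# (m/N)**(1-s) of all requests.
N = 1_000_000_000       # assumed number of distinct cacheable pages
s = 0.8                 # assumed Zipf exponent
target_hit_rate = 0.5

m = N * target_hit_rate ** (1 / (1 - s))   # solve (m/N)**(1-s) = target
avg_page = 100 * 1024                      # assume ~100 KB of text and markup

print(f"cache the top {m:,.0f} pages, roughly {m * avg_page / 1e12:.1f} TB")
```

Even with these hand-wavy numbers the answer lands in the single-digit terabyte range, which is why I think the idea is within reach.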
