AMP

Introducing Cloudflare AMP Real URL

Developer Experience, Signed Exchanges

The following post was written by Zack Bloom, Director of Product and John Graham-Cumming, CTO at Cloudflare

Join us for a free webinar this week where we will cover how Cloudflare is helping its customers drive better results with AMP, and increasing conversions.

The promise of the AMP project was that it would make the web, and, in particular, the mobile web, much more pleasant to surf. The AMP framework was designed to make web pages load quickly, and place the focus on the user experience.

Initially, it was particularly aimed at publishers (such as news organizations) that wanted to provide the best, fastest web experience for readers catching up on news stories and in-depth articles while on the move. It later became valuable for any site which values their mobile performance including e-commerce stores, job boards, and media sites.

As well as the AMP HTML framework, AMP also made use of caches that store copies of AMP content close to end users so that they load as quickly as possible. Although this cache makes loading web pages much, much faster they introduce a problem: An AMP page served from Google’s cache has a URL starting with https://google.com/amp/. This can be confusing for end users.

Users have become used to looking at the navigation bar in a web browser to see what web site they are visiting. The AMP cache breaks that experience. By serving the page from Google’s cache they may be confused by the ‘google.com’ URL they see in the address bar.

Last November we at Cloudflare announced a technical solution to these problems that would allow AMP pages to be served from a cache while retaining the original page URL and all its benefits. The in-depth technical blog post by Gabbi Fisher and Avery Harnish gives the full details. The solution makes use of Web Packaging (which incorporates some clever use of cryptography) to allow the cache (run by Google, Cloudflare or others) to keep a copy of an AMP page and serve it quickly to the end user, but to also contain cryptographic proof of where the page originally came from.

In cooperation with a browser that understands Web Packaging, this means that a page can be stored in an AMP cache and served quickly from it while showing the original site URL in the browser’s navigation bar. A major win all round!

We’re calling this “AMP Real URL” and it’s free to all Cloudflare customers.

How It Works

Google’s AMP Crawler downloads the content of your website and stores it in the AMP Cache many times a day. If your site has AMP Real URL enabled Cloudflare will digitally sign the content we provide to that crawler, cryptographically proving it was generated by you. That signature is all a modern browser (currently just Chrome on Android) needs to show the correct URL in the address bar when a visitor arrives at your AMP content from Google’s search results.

More importantly, your site is still being served from Google’s AMP cache just as before; all of this comes without any cost to your SEO or web performance.

Want to Learn More?

We will be hosting a webinar on Wednesday, June 19th at 10:00 am PDT “Building AMP Experiences that Drive Results” that will feature a broader understanding of AMP architecture and how AMP Real URL leverages Cloudflare’s serverless platforms to deliver signed exchanges. U.S. Xpress will also share insights and results that they’ve experienced having deployed Cloudflare AMP Real URL for their AMP content.

Register today!

Signed-Exchange: Solving the AMP URLs Display Problem

Signed Exchanges

Editor’s note: the below was originally posted on Medium by Sarang Khanna, Software Engineer, OYO

For an AMP Page at URL example.com/awesome-amp-page, your SEO users are probably seeing this — google.com/amp/s/example.com/awesome-amp-page. At OYO, we solved the much dreaded AMP-Cache URL display problem for good, without losing any AMP benefits!

As a fast and responsive web pattern, AMP has taken over the web. But a major concern about AMP pages is that whenever users land on the page from a Google search, the displayed URL is never the publisher’s original URL. This has been a topic of both discussion and concern till now. Signed-Exchange is something which can help you show your own domain in AMP page URLs, with all the AMP-Cache capabilities intact.

AMP URLs can now be the page’s original domain (notice the address-bar)

TLDR;

Prelude AMP enjoys caching benefits on Google’s Search pages. Opening an AMP page from Google Search Results (the ones with the symbol) shows a “pre-fetched” version of the page coming from Google’s own cache. Hence, it opens with lightning-fast speed, as there is absolutelyno document-fetch done over the network upon clicking. (Read: OYO’s starting steps with AMP)

The issue However, since the content is fetched from Google’s Cache Server rather than the Publisher’s actual server, the URL shown to the users for that page starts with something like google.com/amp/s/,and is not the page’s actual URL.

Before the Fix: An AMP page opened from Google Search Results

This leaves chances of doubt about the authenticity of the page’s content for the end-user. However, now there is a way to solve this issue and show the page’s actual URL, even when it’s served from Google Cache.

The solution 

The solution resides in implementing Signed Exchange — which basically lets you “allow” third-parties (like AMP-Cache) to be able to serve your page’s content from their servers but the browser can show your domain and URL in the address bar.

What is Signed Exchange and how can it help? 

Signed Exchange (or “SXG”) is an emerging technology which provides a way to prove the authenticity of a web document. This can be used to determine a page’s original publisher, no matter where the document is served from.

A publisher can “sign” an HTTP request-response pair with their domain’s certificate. Thus generated signed-exchange can be served to browsers, similar to web pages! With it, the browser can safely show the publisher’s URL in the address bar because the signature proves that the content originally came from the publisher’s origin.

Simply put,

Signed-Exchange = A Web Page + Certificate of its Origin

This can allow us to improve AMP URLs. The idea here is to serve a signed-exchange for Google’s AMP crawler to cache, instead of a web document, when your AMP page is crawled.

Subsequently, now when a user will select your page from Google Search, the cached signed-exchange will be served from the AMP-Cache to the client’s browser . It will allow the browser to show the actual page URL (even though the page came from a Google Cache server).

(How HTTP Signed Exchange works to deliver better AMP URLs)

See it in action. 

Google has announced the support for Signed Exchanges in Chrome and has adopted it for the benefit of AMP on it’s search result pages. The current browser support is Chrome 73 and above.

So go ahead on Google, search an OYO city and look for the lighting-symboled AMP pages. Keep an eye on the address bar to see the magic of SXGs.

The following section explains how we enabled Signed-Exchange to have actual URLs displayed for the AMP pages being served from Google…

Implementing Signed-Exchange to show your own domain in AMP URLs. 

  1. Get a Digital Certificate for your domain
  2. Set-up a packager (signer) for your AMP pages (See deploy-amppackager-aws for help!)
  3. Proxy your AMP pages through the packager before serving them
  4. Profit!

1. Get a Digital Certificate for your domain

For signing the HTTP request-responses for AMP pages, you need to get a Digital Certificate issued for your domain.
Note: The private key must be ECDSA, curve secp256r1. And the certificate should have the CanSignHttpExchanges extension enabled for production use.

Psst! For testing purposes, you can use free certificates or self-signed certificates. For production use, DigiCert issues the certificate with the needed extension.

For reference, here is how to generate an EC P-256 key and a Certificate Signing Request (CSR) which you can submit to CA for signing:

## To generate a new EC P-256 private key
openssl ecparam -genkey -name secp256r1 | openssl ec -out privkey.pem
## Generate a CSR using the key:
openssl req -new -sha256 -key privkey.pem -nodes -out ec.csr -outform pem

Give the CSR to a CA who will issue you a new certificate for the private key. For SXGs, you will need the privkey.pem and the issued Certificate Chain, fullchain.pem.

2. Set-up a packager (signer) for your AMP pages

Next, you need to run a “Packager” server, something which will sign (or, “package”) the required pages using your certificate and it’s private key.

Thankfully, the AMP team has come-up with the amppackager tool, a Golang server for this purpose which anybody can use with minimal configurations (GitHub repo).

All you need to do is fetch the tool, provide your configurations (the path to your certificate, your domain name and the URLs to sign) and run it. On local machine, you can try it out with a demo request and it should serve a signed-exchange for your page!

AMPPackager Basics: By default, amppackager listens on port 8080. Suppose it is running on localhost:8080, it serves the signed-exchanges on URLs of this format:

localhost:8080/priv/doc/<Document's URL>

For example, the signed-exchange for https://www.oyorooms.com/hotels-in-delhi/ (AMP Page) will be served at: 
localhost:8080/priv/doc/https://www.oyorooms.com/hotels-in-delhi/

It serves other resources like certificate information on the URLs of this format:

localhost:8080/amppkg/<Path to Resource>

Note: Productionising the amppackager tool is still very vague.
We have open-sourced a boilerplate deployer (GitHub: deploy-amppackager-aws), which can be used to deploy amppackager on AWS Elastic Beanstalk! Use it for reference on how we streamlined the flow and put it to good use on OYO production.

3. Proxy your AMP pages through the packager before serving them

Upto this point, we have an amppackager server running to sign our AMP page requests. Time to look into which pages need to go through the packager and under what conditions.

Let’s address the exact requirements here (for the amppackager tool):

  • All requests starting with /amppkg/ path should be forwarded to the amppackager server unmodified.
  • For AMP page requests having the AMP-Cache-Transform header present (Google Crawler will have it, to indicate that it accepts SXGs), rewrite the URL to prepend /priv/doc and forward the request to the amppackager.
  • While serving AMP pages, set the Vary response-header to AMP-Cache-Transform, Accept. (Google Crawler will look for this header before asking an SXG from you)

Here is an example for a frontend Nginx proxy with the amppackager running on an upstream server amp_pkgr:

###
### AMP Packager resources:


location /amppkg {
# to amppackager, unmodified
proxy_pass https://amp_pkgr;
}



###
### Pages which are AMP:


location ~* ^/(.*)/amp {


# check for the "AMP-Cache-Transform" header
# (which Google crawler requests will have)
# to selectively pass this request to amppacakger or
# your usual web server
if ( $http_AMP_Cache_Transform ) {
rewrite /(.*) /priv/doc/https://$host/$1 break;
proxy_pass
https://amp_pkgr;
break;
}



# add "Vary" response-header for normally served AMP pages
# (responses from amppackager will already have the header)
add_header Vary 'AMP-Cache-Transform, Accept';


# proxy pass to your usual AMP pages' server
proxy_pass https://website_server;


}

Troubleshooting

  • It may take Google’s Crawler some days to cache your AMP pages as Signed-Exchanges.
  • Avoid proxying non-AMP pages to amppackager.
  • Don’t expose amp_pkg to the outside world; keep it within your internal network.
  • Set up your amppackager server to be scalable as well as secure. It is a machine that will probably hold your certificate and private key too.

Links to infinity and beyond 

Hope you’ll enjoy the journey from google.com/amp/s to yourdomain.com as much as we did. Thanks for reading… Have a cookie!