
Signed-Exchange: Solving the AMP URLs Display Problem

Editor’s note: the below was originally posted on Medium by Sarang Khanna, Software Engineer, OYO

For an AMP page at the URL example.com/awesome-amp-page, users arriving from search are probably seeing this instead: google.com/amp/s/example.com/awesome-amp-page. At OYO, we solved the much-dreaded AMP-Cache URL display problem for good, without losing any AMP benefits!

As a fast and responsive web pattern, AMP has taken over the web. But a major concern about AMP pages is that whenever users land on the page from a Google search, the displayed URL is never the publisher’s original URL. This has long been a topic of both discussion and concern. Signed-Exchange can help you show your own domain in AMP page URLs, with all the AMP-Cache capabilities intact.

AMP URLs can now be the page’s original domain (notice the address-bar)

TL;DR 🔍

Prelude

AMP enjoys caching benefits on Google’s Search pages. Opening an AMP page from Google Search results (the ones with the ⚡ symbol) shows a “pre-fetched” version of the page coming from Google’s own cache. Hence, it opens with lightning-fast speed, as there is absolutely no document fetch over the network upon clicking. (Read: OYO’s starting steps with AMP)

The issue

However, since the content is fetched from Google’s Cache Server rather than the publisher’s actual server, the URL shown to users for that page starts with something like google.com/amp/s/, and is not the page’s actual URL.

Before the Fix: An AMP page opened from Google Search Results

This can leave end users doubting the authenticity of the page’s content. However, there is now a way to solve this issue and show the page’s actual URL, even when it’s served from the Google Cache.

The solution 

The solution is to implement Signed Exchange, which lets you “allow” third parties (like the AMP-Cache) to serve your page’s content from their servers while the browser still shows your domain and URL in the address bar.

What is Signed Exchange and how can it help? 💡

Signed Exchange (or “SXG”) is an emerging technology which provides a way to prove the authenticity of a web document. This can be used to determine a page’s original publisher, no matter where the document is served from.

A publisher can “sign” an HTTP request-response pair with their domain’s certificate. The signed-exchange generated this way can be served to browsers, just like a web page! With it, the browser can safely show the publisher’s URL in the address bar because the signature proves that the content originally came from the publisher’s origin.

Simply put,

Signed-Exchange = A Web Page + Certificate of its Origin

This can allow us to improve AMP URLs. The idea here is to serve a signed-exchange for Google’s AMP crawler to cache, instead of a web document, when your AMP page is crawled.

Then, when a user selects your page from Google Search, the cached signed-exchange is served from the AMP-Cache to the client’s browser. This allows the browser to show the actual page URL (even though the page came from a Google Cache server).

(How HTTP Signed Exchange works to deliver better AMP URLs)

See it in action. 🎬

Google has announced support for Signed Exchanges in Chrome and has adopted it for the benefit of AMP on its search result pages. Browser support currently starts at Chrome 73.

So go ahead to Google, search for an OYO city and look for the AMP pages marked with the lightning symbol. Keep an eye on the address bar to see the magic of SXGs.

The following section explains how we enabled Signed-Exchange to have actual URLs displayed for the AMP pages being served from Google…

Implementing Signed-Exchange to show your own domain in AMP URLs. 🏆

  1. Get a Digital Certificate for your domain
  2. Set up a packager (signer) for your AMP pages (See deploy-amppackager-aws for help!)
  3. Proxy your AMP pages through the packager before serving them
  4. Profit! 🎉

1. Get a Digital Certificate for your domain

To sign the HTTP request-responses for AMP pages, you need a Digital Certificate issued for your domain.
Note: The private key must be ECDSA on curve secp256r1, and for production use the certificate must have the CanSignHttpExchanges extension enabled.

Psst! For testing purposes, you can use free certificates or self-signed certificates. For production use, DigiCert issues the certificate with the needed extension.

For reference, here is how to generate an EC P-256 key and a Certificate Signing Request (CSR) which you can submit to a CA for signing:

## To generate a new EC P-256 private key
openssl ecparam -genkey -name secp256r1 | openssl ec -out privkey.pem
## Generate a CSR using the key:
openssl req -new -sha256 -key privkey.pem -nodes -out ec.csr -outform pem

Give the CSR to a CA who will issue you a new certificate for the private key. For SXGs, you will need the privkey.pem and the issued Certificate Chain, fullchain.pem.

2. Set up a packager (signer) for your AMP pages

Next, you need to run a “Packager” server, which will sign (or “package”) the required pages using your certificate and its private key.

Thankfully, the AMP team has come up with the amppackager tool, a Golang server for this purpose which anybody can use with minimal configuration (GitHub repo).

All you need to do is fetch the tool, provide your configuration (the path to your certificate, your domain name and the URLs to sign) and run it. On your local machine, you can try it out with a demo request and it should serve a signed-exchange for your page!

AMPPackager Basics: By default, amppackager listens on port 8080. If it is running on localhost:8080, it serves the signed-exchanges at URLs of this format:

localhost:8080/priv/doc/<Document's URL>

For example, the signed-exchange for https://www.oyorooms.com/hotels-in-delhi/ (AMP Page) will be served at: 
localhost:8080/priv/doc/https://www.oyorooms.com/hotels-in-delhi/

It serves other resources, like certificate information, at URLs of this format:

localhost:8080/amppkg/<Path to Resource>
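To try the demo request from a script, the two URL formats above can be built with a small helper. This is a minimal Python sketch, not part of amppackager: the function names and the `PACKAGER` base address are made up for illustration, and the header values (`application/signed-exchange;v=b3` for `Accept`, `google;v="1..100"` for `AMP-Cache-Transform`) are assumptions about what the crawler sends, so check the amppackager README for the values your version expects.

```python
from urllib.request import Request

# Hypothetical base address: amppackager on its default port, local machine.
PACKAGER = "http://localhost:8080"


def sxg_url(document_url: str, packager: str = PACKAGER) -> str:
    """URL at which the packager serves the signed-exchange for a document."""
    return f"{packager}/priv/doc/{document_url}"


def resource_url(resource_path: str, packager: str = PACKAGER) -> str:
    """URL for packager-owned resources such as certificate information."""
    return f"{packager}/amppkg/{resource_path.lstrip('/')}"


def demo_request(document_url: str) -> Request:
    """Build (but do not send) a request that asks for a signed-exchange."""
    return Request(
        sxg_url(document_url),
        headers={
            # Assumed values; adjust to what your packager version expects.
            "Accept": "application/signed-exchange;v=b3",
            "AMP-Cache-Transform": 'google;v="1..100"',
        },
    )


req = demo_request("https://www.oyorooms.com/hotels-in-delhi/")
print(req.full_url)
```

With the packager running locally, passing `req` to `urllib.request.urlopen` should return the signed-exchange rather than plain HTML.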

Note: How to productionise the amppackager tool is still not well documented.
We have open-sourced a boilerplate deployer (GitHub: deploy-amppackager-aws), which can be used to deploy amppackager on AWS Elastic Beanstalk! Use it as a reference for how we streamlined the flow and put it to good use in production at OYO.

3. Proxy your AMP pages through the packager before serving them

Up to this point, we have an amppackager server running to sign our AMP page requests. Time to look into which pages need to go through the packager and under what conditions.

Let’s address the exact requirements here (for the amppackager tool):

  • All requests whose path starts with /amppkg/ should be forwarded to the amppackager server unmodified.
  • For AMP page requests that carry the AMP-Cache-Transform header (the Google Crawler sends it to indicate that it accepts SXGs), rewrite the URL to prepend /priv/doc and forward the request to the amppackager.
  • While serving AMP pages, set the Vary response header to AMP-Cache-Transform, Accept. (The Google Crawler looks for this header before requesting an SXG from you.)
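The three rules above amount to a small routing decision, sketched here in Python purely to make the logic explicit. It is not how amppackager or any proxy implements it: the `route` function is hypothetical, and the upstream names and the `/amp` path pattern are borrowed from the Nginx example in this post and will differ on your site.

```python
import re
from typing import Dict, Mapping, Tuple

PACKAGER = "amp_pkgr"        # upstream running amppackager (assumed name)
WEBSITE = "website_server"   # your usual web server (assumed name)

# Assumed convention: AMP pages live at paths like /<something>/amp
AMP_PATH = re.compile(r"^/(.*)/amp", re.IGNORECASE)


def route(path: str, headers: Mapping[str, str], host: str) -> Tuple[str, str, Dict[str, str]]:
    """Return (upstream, forwarded path, extra response headers)."""
    # Rule 1: packager resources are forwarded unmodified.
    if path.startswith("/amppkg/"):
        return PACKAGER, path, {}

    if AMP_PATH.match(path):
        wants_sxg = any(
            name.lower() == "amp-cache-transform" and value
            for name, value in headers.items()
        )
        # Rule 2: crawler accepts SXGs, so prepend /priv/doc and the full URL.
        if wants_sxg:
            return PACKAGER, f"/priv/doc/https://{host}{path}", {}
        # Rule 3: a normally served AMP page advertises the Vary header.
        return WEBSITE, path, {"Vary": "AMP-Cache-Transform, Accept"}

    # Anything else is not an AMP page and never reaches the packager.
    return WEBSITE, path, {}


print(route("/hotels-in-delhi/amp", {"AMP-Cache-Transform": 'google;v="1..100"'}, "www.oyorooms.com"))
```

Keeping the non-AMP fallthrough explicit makes it harder to proxy ordinary pages to the packager by accident.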

Here is an example for a frontend Nginx proxy with the amppackager running on an upstream server amp_pkgr:

###
### AMP Packager resources:
location /amppkg {
    # to amppackager, unmodified
    proxy_pass https://amp_pkgr;
}

###
### Pages which are AMP:
location ~* ^/(.*)/amp {
    # check for the "AMP-Cache-Transform" header
    # (which Google crawler requests will have)
    # to selectively pass this request to amppackager or
    # your usual web server
    if ($http_amp_cache_transform) {
        rewrite /(.*) /priv/doc/https://$host/$1 break;
        proxy_pass https://amp_pkgr;
        break;
    }

    # add "Vary" response-header for normally served AMP pages
    # (responses from amppackager will already have the header)
    add_header Vary 'AMP-Cache-Transform, Accept';

    # proxy pass to your usual AMP pages' server
    proxy_pass https://website_server;
}
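Once the proxy is live, one quick sanity check is to confirm that a normally served AMP page really carries the expected Vary header. The helper below is a hypothetical Python sketch (the name `vary_allows_sxg` is made up here); it only inspects a header value that you have already fetched, for example with `curl -I`.

```python
def vary_allows_sxg(vary_header: str) -> bool:
    """True if a Vary header value lists both AMP-Cache-Transform and Accept."""
    # Header field names are case-insensitive, so compare in lower case.
    fields = {field.strip().lower() for field in vary_header.split(",")}
    return {"amp-cache-transform", "accept"} <= fields


print(vary_allows_sxg("AMP-Cache-Transform, Accept"))
print(vary_allows_sxg("Accept-Encoding"))
```

If your web server already sets its own Vary value (such as Accept-Encoding), check that the combined header seen by clients still lists both fields.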

Troubleshooting 💣

  • It may take Google’s Crawler some days to cache your AMP pages as Signed-Exchanges.
  • Avoid proxying non-AMP pages to amppackager.
  • Don’t expose the amp_pkgr upstream to the outside world; keep it within your internal network.
  • Set up your amppackager server to be scalable as well as secure. It is a machine that will probably hold your certificate and private key too.

Links to infinity and beyond 🎯

Hope you’ll enjoy the journey from google.com/amp/s to yourdomain.com as much as we did. Thanks for reading… Have a cookie! 🍪