X-Google-Cache-Control URL Fetch Response Header

Google caches the results of URL Fetch requests so subsequent requests can be supplied from the cache, thereby speeding up the request and your application in general.

This can be troublesome though, especially if an application is accessing a web page that changes quickly; URL Fetch may return stale results without the application knowing it. Fortunately, there’s a way to detect whether or not the page was retrieved from a Google cache server.

For all URL Fetch requests from the production App Engine servers, the X-Google-Cache-Control header is added to URL Fetch responses. If the header has a value of remote-fetch, then the fetch retrieved a fresh copy of the page. If the value is remote-cache-hit, then the page was retrieved from Google’s cache and may have stale data.

Here’s what the header looks like if it’s a cache hit:

X-Google-Cache-Control: remote-cache-hit

While a freshly retrieved page will have this header:

X-Google-Cache-Control: remote-fetch
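
As a rough sketch, here’s how an application might check for a cache hit in Java. The class and method names (CacheStatus, isCacheHit, wasServedFromCache) are made up for illustration, and the sketch assumes the fetch goes through a java.net.HttpURLConnection, whose getHeaderField method returns null when a header is absent:

```java
import java.net.HttpURLConnection;

final class CacheStatus {
    // Header name and values as described in the post above.
    static final String HEADER = "X-Google-Cache-Control";
    static final String CACHE_HIT = "remote-cache-hit";

    /** Returns true only if the header value reports a cache hit. */
    static boolean isCacheHit(String headerValue) {
        return headerValue != null && headerValue.trim().equalsIgnoreCase(CACHE_HIT);
    }

    /** Reads the header from the response; a missing header counts as "not a cache hit". */
    static boolean wasServedFromCache(HttpURLConnection connection) {
        return isCacheHit(connection.getHeaderField(HEADER));
    }
}
```

If wasServedFromCache returns true, the application can decide for itself whether possibly stale data is acceptable.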

Missing User Agent For Development Server URL Fetches

A quick note: the App Engine development server doesn’t add a User-Agent header to URL Fetch requests.

As I noted in a previous post, the App Engine production environment automatically adds a User-Agent (listed below) to all URL Fetch requests. If you set a custom user agent, App Engine appends the text below to your custom header.

AppEngine-Google; (+http://code.google.com/appengine; appid: YOUR_APPLICATION_ID_HERE)

However, the development server doesn’t add this header automatically. If you set a custom User-Agent header, that’s all that will be sent, with no other identifying information. If you don’t set a user agent, URL fetches from the development server will not carry any user agent information.

This can be an issue while developing applications on the dev server; some APIs require this header, and will refuse to respond or will heavily rate-limit requests if it is missing. For instance, the NewsBlur API requires a user agent header for all requests. If the request doesn’t contain a user agent header, the API will refuse the request even if it’s authenticated.

Always set a custom user agent header that accurately describes your application on all URL fetch requests. If your application does a lot of URL fetches to the same API/server, it may be a good idea to list your email address or a web page with more information about your application.
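
Here’s one way this advice might look in code. It’s only a sketch: the UserAgents class, its method names, and the "(+contact)" layout are my own assumptions (the layout simply mirrors the style of App Engine’s own identifier), not something App Engine requires.

```java
import java.net.HttpURLConnection;

final class UserAgents {
    /** Builds a user agent such as "Example Crawler (+http://example.com/about)". */
    static String describe(String appName, String contact) {
        return appName + " (+" + contact + ")";
    }

    /** Sets the descriptive user agent; must be called before the connection is opened. */
    static void apply(HttpURLConnection connection, String appName, String contact) {
        connection.setRequestProperty("User-Agent", describe(appName, contact));
    }
}
```

Because the header is set explicitly, the request carries a user agent on the development server as well as in production.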

Error Parsing YAML File: While Scanning A Simple Key

App Engine uses the app.yaml file to route incoming requests to the appropriate handlers. It’s important to write proper YAML in this file; otherwise your application may behave erratically or not run at all.

One common problem with YAML files is failing to properly separate key:value pairs. The YAML specification requires a colon (:) followed by a space between the key and the associated value. Here’s an example of a properly formatted YAML key:value pair:

Key: Value

Now here’s an example of a broken app.yaml file:

application: an-example-application-id
version: 1
runtime: php
api_version: 1
threadsafe:true

Notice the error? The threadsafe property has a colon, but no space separating the key (threadsafe) and the value (true). appcfg refuses to upload this broken file, failing with an "error parsing YAML file: while scanning a simple key" message.

If you receive this error, make sure that every key in your YAML file is followed by a colon and a space. One space is enough; don’t use tabs or multiple spaces.
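
If you want a quick pre-upload check, something like the sketch below can flag suspicious lines. To be clear, this is a rough heuristic and not a YAML parser: YamlColonCheck and suspectLines are names I made up, and the pattern only looks for a simple key at the start of a line whose colon is immediately followed by a non-space character, so it can raise false alarms on unusual files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class YamlColonCheck {
    // Optional indentation, optional "- " list marker, a simple key, then a colon
    // that is immediately followed by a non-whitespace character.
    private static final Pattern MISSING_SPACE =
            Pattern.compile("^\\s*(?:-\\s+)?[A-Za-z_][A-Za-z0-9_-]*:\\S.*$");

    /** Returns the 1-based numbers of lines that look like "key:value" with no space. */
    static List<Integer> suspectLines(List<String> lines) {
        List<Integer> suspects = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (MISSING_SPACE.matcher(lines.get(i)).matches()) {
                suspects.add(i + 1);
            }
        }
        return suspects;
    }
}
```

Run against the broken app.yaml above, only the threadsafe:true line (line 5) is flagged.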

Static File Referenced By Handler Not Found

The error "static file referenced by handler not found" is usually caused by a mistake in an application’s app.yaml. Basically, it means that one of the static file handlers in app.yaml points to a file that doesn’t exist or is named incorrectly.

Here’s a simple example. Suppose an application maps favicon.ico in this manner:

- url: /favicon.ico
  static_files: static/favicon.ico
  upload: static/favicon.ico

This handler statement says that the application has a folder named static, which holds a file named favicon.ico. But it maps the file so it looks like it’s located at the root of the application: example-id.appspot.com/favicon.ico. Now if the folder static doesn’t exist, or the file is missing, then attempting to access the file via the web will cause this error, which is recorded in the App Engine logs.

To fix, review the handlers section of app.yaml and make sure that the referenced files exist within the application folder.
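
The review can be partly automated. The sketch below is an illustration under some assumptions: StaticFileCheck and its missing method are hypothetical names, and you supply the list of paths by hand (it doesn’t parse app.yaml, and it only handles literal paths, not pattern-based handlers).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class StaticFileCheck {
    /** Returns the relative paths that do not exist as regular files under appRoot. */
    static List<String> missing(Path appRoot, List<String> relativePaths) {
        List<String> missing = new ArrayList<>();
        for (String relative : relativePaths) {
            if (!Files.isRegularFile(appRoot.resolve(relative))) {
                missing.add(relative);
            }
        }
        return missing;
    }
}
```

For the handler above, you would pass the application folder and a list containing static/favicon.ico; an empty result means the file is where the handler expects it.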

Bulk Adding Headers To A URL Fetch Request

A quick code example: how to easily add headers to a URL Fetch request.

First, create a java.util.Hashtable:

Hashtable<String, String> request_headers;
request_headers = new Hashtable<String, String>();

Put the headers you want into this hashtable. The keys and values of this hashtable will become the header names and values in the fetch request.

When you’re configuring the URL Fetch request, use the code below to add in all the headers:

Enumeration<String> set_header_keys = request_headers.keys();
while (set_header_keys.hasMoreElements()) {
    String key = set_header_keys.nextElement();
    String value = request_headers.get(key);
    connection.setRequestProperty(key, value);
}

The connection variable represents a java.net.HttpURLConnection object.
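
For comparison, here’s a self-contained variant of the same idea written against the java.util.Map interface. The HeaderUtil class and applyHeaders method are names I picked for this sketch. Since Hashtable implements Map, the request_headers table from above can be passed in unchanged.

```java
import java.net.URLConnection;
import java.util.Map;

final class HeaderUtil {
    /**
     * Copies every entry of headers onto the connection as a request header.
     * Must be called before the connection is opened; setRequestProperty
     * throws IllegalStateException on an already-connected connection.
     */
    static void applyHeaders(URLConnection connection, Map<String, String> headers) {
        for (Map.Entry<String, String> header : headers.entrySet()) {
            connection.setRequestProperty(header.getKey(), header.getValue());
        }
    }
}
```

Iterating over entrySet() avoids the separate get(key) lookup for each header.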

URLFetch User Agent

When an application makes a URL Fetch request, App Engine adds the following text as the User-Agent header:

AppEngine-Google; (+http://code.google.com/appengine; appid: YOUR_APPLICATION_ID_HERE)

Even if the application sets a custom user agent header, App Engine will append the above text to the header.

This can be annoying because there are some servers and services that rate limit based on the user agent. If there is a human reviewing the request logs, it can be confusing to see a stream of largely-identical user agent strings.

It’s good practice to set a descriptive user agent for all URL fetches. It’s even better if you can write your user agent with App Engine’s required text in mind. For instance, consider writing user agent headers like this one: App Engine Example Crawler hosted by. When App Engine appends its required text to the end of this, the receiving server will see a user agent of:

App Engine Example Crawler hosted by AppEngine-Google; (+http://code.google.com/appengine; appid: YOUR_APPLICATION_ID_HERE)

This user agent header looks cleaner, neater, and is easier for a human to understand.

Here is the above in code form:

String user_agent = "App Engine Example Crawler";
user_agent += " hosted by ";//After this, GAE will append the identifier.
connection.setRequestProperty("User-Agent", user_agent);

The connection variable represents a java.net.HttpURLConnection object.

Setting The Content Type In Java

In most cases, servlets are used to generate and serve HTML pages. However, servlets can serve any type of data, including images, plain text, PDFs, spreadsheets, JavaScript code, etc. To do this, the servlet must declare the type of content being served to the web browser. This declaration is called the content type or the media type of the file.

Here’s how to set the content type of a servlet’s response. The variable resp represents a javax.servlet.http.HttpServletResponse object:

resp.setContentType(content_type);

Put the appropriate content type in the content_type variable. Some common content types are text/html (HTML pages), text/plain (plain text), application/javascript (JS code), application/vnd.ms-excel (Excel spreadsheets), image/jpeg (JPEG images), and application/pdf (PDFs); the list goes on and on. If you need to figure out the appropriate content type for your data, look it up on the Wikipedia Internet Media Types list.
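
If a servlet serves several kinds of files, a small lookup table can pick the content type. This is just a sketch: the ContentTypes class and forFileName method are hypothetical, the extension mapping is my own choice, and falling back to application/octet-stream (the generic binary type) for unknown extensions is an assumption you may want to change.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class ContentTypes {
    private static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        // The content types listed in the post above, keyed by a typical file extension.
        BY_EXTENSION.put("html", "text/html");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("js", "application/javascript");
        BY_EXTENSION.put("xls", "application/vnd.ms-excel");
        BY_EXTENSION.put("jpg", "image/jpeg");
        BY_EXTENSION.put("jpeg", "image/jpeg");
        BY_EXTENSION.put("pdf", "application/pdf");
    }

    /** Looks up a content type by file extension, falling back to a generic binary type. */
    static String forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String extension = (dot < 0) ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(extension, "application/octet-stream");
    }
}
```

Inside a servlet, the result would be passed straight to the response, for example resp.setContentType(ContentTypes.forFileName("report.pdf")).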

Logging API Example

Here’s an example of how to use the Logging API to inspect an application’s logs. This function extracts all logs dated within the last hour, averages the execution time of all requests, and records how many requests resulted in errors (in other words recorded a FATAL level log report).

The String this function returns will look like this:

In the last 1 hour, requests took an average of 
451255 microseconds (451 ms) to execute. 0 errors reported.

This function is entirely self-contained; you don’t need to pass in any variables or set any global variables. However, it needs to be run within an App Engine environment, either production or development.

public String doLogExam() {
    /**
     * For this example, we'll look into this application's logs, pull out the last few logs, 
     * and calculate the average length of servlet requests. 
     * We'll also examine each line of the log and see if there are any errors reported.
     */
    //Access the log service, pull out logs, and grab an Iterator over the logs.
    LogService log = LogServiceFactory.getLogService();
    //Get all logs starting from 1 hour ago.
    long end_time = (new java.util.Date()).getTime();
    long start_time = end_time - (1000 * 60 * 60 * 1);
    LogQuery q = LogQuery.Builder.withDefaults().includeAppLogs(true).startTimeMillis(start_time).endTimeMillis(end_time);
    Iterator<RequestLogs> log_iterator = log.fetch(q).iterator();
    //Holds the sum of the execution time of all HTTP requests; we'll divide this by the 
    //num_of_logs_counted to get the average execution time.
    long execution_time_microseconds_sum = 0;
    //Number of log lines that are reporting errors.
    int num_of_error_log_lines = 0;
    //Number of logs that we pulled out of the LogService
    int num_of_logs_counted = 0;
    //Iterate over each log.
    while (log_iterator.hasNext()) {
        //Each request_log represents a single HTTP request.
        RequestLogs request_log = log_iterator.next();
        num_of_logs_counted++;
        //Retrieve the execution time of this request, and add it to our sum variable.
        long execution_time_microseconds = request_log.getLatencyUsec();
        execution_time_microseconds_sum = execution_time_microseconds_sum + execution_time_microseconds;
        //Pull out any lines in this request log, and examine them to see 
        //if they report an error.
        List<AppLogLine> log_line_list = request_log.getAppLogLines();
        for (int i = 0; i < log_line_list.size(); i++) {
            AppLogLine app_log_line = log_line_list.get(i);
            LogService.LogLevel line_level = app_log_line.getLogLevel();
            //If this log line's reporting level is classified as fatal 
            //(causing the request to fail), record it.
            if (LogService.LogLevel.FATAL.equals(line_level)) {
                num_of_error_log_lines++;
            }
        }//end looping through each line of the request log
    }//End looping through each request log
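    //Guard against division by zero when no requests were logged in the last hour.
    long avg_execution_time_microsec = (num_of_logs_counted == 0) ? 0 : (execution_time_microseconds_sum / num_of_logs_counted);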
    long avg_execution_time_millisec = avg_execution_time_microsec / 1000;
    String comment_text = "In the last 1 hour, requests took an average of ";
    comment_text += avg_execution_time_microsec + " microseconds (" + avg_execution_time_millisec;
    comment_text += " ms) to execute. " + num_of_error_log_lines + " errors reported.";
    return comment_text;
}

Remember to import the following classes:

import java.util.Iterator;
import java.util.List;
import com.google.appengine.api.log.AppLogLine;
import com.google.appengine.api.log.LogQuery;
import com.google.appengine.api.log.LogService;
import com.google.appengine.api.log.LogServiceFactory;
import com.google.appengine.api.log.RequestLogs;

Checking For A Twitter Follow

This brief code example checks to see whether user_to_check (a Twitter username as a String) is following the Twitter account twitter_username. The boolean is_user_following will be true if there is a follow.

The twitter object is a twitter4j.Twitter instance preconfigured with authentication details.

Relationship to_other_user = twitter.showFriendship(twitter_username, user_to_check);
boolean is_user_following = to_other_user.isTargetFollowingSource();