Delete Old Entities – Java Datastore

This is an ultra-simplified example of how to delete old entities from the App Engine Datastore. The first 3 lines of code retrieves the current date, then subtracts 60 days from the current time (the multiplication converts days to milliseconds). DATE_PROPERTY_ON_ENTITY is the date property on the entity – when first writing the entity to the datastore, add the current date as a property. ENTITY_KIND is the entity kind we’re deleting.

		//Calculate 60 days ago.
		long current_date_long = (new Date()).getTime();
		long past_date_long = current_date_long - (1000 * 60 * 60 * 24 * 60);
		Date past_date = new Date(past_date_long);
		
		DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
		Query.Filter date_filter = new Query.FilterPredicate("DATE_PROPERTY_ON_ENTITY", Query.FilterOperator.LESS_THAN_OR_EQUAL, past_date);
		Query date_query = new Query("ENTITY_KIND").setFilter(date_filter);
		PreparedQuery date_query_results = datastore.prepare(date_query);
		
		Iterator<Entity> iterate_over_old_entities = date_query_results.asIterator();
		
		while (iterate_over_old_entities.hasNext()) {
			Entity old_entity = iterate_over_old_entities.next();
			
			System.out.println("Deleting: " + old_entity.getProperties());
			
			datastore.delete(old_entity.getKey());
		}

Note that is a simplified function – it’s useful if you have a handful of entities that need deleting, but if you have more than a handful, you should convert to using datastore cursors and paging through entities to delete.

PHP Post To PubSub

Today is a rather large fragment demonstrating how to post to Google PubSub. While there are libraries to handle this, I prefer to understand the low-level process so debugging is easier.

Note that this fragment is designed to run on App Engine, as it relies on the App Identity service to pull the credentials required to publish to PubSub. You only need to set up 3 variables: $message_data, which should be a JSON-encodable object, NAMEOFGOOGLEPROJECT, which is the name of the Google project containing the pubsub funnel you want to publish to, and NAMEOFPUBSUB which is the pubsub funnel name.

It isn’t required, but it is good practice to customize the User-Agent header below. I have it set to Publisher, but a production service should have it set to an appropriate custom name.

use google\appengine\api\app_identity\AppIdentityService;

//Build JSON object to post to Pubsub

$message_data_string = base64_encode(json_encode($message_data));

$single_message_attributes = array ("key" => "iana.org/language_tag",
    "value" => "en",
);

$single_message = array ("attributes" => $single_message_attributes,
    "data" => $message_data_string,
);
$messages = array ("messages" => $single_message);

//Post to Pubsub

$url = 'https://pubsub.googleapis.com/v1/projects/NAMEOFGOOGLEPROJECT/topics/NAMEOFPUBSUB:publish';

$pubsub_data = json_encode($messages);

syslog(LOG_INFO, "Pubsub Message: " . $pubsub_data);

$access_token = AppIdentityService::getAccessToken('https://www.googleapis.com/auth/pubsub');

$headers = "accept: */*\r\n" .
    "Content-Type: text/json\r\n" .
    "User-Agent: Publisher\r\n" .
    "Authorization: OAuth " . $access_token['access_token'] . "\r\n" .
    "Custom-Header-Two: custom-value-2\r\n";

$context = [
    'http' => [
        'method' => 'POST',
        'header' => $headers,
        'content' => $pubsub_data,
    ]
];
$context = stream_context_create($context);
$result = file_get_contents($url, false, $context);

syslog(LOG_INFO, "Returning from PubSub: " . $result);

Google Doodle

Today’s Google Doodle celebrates the 57th birthday of “Crocodile Hunter” Steve Irwin, who was a famous wildlife conservationist, zookeeper, and TV personality.

This is what the Google homepage looked like with the doodle:

Clicking on it goes to a slideshow showcasing many aspects of Steve Irwin’s life.

Many other organizations are also taking the opportunity of celebrating Steve’s life, such as Animal Planet on Twitter:

Serializing A Java Object To Google Cloud Storage – Java

A quick code example today: serializing a Java object to Google Cloud storage. write_object stands for the object being written. This code depends on the App Engine libraries for Java, and the Google Cloud Storage libraries.

	GcsService gcs_service = GcsServiceFactory.createGcsService();
	GcsFilename gcs_filename = new GcsFilename(AppIdentityServiceFactory.getAppIdentityService().getDefaultGcsBucketName(), "subfolder_within_bucket" + "/" + "filename.extension");
	GcsFileOptions gcs_options = GcsFileOptions.getDefaultInstance();
	GcsOutputChannel output = gcs_service.createOrReplace(gcs_filename, gcs_options);
	ObjectOutputStream oos = new ObjectOutputStream(Channels.newOutputStream(output));
	oos.writeObject(write_object);
	oos.flush();
	oos.close();
	output.close();

Python SQLite Table Creation Template

I often use the Python sqlite3 module: it helps save time during development as it’s a lightweight SQL engine. Even in production, some small applications can get away with running SQLite instead of a more normal SQL application.

To create a table in sqlite:

import sqlite3
def create_table():
    create_table_sql = """CREATE TABLE tweets (id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
     posted_date DATETIME, tweet_text VARCHAR(300),
     user VARCHAR(20), retweet_count int, favorite_count int,
     original_tweet_id VARCHAR(20)
     original_user VARCHAR(20));"""
    conn = sqlite3.connect("example.db")
    c = conn.cursor()
    c.execute(create_table_sql)
    conn.commit()
    conn.close()

And to execute operations against the created table, you simply need to connect to example.db and run c.execute:

# Execute into sqlite
conn = sqlite3.connect("example.db")
c = conn.cursor()

Amazon Error Pages

As I’ve said before, I love documenting error pages from popular web sites: they often have a sense of humor or show off another face of the company.

Here’s an example of an Amazon error page. What a handsome looking dog!

An Amazon error page.

Listening To Music Offline With Google Drive

Do you need to transfer text/music/pictures from your desktop/laptop PC to your iPhone? Do you need these files available to look/listen to even when your iPhone can’t get a signal?

I frequently need to transfer audio files/music from my laptop and listen to them on my iPhone, even in areas that don’t have cell reception. Fortunately, Google Drive offers the ability to mark files as available offline – to download the files to the iPhone’s local memory so they’re available at all times.

To do this, first use your PC to upload files to Google Drive. Then on the iPhone, open up the Google Drive app, find the audio file, and click on the three period symbol (inside the purple box) below:

Google Drive iOS app. Click on the three dot symbol.

A context menu will pop up below:

Google Drive file context menu.

Use your finger to pull the menu up (towards the top of your phone). You’ll see the full menu. Where it says Available Offline, pull the switch to the right until you see blue.

Use the available offline switch to mark the file as available at all times.

You’re done! A prompt should show up, where Drive acknowledges the offline request:

The notice that pops up when the file will be available offline.

To make sure the file is fully downloaded, leave the Google Drive app open a moment before you close it out.

DisallowedHost – Invalid HTTP_HOST Header “example.appspot.com”

When installing a new Django app on App Engine, I forgot to set the hosts header, and found this error:



The problem here is that I didn’t set the allowed hosts header in the settings file. To fix this, go to settings.py file and set ALLOWED_HOSTS to list a wildcard, like so:

ALLOWED_HOSTS = ['*']

You can see this replicated in Google’s own Django example: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/appengine/standard/django/mysite/settings.py#L46 (see screenshot below).

A screenshot of allowed_hosts including a wildcard.

Warning: A wildcard is only acceptable because App Engine’s other security policies are protecting the application. If you move the Django site to another host, make sure you change out the wildcard to the appropriate domain/domains.

Amazon Cloud9 IDE Python 3

I spent today morning pulling my hair out trying to get a Python 3 application running within AWS’ Cloud9 IDE. Essentially the problem is that if you pip install some-package, that package only becomes available within the Python 2 runner, not within the Python 3 runner (the current Cloud9 machine image includes Python 2 aliased as python, and Python 3 available through python3).

Fortunately for me, someone else had the same problem, and an Amazon staffer suggests creating a virtual environment: https://forums.aws.amazon.com/thread.jspa?messageID=822214&tstart=0 . I followed the steps mentioned, but it didn’t work – pip still installed dependencies for Python 2. I wonder if this staffer forgot to mention another step, such as remapping the pip command to a Python 3 install.

What finally did work was calling the pip command through the python3 interpreter, like so:

First, ensure pip is installed:

python3 -m ensurepip --default-pip    

Then upgrade pip:

python3 -m pip install --upgrade pip setuptools wheel

Then you can install any requirements of your app, such as BeautifulSoup:

sudo python3 -m pip install --upgrade bs4

And after all that, my Python 3 app was able to access the BeautifulSoup dependency.