Thursday, January 26, 2023

Datastore preservation: migrating Python 2.7 to Python 3 on Google App Engine

Let's say you run a production web application on Google App Engine, using the Python 2.7 runtime. That runtime is getting quite old, and you have to cope with a growing number of incompatibilities as a result. What to do?

It might seem daunting, and risky, to migrate to the rather different development environment of the Python 3.x runtime on Google App Engine.

But if the migration is done properly, one worry can be set aside: the data in Datastore. However much of it you have, it is preserved during the migration, without the complications of an ETL process. If you translate your environment and webapp properly, the data doesn't go anywhere.

See for yourself.

Here are two webapp sequences, for the same simple three-tier application. 

The first is a Python 2.7 application. 

The second is a Python 3.x application. 

It's worth creating this project yourself, so you can feel confident in the migration.

After this, when working on the migration of your production application, it's important to take precautions -- such as duplication of the codebase, and copying the datastore instance -- to allow rollback.

But you probably won't need to roll back.

--------------

Sequence 1

A simple three-tier application on Google App Engine with the webapp2 framework and the Python 2.7 runtime. We assume you've already created your project and enabled billing in the Google Cloud Console.

create index.html

<html>
<head>
<title><project id></title>
</head>
<body>
<div style="color:white;font-weight:bold;font-size:20px;">
<project id><br/><br/>
Visits {{visits}}
</div>
</body>
</html>


create <project id>.py

import os

import webapp2
from google.appengine.ext import db
from google.appengine.ext.webapp import template


class Visit(db.Model):
    # The class name is used as the Datastore kind: "Visit".
    visitor = db.StringProperty()
    timestamp = db.DateTimeProperty(auto_now_add=True)


def store_visit(remote_addr, user_agent):
    # Record one visit as a new Visit entity.
    Visit(visitor='{}: {}'.format(remote_addr, user_agent)).put()


def fetch_visits():
    # Return the number of stored Visit entities, as a string.
    return str(Visit.all().count())


class MainPage(webapp2.RequestHandler):
    def get(self):
        store_visit(self.request.remote_addr, self.request.user_agent)
        visits = fetch_visits()
        template_values = {'visits': visits}
        path = os.path.join(os.path.dirname(__file__), 'index.html')
        self.response.out.write(template.render(path, template_values))


app = webapp2.WSGIApplication(
    [('/', MainPage)],
    debug=True)


create app.yaml

runtime: python27
api_version: 1
threadsafe: false

handlers:
- url: /.*
  script: <project id>.app
  secure: always
  redirect_http_response_code: 301


run "python --version"

Python 2.7.4


test your app locally with:

dev_appserver.py --log_level debug .


deploy with:

gcloud app deploy --project <project id>

  

Visit the live page a few times, then find your Datastore entities in the Google Cloud Console.

---------------------------------------------

---------------------------------------------

  

Sequence 2

A simple three-tier application on Google App Engine with the Flask framework and the Python 3.8 runtime.

On your machine:

virtualenv -p python3.8.2 <project env>
source ./<project env>/bin/activate
pip install gunicorn
pip install flask
pip install google-cloud-datastore
pip list

From the "pip list" output, copy these two pinned versions into a file called "requirements.txt":

Flask==2.2.2
google-cloud-datastore==2.13.2

[If you do this repeatedly, in different local virtual environments, you end up running "pip install -r requirements.txt" often.]
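If you rebuild the environment often, the pinned lines can be generated rather than copied by hand. Here is a minimal sketch, assuming Python 3.8 or later (where importlib.metadata is in the standard library). The helper name pinned_requirements and its lookup parameter are invented for this example; they are not part of pip.

```python
from importlib import metadata


def pinned_requirements(package_names, lookup=metadata.version):
    """Return 'name==version' lines for the given installed packages.

    `lookup` maps a distribution name to its installed version. It defaults
    to importlib.metadata.version and is a parameter only so the function
    can be tried out without the packages being installed.
    """
    return ['{}=={}'.format(name, lookup(name)) for name in package_names]


# Hypothetical usage, inside the activated virtual environment:
#   print('\n'.join(pinned_requirements(['Flask', 'google-cloud-datastore'])))
```

Listing only the packages you import directly keeps "requirements.txt" short; pip resolves their dependencies at install time.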

create "app.yaml"

runtime: python38

  

create "main.py"

from datetime import datetime, timezone

from flask import Flask, render_template, request
from google.cloud import datastore

app = Flask(__name__)
ds_client = datastore.Client()


def store_visit(remote_addr, user_agent):
    """Record one visit as a new entity of kind 'Visit'."""
    entity = datastore.Entity(key=ds_client.key('Visit'))
    entity.update({
        'timestamp': datetime.now(timezone.utc),
        'visitor': '{}: {}'.format(remote_addr, user_agent),
    })
    ds_client.put(entity)


def fetch_visits():
    """Get total visits."""
    query = ds_client.query(kind="Visit")
    return len(list(query.fetch()))


@app.route('/')
def root():
    store_visit(request.remote_addr, request.user_agent)
    visits = fetch_visits()

    # NB: index.html must be in /templates
    return render_template('index.html', visits=visits)


if __name__ == '__main__':
    app.run()
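The fetch_visits() function in main.py downloads every Visit entity just to count them, which gets slower as the kind grows. One lighter option is a keys-only query, which returns entity keys without their property data. A minimal sketch, assuming the client passed in is the datastore.Client() created in main.py; count_visits is a name invented here:

```python
def count_visits(client, kind='Visit'):
    """Count entities of `kind` using a keys-only query.

    `client` is assumed to be a google.cloud.datastore.Client. The query
    still walks every key, but it skips transferring the properties.
    """
    query = client.query(kind=kind)
    query.keys_only()  # restrict results to keys only
    return sum(1 for _ in query.fetch())
```

In main.py this would be called as count_visits(ds_client). It counts the same entities either way, whether they were written by the Python 2.7 app or the Python 3 app.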

 

create "templates/index.html":

<!doctype html>
<html>
<head>
<title><project id></title>
</head>
<body>
<h1><project id></h1>
<p>{{ visits }} visits</p>
</body>
</html>


set project with:

gcloud config set project <project id>

(Note: run this whenever you switch to working on a different project locally. It points your local development environment at that project's Datastore.)


set credentials with:

gcloud auth application-default login

(This launches a browser: log in, and choose 'Allow'. Note that this connects your local environment to the remote Datastore. If you want a local datastore, you need to use the Datastore emulator.)
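Because the same code can talk to either the remote Datastore or a local emulator, it helps to check which one a process will use before you write test data. The google-cloud-datastore client uses the emulator when the DATASTORE_EMULATOR_HOST environment variable is set. A minimal sketch built on that variable; describe_datastore_target is a name invented for this example:

```python
import os


def describe_datastore_target(environ=None):
    """Say whether Datastore calls would go to the emulator or the remote service."""
    environ = os.environ if environ is None else environ
    host = environ.get('DATASTORE_EMULATOR_HOST')
    if host:
        return 'emulator at {}'.format(host)
    return 'remote Datastore'
```

Printing describe_datastore_target() at startup is a cheap guard against writing test visits into production data.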


run locally with:

gunicorn -b :8080 main:app

      

deploy with:

gcloud app deploy --project <project id>

      

leave the virtual environment with:

deactivate

Check your Datastore entities in the Google Cloud Console. The visits recorded by the Python 2.7 app are still there, alongside the new ones!
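To go beyond eyeballing the console, you can check that the entities written by both apps carry the same properties. The sketch below assumes each entity behaves like a dict (as google.cloud.datastore.Entity does) and that 'visitor' and 'timestamp' are the only properties, as in the two apps above. missing_properties is a hypothetical helper, not part of any library.

```python
EXPECTED_PROPERTIES = ('visitor', 'timestamp')


def missing_properties(entities, expected=EXPECTED_PROPERTIES):
    """Return (index, missing names) for each entity lacking an expected property."""
    problems = []
    for index, entity in enumerate(entities):
        missing = [name for name in expected if name not in entity]
        if missing:
            problems.append((index, missing))
    return problems
```

With the Python 3 app's client, a call such as missing_properties(ds_client.query(kind='Visit').fetch()) should return an empty list if every entity, old and new, has both properties.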

