Handwriting Recognition as a Service

At a recent parent-teacher conference, my son’s teacher said that he struggles to read my son’s handwriting. This got me thinking about whether and how technology could assist my son and others like him. While considering an educational tablet app that helps people improve their handwriting, I came across two Handwriting Recognition As A Service (HRAAS) APIs. One is free, via Google Input Tools; the other is a paid offering from Vision Objects.

I’m still waiting for Vision Objects to get back to me about a request for more information on their products. Unlike almost every other software company these days, they don’t seem to expose documentation on their products to the public. They want to know who you are and some details about your product before they let you see their APIs.

As for Google, I installed the Google Input Tools for Chrome extension, and I was able to sniff their XHR traffic by debugging the extension. I wrote a “J” in their writing area (using the mouse with Chrome on a PC) as depicted here:

[Image: Google Input Tools]

Here’s the HTTP POST request body that the extension makes to https://inputtools.google.com/request?itc=en-t-i0-handwrit&app=chext for the input above:

{
    "app_version": 0.4,
    "api_level": "537.36",
    "device": "5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36",
    "input_type": 0,
    "options": "enable_pre_space",
    "requests": [{
        "writing_guide": {
            "writing_area_width": 425,
            "writing_area_height": 194
        },
        "pre_context": "",
        "max_num_results": 10,
        "max_completions": 0,
        "ink": [
            [
                [92, 91, 91, 91, 91, 90, 90, 90, 89, 89, 88, 87, 86, 85, 85, 84, 83, 82, 81, 80, 79, 78, 77, 77, 76, 75, 75, 74, 72, 72, 71, 70, 69, 67, 66, 64, 61, 59, 56, 55, 53, 51, 49, 46, 44, 41, 39, 36, 34, 32, 31, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30],
                [55, 56, 58, 61, 64, 70, 73, 77, 80, 84, 90, 94, 98, 101, 104, 106, 108, 113, 116, 119, 122, 126, 128, 129, 131, 133, 134, 137, 139, 140, 142, 142, 144, 145, 146, 147, 147, 148, 149, 149, 149, 149, 149, 149, 148, 147, 144, 141, 137, 134, 131, 127, 123, 118, 109, 104, 101, 99, 98, 97, 96, 95],
                [0, 187, 203, 203, 219, 234, 250, 250, 265, 265, 281, 297, 297, 312, 328, 343, 343, 359, 375, 375, 390, 406, 406, 421, 437, 437, 453, 468, 484, 484, 499, 515, 515, 531, 531, 546, 562, 562, 577, 593, 609, 609, 624, 640, 640, 655, 671, 671, 687, 702, 702, 718, 733, 749, 765, 765, 780, 796, 796, 811, 843, 858]
            ]
        ]
    }, {
        "feedback": "∅[deleted]",
        "select_type": "deleted",
        "ink_hash": "18d06cfd82f0175f"
    }]
}

and here’s the response:

[
    "SUCCESS",
    [
        [
            "05c9c4d707af74e1",
            [
                "J",
                "j",
                "I",
                "ij",
                "ji",
                "li",
                "il",
                "Ji",
                "ii",
                "jr"
            ],
            [],
            {
                "is_html_escaped": false
            }
        ],
        [
            "18d06cfd82f0175f",
            [],
            [],
            {
                "is_html_escaped": false
            }
        ]
    ]
]
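
Judging from this one capture, the response looks like a nested array: a status string, then one entry per request, each holding an ink hash followed by a list of candidate strings. Here is a minimal sketch for pulling out the candidates, assuming that layout holds for other inputs too. The function name `extractCandidates` is my own, not part of any Google API.

```javascript
// Return the candidate strings from a response shaped like the capture above.
// Assumption: [status, [[inkHash, [candidates...], ...], ...]]
function extractCandidates(response) {
  const status = response[0];
  const results = response[1];
  if (status !== "SUCCESS" || !Array.isArray(results)) {
    return [];
  }
  // Use the first result entry that actually carries candidates.
  const entry = results.find(function (r) {
    return Array.isArray(r[1]) && r[1].length > 0;
  });
  return entry ? entry[1] : [];
}

const sample = [
  "SUCCESS",
  [
    ["05c9c4d707af74e1", ["J", "j", "I"], [], { is_html_escaped: false }],
    ["18d06cfd82f0175f", [], [], { is_html_escaped: false }]
  ]
];
console.log(extractCandidates(sample)[0]); // best guess for the sample
```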

The next step is to see how hard it is to generate the list of coordinates for the request. I don’t know if it’s still relevant, but there’s an old version of the deobfuscated code at http://ctrlq.org/code/19205-google-handwriting-api.
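
As a rough starting point, the captured request suggests that each stroke in `ink` is three parallel arrays: x coordinates, y coordinates, and millisecond timestamps that start at 0. The sketch below turns pointer samples into that shape. The function name and the `{x, y, t}` sample format are my own inventions, the timestamp convention is inferred from a single capture, and I haven't tested whether the service accepts a body without the `app_version`, `api_level`, `device`, and `input_type` fields.

```javascript
// Build a request body from recorded strokes.
// strokes: array of strokes; each stroke is an array of {x, y, t} samples,
// where t is a millisecond timestamp (for example from Date.now()).
function buildRequestBody(strokes, width, height) {
  // Assumption: timestamps are relative to the first sample of the first stroke.
  const t0 = strokes.length > 0 && strokes[0].length > 0 ? strokes[0][0].t : 0;
  return {
    options: "enable_pre_space",
    requests: [{
      writing_guide: {
        writing_area_width: width,
        writing_area_height: height
      },
      pre_context: "",
      max_num_results: 10,
      max_completions: 0,
      ink: strokes.map(function (stroke) {
        return [
          stroke.map(function (p) { return Math.round(p.x); }),
          stroke.map(function (p) { return Math.round(p.y); }),
          stroke.map(function (p) { return Math.round(p.t - t0); })
        ];
      })
    }]
  };
}

// Two samples from the start of the "J" stroke above.
const body = buildRequestBody(
  [[{ x: 92, y: 55, t: 1000 }, { x: 91, y: 56, t: 1187 }]],
  425,
  194
);
console.log(JSON.stringify(body.requests[0].ink));
```

In a browser, the samples would come from mouse or touch move events on the writing area, with one stroke per press-and-release.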

Posted in Uncategorized | Tagged , | Leave a comment

AngularJS, Meet WordPress

Having coded my first web pages almost half a lifetime ago in 1995, I’ve seen my share of server-side development patterns and frameworks: CGI scripts, NSAPI, ISAPI, Apache API, ASP(.NET), PHP, JSP, Rails, Grails, et al. Although the idea of server-side JavaScript (SSJS) is not new (I think Netscape Enterprise Server first enabled its use), the node.js community and its offshoots, which have evolved JavaScript (JS) into a full-stack development language, are by far the fastest-moving development community that I’ve seen. It seems that new frameworks clamor for attention weekly, if not daily. The pace of innovation seems to be increasing exponentially compared to the pre-node.js web.

The innovation in SSJS triggered a new wave of innovation in client-side JS (CSJS). For greenfield projects, the purist JS community advocates using JS on both the client and the server for Single Page Apps (SPAs). There are many capable server-side frameworks with large, established communities, so why has SSJS taken the development community by storm? The most compelling argument for me is the ability to code my app logic in one language. HTML and CSS are a given and logic often leaks into them, so we’re still talking about at least 3 languages to build an app, but the point is being able to use one less language to build my app.

Despite the incredible momentum of the purist camp, there’s still a huge base of applications out there written in PHP, ASP.NET, and JSP. Old guys like me can’t compete with the purists’ pace of innovation. We can still add value, though, by linking the old pre-SSJS world with the new SSJS world.

For example, PHP and the CMSs built on top of it, like WordPress and Drupal, have installations and communities in the millions and aren’t going away soon. Out of the box they don’t yet leverage the new CSJS frameworks. Angular.js was developed by Google and is probably the hottest CSJS framework. How can WordPress leverage the power of Angular.js?

Like many of us in the Internet-connected world, I read far too much of my email on my phone. Tonight, as I read the daily deals from one of my favorite WordPress blogs, Dan’s Deals, I was frustrated by how slowly the pages loaded as I progressed sequentially from page 1 to page 4 of a multi-page post. I noticed that the entire page reloads every time I advance to the next page of the article. Why does my browser need to reload the page header, footer, and sidebar just to read the next page of a long article? Dan’s Deals has a mobile theme that does a good job of rendering the site on my phone, but it doesn’t seem to address the performance limitations of the mobile experience.


How to Create High Performance HTML5 Apps for ITV

Over the last few months, my company developed an HTML5 video game for several interactive TV (ITV) platforms for a major developer in the ITV gaming space. This project gave us significant insight into how to create high performance apps for ITV. There are many great articles on how to create high performance HTML5 apps; this one on html5Rocks is a must-read. However, very little has been written about how to tame app development on ITV: ITV is a different beast. Hopefully other developers can benefit from our lessons learned. This blog post is only a very rough first draft of what I hope will evolve into a longer, more detailed article.

Analysis | Design | Development | Test | Deployment | Maintenance

Analysis: Know your platform

  • TV browsers are substantially more limited than PC and even mobile browsers in processing power, and even more so in the memory that’s available to your application.
  • You can get a decent idea of your target TV’s level of support for HTML5 specs by visiting html5test from your TV’s browser. Don’t trust the existing specs, since results can differ drastically for slightly different models from the same manufacturer.
  • Common tests are just a starting point for discussion. They don’t preclude the need to create small working prototypes that exercise the specific HTML5, JavaScript, and CSS capabilities of your app.
  • Write tests that help you understand your TV’s memory and processing limits. Don’t assume that you can load more than a few MB of pixels into memory at one time. Does your TV silently unload images when its memory gets full?
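
To make "a few MB of pixels" concrete, here is a back-of-the-envelope sketch for estimating how much memory a set of images occupies once decoded. It assumes 4 bytes per pixel (RGBA), which is common but not guaranteed on any particular TV browser. The function names and the asset list are made up for illustration.

```javascript
// Estimate decoded image memory: width x height x bytes per pixel.
// Assumption: 4 bytes per pixel (RGBA) unless told otherwise.
function decodedBytes(width, height, bytesPerPixel) {
  return width * height * (bytesPerPixel || 4);
}

// Sum the estimate over a list of {width, height} image descriptors.
function totalDecodedBytes(images) {
  return images.reduce(function (sum, img) {
    return sum + decodedBytes(img.width, img.height);
  }, 0);
}

// Hypothetical assets: one full-screen background and one sprite sheet.
var assets = [
  { name: "background.jpg", width: 1280, height: 720 },
  { name: "sprites.png", width: 1024, height: 1024 }
];
var megabytes = totalDecodedBytes(assets) / (1024 * 1024);
console.log(megabytes.toFixed(1) + " MB of decoded pixels");
```

Note that file size on disk says little here: under this assumption a well-compressed 100 KB JPG at 1280 x 720 still decodes to roughly 3.5 MB.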


Design: Keep it simple

  • Are you using canvas? Know when to use it and when not to. Minimize the number, size, and lifetime of canvases (and of any DOM elements). The general rule: if you can do what you want in the DOM, do it in the DOM; canvas should be viewed as a fallback. Canvas was a good choice for our main game board, a variable-sized grid of 100+ creatures that could be moved around and animated. We found out the hard way that canvas was not a good choice for many other areas of our game, since the additional pixel memory occupied by subsequent canvases exceeded the TV browser’s performance and/or memory limits.

    [Image: Galapago level 64]

    Galapago in-game experience. Only the main grid uses a canvas. The rest of the screen elements are DOM.

  • Minimize your images, both in file size and, perhaps even more importantly, in pixels (width x height). TVs are often very limited in the amount of video memory that they allocate to the browser. Use JPGs for full-screen images such as backgrounds and PNGs for all other images. Minimize PNGs with TinyPNG, or all images with Yahoo SmushIt.
  • Organize images into sprite sheets in order to minimize the number of files that your app has to load. This design choice can make a significant impact under load. See this article for how the use of sprite sheets in one game reduced network bandwidth and loading time by 10.
  • For the best user experience, your app is probably a single-page app. That means that if your app has multiple screens and dialogs, they’ll all need to share the same JavaScript window, document, and body objects.
    • DOM element ids should be unique.
    • Events should be attached to the most specific objects possible so that event handlers aren’t stepped on between screens. Even keyboard event handlers can and should be assigned to specific DOM elements. According to quirksmode, any DOM element that’s not a window element, a hyperlink element, or a form element requires a tabindex attribute in order to receive focus (and therefore keyboard events).
    • If your app needs to support animation on multiple screens, your app should have a central mechanism for starting, stopping, and pausing animations. For performance reasons, you most likely don’t want animations on one screen to run when another screen is visible.
    • Load images intelligently. Typically HTML5 apps preload all images, but this wasn’t an option for our game since the images occupied too many pixels in memory. We started with a game asset loader that wasn’t created with the limitations of TV browsers in mind. That library got us part of the way to loading our game assets, but it required significant modifications. We had to invent an on-screen image cache that was responsible for loading and unloading images every time a screen or dialog was shown.
    • Does your app support sound? Create a mechanism to globally disable the playing of any sound in your app if necessary. We discovered that some TV browsers crash when trying to play HTML5 audio.
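
To illustrate the idea of an image cache that loads and unloads per screen, here is a simplified sketch. It is not the cache we built for the game: the names are hypothetical, and it assumes that dropping every reference to an image is enough for the browser to reclaim its pixel memory, which you would need to verify on each target TV. The image factory is passed in so that the logic can be exercised outside a browser; in a real app it would return `new Image()`.

```javascript
// Hypothetical per-screen image cache.
// createImage: factory returning an object with a writable src property.
function ScreenImageCache(createImage) {
  this.createImage = createImage;
  this.byScreen = {}; // screen id -> { url: image }
}

// Call when a screen or dialog is shown: start loading any missing images.
ScreenImageCache.prototype.show = function (screenId, urls) {
  var images = this.byScreen[screenId] || {};
  for (var i = 0; i < urls.length; i++) {
    if (!images[urls[i]]) {
      var img = this.createImage();
      img.src = urls[i];
      images[urls[i]] = img;
    }
  }
  this.byScreen[screenId] = images;
  return images;
};

// Call when a screen or dialog is hidden: drop the references.
// Assumption: the browser frees decoded pixels once nothing references them.
ScreenImageCache.prototype.hide = function (screenId) {
  delete this.byScreen[screenId];
};

// Number of images currently held, useful for on-device experiments.
ScreenImageCache.prototype.loadedCount = function () {
  var count = 0;
  for (var id in this.byScreen) {
    count += Object.keys(this.byScreen[id]).length;
  }
  return count;
};
```

A screen manager would call `show` with the list of image URLs a screen needs just before displaying it, and `hide` right after leaving it, so that only the visible screen’s pixels stay in memory.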


Development: Chunk It

  • Ensure that a capable developer has physical access to the target TVs. This kind of development can’t be done remotely!
  • Minimize the feedback loop: can you point the TV browser to your development workstation, or do you have an extra step in your process where you first need to upload your apps to a server?
  • TV browsers don’t come with a fancy development console or profiler that displays your app’s memory consumption and helps you identify performance bottlenecks. You could roll your own on-screen console with substantial effort, or better, you can use a JavaScript remote debugging toolkit.
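
For a sense of what the roll-your-own option involves, here is the bare minimum of an on-screen console: a function that keeps the last few log lines and writes them into an element. The name `createOnScreenConsole` is hypothetical, and a usable version would need much more (object formatting, scrolling, remote-control key bindings), which is where the substantial effort comes in.

```javascript
// Minimal on-screen log: keeps the most recent lines and renders them
// into the given element (anything with a writable textContent property).
function createOnScreenConsole(element, maxLines) {
  var limit = maxLines || 20;
  var lines = [];
  return {
    log: function () {
      var parts = Array.prototype.slice.call(arguments).map(String);
      lines.push(parts.join(" "));
      if (lines.length > limit) {
        lines.shift(); // discard the oldest line
      }
      element.textContent = lines.join("\n");
    }
  };
}
```

In a page this might be wired up as `var tvConsole = createOnScreenConsole(document.getElementById("log"), 10);`, where `log` is the id of a fixed-position element styled with `white-space: pre`.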


Test: Automate

  • Mocha.js with chai.js, phantomjs


Deployment: Automate

  • jake.js, grunt
  • minify CSS and JS


Maintenance: User error reports

  • Create a “phone home” mechanism for your app to upload user error reports to a central location for analysis. You can roll your own or use a JavaScript error reporting service like Muscula (free as of this writing but converting to a paid service).
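
If you roll your own, the core is a global error handler that posts a small report to your server. The sketch below shows the idea under a few assumptions: the report field names and the `/error-reports` endpoint in the usage note are hypothetical, and some TV browsers may pass fewer arguments to `window.onerror` than desktop browsers do. The window object is passed in so that the handler can be exercised with a stand-in outside a browser.

```javascript
// Build a compact error report. Field names are illustrative only.
function buildErrorReport(message, source, line, column, userAgent) {
  return {
    message: String(message),
    source: source || "",
    line: line || 0,
    column: column || 0,
    userAgent: userAgent || "",
    time: new Date().toISOString()
  };
}

// Install a global handler that POSTs each report as JSON to `endpoint`.
function installErrorReporter(win, endpoint) {
  win.onerror = function (message, source, line, column) {
    var userAgent = win.navigator ? win.navigator.userAgent : "";
    var report = buildErrorReport(message, source, line, column, userAgent);
    var xhr = new win.XMLHttpRequest();
    xhr.open("POST", endpoint, true);
    xhr.setRequestHeader("Content-Type", "application/json");
    xhr.send(JSON.stringify(report));
    return false; // do not suppress the browser's own error handling
  };
}
```

In the app this would be called once at startup, for example `installErrorReporter(window, "/error-reports");`.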



say goodbye to traditional voicemail

[Image: Goodbye Voicemail]

I’ve been using Google Voice (GV) for several months and I’ve really appreciated its features. I usually give people my GV number now, and it rings my work phone and my cell phone. I added international minutes, and I use GV to make international calls from my BlackBerry (BB) with the GV app for BB. I’ve used it to seamlessly transfer incoming calls from my work phone to my cell phone when I’m walking out the door. I read voicemails as emails to save time. Today I discovered another great feature: I set up conditional call forwarding on my Sprint BB to forward to my GV number so that I can take advantage of visual voicemail even for people who dial my cell phone directly:

To activate No Answer Call Forwarding:

  • Enter *73
  • Enter the 10-digit phone number to forward the calls to
  • Press Talk
  • Listen for the alert tones that tell you Call Forwarding is activated

To deactivate No Answer Call Forwarding:

  • Enter *730
  • Press Talk
  • Listen for the alert tones that tell you Call Forwarding is deactivated

via Sprint Support


Remember What You Learn

[Image: Remember What You Learn]

I enjoy the ideas and humor in the books REWORK and Getting Real by Jason Fried and David Heinemeier Hansson. When I first noticed REWORK I asked my wife if we could buy a copy.  In her thriftiness she found a copy in audio book form at the local Denver public library branch. G-d bless the library.

REWORK is written in a way that the chapter/section headings serve as great cliff’s notes. Because the book’s points are worth remembering, I started taking audio notes on my phone while I listened in the car. After about two minutes of that, I realized that I’d probably spend time transcribing my audio notes into written form if I wanted to make them more usable, so I figured that I could find the table of contents online. The second Google result led me to the REWORK mindmap. The mindmap is substantially more useful than the static table of contents at Amazon because I can export it in multiple formats such as rtf or pdf, and because, having signed up for a MindMeister account, I can clone the map to my account and add to it as I wish. Thank you MindMeister for creating this slick software, and thank you Andy Breeding for saving me the time of using MindMeister to create exactly what I wanted to create!


Git Behind Corporate Proxy

[Image: corporate security]

A thread at Stack Overflow explains how to configure git behind a corporate proxy:

git config --global http.proxy http://login:password@our-proxy-server:8088

It’s important to note that if your login has a backslash, as in domain\login, then you must escape the backslash, as in:

git config --global http.proxy http://domain\\\login:password@our-proxy-server:8088

Note the use of three backslashes.


jira and IE6

[Image: Google Chrome Frame]

I have the fortune of working on an agile team in a not-so-agile company. My company mandates IE6 in order to use its intranet (even IE Frame within Chrome doesn’t fool our intranet’s browser detection). We use Jira to manage our feature backlog. Jira as of version 4 doesn’t support IE6. In particular, the agile Jira plugin GreenHopper doesn’t play well with IE6 when you try to pop up an issue. I got excited about a suggestion to use Google Chrome Frame. So for two days I’ve been struggling to get Jira and IE6 to work with Chrome Frame. Rather than

  1. writing a custom servlet plugin or
  2. fronting jira’s tomcat server with apache and leveraging mod_rewrite

I decided to leverage UrlRewriteFilter.
