Open Data Camp Day 1

If I don’t post a few notes from today’s Open Data Camp now, I never will, so here are a few things I scribbled down. It could be worse: I could have posted a PDF containing photos of the actual scribbles!

So out of this choice

odcamp-sessions

…I picked Open Data for Elections, Open Addresses, Data Literacy, Designing Laws using Open Data, and Augmented Reality for Walkers.

Open Data for Elections

I’ve been following @floppy‘s crazy plan to get elected for a while, so this was the easiest decision of the day: what drives someone to embrace the gory inner workings of democracy like this?

Falling turnout, it would seem, and concern for a functioning democracy.

The first step of his journey was the Open Politics Manifesto, which I’ve so far failed to edit- must try harder.

Perhaps more interesting was how this, and the use of open data, fits into a political platform as a service. It would be nice to have the opportunity to see a few additions to the usual suspects at the ballot box, and Eastleigh got a rare chance to see what that could be like with a by-election. Perhaps open data services for candidates could tip the balance enough to encourage more people to stand.

Things that sounded interesting:

  • Democracy Club
  • OpenCorporates
  • Data Packages
  • Open data certificates (food hygiene certificates for data?)
  • Candidates get one free leaflet delivery by Royal Mail- I wonder how big they expect those leaflets to be!

Open Addresses

@floppy and @giacecco introduced the (huge) problems they need to overcome to rebuild a large data set without polluting that data with any sources with intellectual property restrictions. Open Addresses still have a long way to go and there were comments about how long Open Street Map has been around, and it still has gaps.

They have some fun ideas about crowd sourcing address data (high vis jacket required) and there are some interesting philosophical questions around consent for addresses to be added.

It will be interesting to see whether Open Addresses can get enough data to provide real value, and what services they build.

Data Literacy

Mark and Laura led a discussion around data literacy, founded on the observation that competent people, with all the skills you could reasonably expect them to have, still struggle with handling data sets.

Who needs to be data literate? Data scientists? Data professionals? Everyone?

Data plumbers? There were some analogies with actual plumbers! You might not be a plumber but it’s useful to know something about it.

If we live in a data-driven society, we should know how to ask the right questions. That needs both domain expertise and technical expertise.

Things that sounded interesting:

Designing Laws using Open Data

@johnlsheridan pointed out that the least interesting thing to do with legislation is to publish it and went on to share some fascinating insights into the building blocks of statute law. It sounds like the slippery language used in legislation boils down to a small number of design patterns built with simple building blocks, such as a duty along with a claim right, and so on.

Knowing these building blocks makes it easier to get the gist of what laws are trying to achieve, helps navigate statutes, and could give policy makers a more reliable way to effect a goal.

For example, it’s easier to make sense of the legislation covering supply of gas, and it’s possible to identify where there may be problems. The gas regulator has a duty to protect the interests of consumers by promoting competition, but that’s a weak duty without a clear claim right to enforce it.

John also demonstrated a tool – http://ngrams.elasticbeanstalk.com – exploring how the language used in legislation has changed over time, for example how the use of “shall” has declined and been replaced by “is to be”.

Augmented Reality for Walkers

My choice of Android tablet was largely based on what might work reasonably well for maps and augmented reality, so I seized this opportunity!

Nick Whitelegg described the Hikar Android app he’s been working on, which is intended to help hikers follow paths by overlaying map data on a live camera feed.

The data combines OpenStreetMap mapping data with Ordnance Survey height data, downloaded and cached as tiles around your current location. OpenGL is used to overlay a 3D view of the map data on the live camera feed, using the Android sensor APIs to detect the device’s rotation.

I’ve just downloaded and installed Hikar and, while my tablet is a tad slow, it works really well. I live somewhere flat and boring but the height data made a noticeable difference when Nick demonstrated the app in hilly Winchester.

Still to come: Day 2!


Monki Gras 2015

Monki Gras happened again! Though, in its Monki Gras 2015 incarnation, it acquired a heavy metal umlaut and a ‘slashed zero’ in its typeface; an allusion to its Nordic nature: Mönki Gras 2Ø15

What is Monki Gras?

Well…

And Ricardo makes a good point, explaining why I, and others, just keep going back:

There’s a single track of talks so you are saved the effort of making decisions about what to see and you can just focus on listening. The speakers entertain as well as inform, which I really like.

While it is a tech conference, there’s little code because it’s about making technology happen rather than the details of the technology itself. So there are talks on developer culture, design, and data, as well as slightly more off-the-wall things to keep our brains oiled.

In James’ very own distinctive words:

Why go all Nordic this year?

All the speakers this year were Scandinavian in some way. It was probably the most rigorously applied conference theme I’ve ever seen (conferences usually come up with a ‘theme’ for marketing purposes which is mostly forgotten by the time of the conference itself).

James talks a bit more about this on the Monkigras blog. A surprising amount of tech we know and love comes out of the relatively sparsely populated Scandinavian countries. For example:

And, apparently, Finland leads the EU in enterprise cloud computing:

Are the Nordics really that different from anywhere else?

Well, this graph seems to say they are, if only for their taste in music:

Which suggests there is at least something different about Nordic cultures from the rest of Europe, let alone the world.

So several of the speakers delved into why they thought this led to success in technology innovation and development. For example, there’s the attitude of not recognising when you’re failing and giving up, so that you can be successful by doing it another way:

A Swedish concept, lagom, which means ‘just the right amount’ was credited with the popularity of the cloud in the Nordics. And, indeed, with pretty much anything we could think of throughout the rest of the event.

Similarly, you could argue that lagom is why Docker is popular among developers:

One fascinating talk, by a Swedish speaker based in Silicon Valley, was about the difference between startups in the Nordics and Silicon Valley. For example, the inescapable differences between their welfare systems were credited as being responsible for different priorities regarding making money. (Hopefully, videos of the talks will be put online and I’ll add a link to it.)

Obviously, all this talk about culture can, and did, drift into stereotyping. I did get slightly weary of the repeated comparisons between cultures, though they were interesting and often humorous.

Developer culture

One of the things I’m most interested in is hearing what other companies have learnt about developer culture and community. For example:

There’s more about this talk on Techworld. And Spotify have blogged some funky videos about the developer culture they aspire to (part 1 and part 2), which are well worth watching if you work in software development.

Something that I’m working on at IBM is increasing the openness of our development teams so, again, I’m always interested in new ways to do this. This is something that Sweden (yes, the country!) has adopted to a surprising extent:

Innovation and inefficiencies

One important message that came across at Monki Gras 2015 was that you have to allow time for innovation to happen. It’s when things seem inefficient and time is not allocated to a specific activity that innovation often occurs.

A nice example of this is the BrewPi project. At Monki Gras 2013, Elco Jacobs talked about his open source project of brewing beer and using a Raspberry Pi to monitor it:

I bumped into him this year and what had been a project now occupies him full-time as a small business selling the technology to brewers around the world. A pause in his education when he had nothing better to do had enabled him to get on with his BrewPi project and, after graduation, turn it into a business.

Data journalism

There’s a lot talked about open data and how we should be able to access tax-funded data about things that affect our lives. The Guardian is taking a lead with data journalism and Helena Bengtsson gave a talk about how knowing how to navigate large data sets to find meaning was vital to finding stories in the Wikileaks data.

She started out in data journalism in Sweden where, in one case, she acquired and mapped large data sets that revealed water pollution problems around the country, which triggered several stories.

It’s not just having the data that matters but the interpretation of the data. That’s what data journalism gives us over just ‘big data’:

Also, I found out a fascinating fact:

Anyway, that’s about as much as I can cram in. We also found out random things about Scandinavian knitwear and the fact that Sweden has its own official typeface, Sweden Sans. And we ate lots of Nordic foods, drank Nordic beer and (some of us) drank Akvavit. And, most importantly, we talked to each other lots.

The thing I really value about Monki Gras (on top of the great talks, food, drink, and fun atmosphere) is the small size of the event and all the interesting people to talk to. That’s why I keep going back.

P.S. A good write-up of the talks


How to use the IBM Watson Relationship Extraction service on Bluemix

Before Christmas, I wrote about how I used the Watson Relationship Extraction service on Bluemix to pick out the things mentioned in news stories, as part of a mobile app we built on a hackday. I’d still like to do something more with that app, but in the meantime I should at least share how I did the Relationship Extraction bit.

From the official doc for the service:

From unstructured text, Relationship Extraction can extract entities (such as people, locations, organizations, events), and the relationships between these entities (such as person employed-by organization, person resides-in location).

This is provided as a hosted service on IBM Bluemix where any developer can sign up and give it a try.

It’s available as a documented REST API, but as part of using it in the hackday, I needed to write a bit of code around that, just to prepare the request and parse the response. I think it’ll save me time to reuse this the next time I want to build something with the API, so I’m sharing it as a standalone package.

In this post, I’ll walk through how you can use it, with a small app that grabs the contents of a BBC News story and picks out the names of people mentioned in the story.

First, a simpler example. Consider this exciting text:

Dale Lane works as a developer for IBM. He started in 2003. Dale lives in the UK, in a town called Eastleigh. Before that, he was a student at the University of Bath.

A few lines of Javascript are enough to run that through the service.

var watson = require('extract-relationships');

var text = 'Dale Lane works as a developer for IBM. He started in 2003. ' +
           'Dale lives in the UK, in a town called Eastleigh. ' +
           'Before that, he was a student at the University of Bath.';

watson.extract(text, function(err, response) {
    // response has got all the info
});

The full contents of response is in a gist if you want to see it, but I’ll show just a few examples here to give you the idea.

screenshot

It has picked out all of the references to me, recognising that they are all describing a person, and that ‘developer’ is my occupation.

The ‘begin’ and ‘end’ numbers tell you where in the text each bit was found.

screenshot

It’s picked out the reference to IBM, and recognised that this is a name of a commercial organisation.

screenshot

It’s recognised that ‘2003’ is a reference to a date.

As well as identifying those entities and many others, it’s also picked out the relationships between them.

screenshot

For example, it’s identified the relationship between me and IBM.

screenshot

And the relationship between me and my old University.

I’ve written a more detailed breakdown of what is contained in the response including how to find out what each of the fields mean, and what the different possible values for each one are.

That’s the basics with a few input sentences. Next, we start throwing a lot of text at it.

In about thirty lines of Javascript, you can download the text from a news story on the BBC News website, and pick out the names of all of the people mentioned in the story.
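
The original example was embedded as a gist, so here is a minimal sketch of the same idea instead. The request and cheerio modules, the CSS selector, and the shape of the entities in the response are my assumptions rather than details taken from the original code, so check the package’s own documentation for the exact field names.

// A sketch only - the module choices (request, cheerio), the CSS selector,
// and the response field names below are assumptions.
var request = require('request');
var cheerio = require('cheerio');
var watson = require('extract-relationships');

var storyUrl = 'http://www.bbc.co.uk/news/some-story'; // replace with a real story URL

request(storyUrl, function (err, resp, html) {
    if (err) { return console.error(err); }

    // crude scrape: join together the paragraphs of the story body
    var $ = cheerio.load(html);
    var text = $('.story-body p').map(function () {
        return $(this).text();
    }).get().join(' ');

    watson.extract(text, function (err, response) {
        if (err) { return console.error(err); }

        // keep only the entities classified as people and print each one
        (response.entities || []).forEach(function (entity) {
            if (entity.type === 'PERSON') {
                console.log(entity.mentions[0].text);
            }
        });
    });
});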

If you run that simple example, you get the list of people that are included in the story text.

Where it starts to get interesting is when you combine this with other sources and APIs.

For example, once you’ve picked out the names of people from the story, try looking up their profiles on Wikipedia, and finding out who they are.

Or, instead of people, pick out the names of places from news stories, and use a geocoding API to plot them on a map. (There are geocoding services available on Bluemix, too, if that helps.)

Hopefully you can see how you could start to use this in your own apps.

Finally, some practical points.

How do you install the package I’ve shared so you can use it?

npm install --save extract-relationships

How do you configure it for the Watson Relationship Extraction Service?

The API we’re using is an authenticated service hosted in IBM Bluemix, so there is a tiny bit of config you need to do first before you can use it.

If you’re running your app in Bluemix, then there isn’t much to do. Add the Relationship Extraction service to your app from the Bluemix dashboard, and the endpoint and credentials will automatically be provided and should just work.

If you’re running your app on your own machine, there are a couple of extra steps instead.

Go to Bluemix. Sign up for an account if you haven’t already got one.

screenshot

From the dashboard, create an app. You need something as a placeholder to bind the Relationship Extraction service to, even if you don’t use it.

screenshot

Create a web app and give it a name.

screenshot

Add a service.

screenshot

Choose the Relationship Extraction service from the group of Watson services.

screenshot

Click on ‘Show Credentials’. Everything you need to configure your app is in here. You need the url, username and password.

screenshot

Copy this into an options object like this:

var options = {
    api : {
        url : 'https://url.of.your.watson.service...',
        user : 'your-watson-service-username',
        pass : 'your-watson-service-password'
    }
};

To reiterate, this isn’t the username and password that you use to sign in to Bluemix. It’s the username and password specifically for this service that Bluemix has generated for your app.

It’s not a good idea to hard-code passwords in your code, so I’d suggest putting them outside of your app and grabbing them in when needed. Environment variables are an easy way to do this, and are what I’ve done in the few samples I’ve written.
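
As an illustration of that approach (not the author’s actual samples), the options object could be built from environment variables; the variable names here are made up for the example.

// Sketch: keep the service credentials out of the source by reading them
// from environment variables. The variable names are arbitrary choices.
var options = {
    api : {
        url  : process.env.WATSON_RE_URL,
        user : process.env.WATSON_RE_USER,
        pass : process.env.WATSON_RE_PASS
    }
};
// pass 'options' to the package in whatever way its documentation describes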

That’s all there is to it.

I’ve put more info on github with the source for how I’m using the API.


Watson News Companion

newscompanion screenshot

We recently ran a hackathon at work: people within IBM were invited to try building a mobile app aimed at consumers using Watson services. It was a fun chance to try out some new ideas, as well as to build something using our APIs – dogfooding is always a good thing.

I worked on a hack with David which we submitted on Wednesday. This is what we came up with, and how we built it.

The idea

A mobile app that will help users to digest the news by explaining references in stories and providing greater context.

Background

It’s difficult to find the time nowadays to properly read and understand what’s going on in the world. We rarely have the time to sit and read through a newspaper. Instead, we might quickly read news stories online from our smartphones and tablets. But that often makes it difficult to understand the broader context that a story is in. There might be references in the story to people, places, organisations or events that are unfamiliar.

Watson could help. It could be an assistant as you read the news, explaining unfamiliar references and the broader context.

Features

Our Watson News Companion demo is a mobile news reader app that:

  • anticipates questions and suggests areas where it can help improve understanding
  • provides answers to questions without the user needing to lose their place in the story
  • allows the user to dig deeper with their own follow-up questions


A video walkthrough of the hack

Implementation

The hack was built as a mobile web app using the MEAN stack: Express as the framework on a Node.js platform, MongoDB for storing some information, and AngularJS for the UI.

It uses RSS feeds from news websites to fetch content, which is shown in a simple newsreader app built using Ratchet.

The contents of the story are run through the Watson Relationship Extraction API to pick out the people, places, organisations, and other entities mentioned in the story.

The API output includes co-references, to identify the multiple mentions of the same entity. These are combined and reviewed, and together with the type of the entity are used to generate likely questions about the entity.
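
As a rough illustration of that step (this is not the actual hack code, and the type names and question templates are assumptions), question generation keyed off the entity type might look something like this:

// Sketch: turn an extracted entity into a likely question, based on the
// type the Relationship Extraction API assigned to it. The type strings
// and question templates here are illustrative only.
function questionsForEntity(entity) {
    var name = entity.text;
    switch (entity.type) {
        case 'PERSON':
            return ['Who is ' + name + '?'];
        case 'ORGANIZATION':
            return ['What is ' + name + '?'];
        case 'LOCATION':
            return ['Where is ' + name + '?'];
        default:
            return [];
    }
}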

These questions are sent to the Watson Question and Answer API. For questions that come back with high-confidence answers, annotations are added inline to the news story. Pressing the annotation brings up a sliding panel at the bottom of the screen with the answer to the question. The links and footer annotations are built using bigfoot.

Every screen in the app also includes an “Ask Watson” button which lets the user enter any free-text question to dig deeper into what they’ve read.

Could you build this?

This was a proof-of-concept built in a hurry, so we’re not calling this a finished app. But everything we used is freely available – both to people inside IBM and the public.

We developed on an instance of Bluemix (our Cloud Foundry-based development platform) available internally within IBM. You can sign up for free to a public instance of Bluemix at bluemix.net.

The technologies used to build the hack are all freely available: Node.js, MongoDB, AngularJS, Ratchet, bigfoot, jQuery.

A beta version of the Watson Relationship Extraction API is freely available for apps hosted on Bluemix.

A beta version of the Watson Question and Answer API is freely available for apps hosted on Bluemix. But this is a demo instance of Watson that has only read a small number of general healthcare documents. That’s not a useful corpus for our hack, so to record our demo we stood up our own instance – using an untrained instance of Watson which we gave a small subset of Wikipedia to read. We used the Question and Answer API on this instance instead of the Bluemix one. For people outside IBM to do this, they need to sign up to join the Watson Ecosystem. This is also free, but there are criteria for who is eligible at this point, and an application process to go through.


A Conversational Internet of Things – ThingMonk talk

Earlier this year, Tom Coates wrote a blog post about his session at this year’s O’Reilly Foo Camp. Over tea with colleagues, we talked about some of the ideas from the post and how some of our research work might be interesting when applied to them.

One thing led to another and I found myself talking about it at ThingMonk this year. What follows is a slightly expanded version of my talk.


ciot-0

ciot-2

Humanising Things

We have a tradition of putting human faces on things. Whether it’s literally seeing faces on the Things in our everyday lives, such as the drunk octopus spoiling for a fight, or possibly the most scary drain pipe ever.

Equally, we have a tendency to put a human persona onto things. The advent of Twitter brought an onslaught of Things coming online. It seemingly isn’t possible for me to do a talk without at least a fleeting mention of Andy Stanford-Clark’s twittering ferries, which provide regular updates on where each ferry is.

ciot-5

One of the earliest Things on Twitter was Tower Bridge. Tom Armitage, who was working near to the bridge at the time, wrote some code that grabbed the schedule for the bridge opening and closing times, and created the account to relay that information.

ciot-6

One key difference between the ferries and the bridge is that the ferries are just relaying information, a timestamp and a position, whereas the bridge is speaking to us in the first-person. This small difference immediately begins to bring a more human side to the account.
But ultimately, they are simple accounts that relay their state to whoever is following them.

This sort of thing seems to have caught on particularly with the various space agencies. We no longer appear able to send a robot to Mars, or land a probe on a comet without an accompanying twitter account bringing character to the events.

ciot-7

There’s always a sense of excitement when these inanimate objects start to have a conversation with one another. The conversations between the Philae lander and its orbiter were particularly touching as they waved goodbye to one another. Imagine, the lander, which was launched into space years before Twitter existed, chose to use its last few milliamps of power to send a final goodbye.

ciot-8

But of course as soon as you peek behind the curtain, you see someone running Tweetdeck, logged in and typing away. I was watching the live stream as the ESA team were nervously waiting to hear from Philae. And I noticed the guy in the foreground, not focused on the instrumentation as his colleagues were, but rather concentrating on his phone. Was he the man behind the curtain, preparing Philae’s first tweet from the surface? Probably not, but for the purposes of this talk, let’s pretend he was.

ciot-9

The idea of giving Things a human personality isn’t a new idea. There is a wealth of rigorous scientific research in this area.

One esteemed academic, Douglas Adams, tells us about the work done by the The Sirius Cybernetics Corporation, who invented a concept called Genuine People Personalities (“GPP”) which imbue their products with intelligence and emotion.

He writes:

Thus not only do doors open and close, but they thank their users for using them, or sigh with the satisfaction of a job well done. Other examples of Sirius Cybernetics Corporation’s record with sentient technology include an armada of neurotic elevators, hyperactive ships’ computers and perhaps most famously of all, Marvin the Paranoid Android. Marvin is a prototype for the GPP feature, and his depression and “terrible pain in all the diodes down his left side” are due to unresolved flaws in his programming.

In a related field, we have the Talkie Toaster created by Crapola, Inc and seen aboard Red Dwarf. The novelty kitchen appliance was, on top of being defective, only designed to provide light conversation at breakfast time, and as such it was totally single-minded and tried to steer every conversation to the subject of toast.

ciot-13

Seam[less|ful]ness

In this era of the Internet of Things, we talk about a future where our homes and workplaces are full of connected devices, sharing their data, making decisions, collaborating to make our lives ‘better’.

Whilst there are people who celebrate this invisible ubiquity and utility of computing, the reality is going to be much more messy.

Mark Weiser, Chief Scientist at Xerox PARC, coined the term “ubiquitous computing” in 1988.

Ubiquitous computing names the third wave in computing, just now beginning. First were mainframes, each shared by lots of people. Now we are in the personal computing era, person and machine staring uneasily at each other across the desktop. Next comes ubiquitous computing, or the age of calm technology, when technology recedes into the background of our lives.

Discussion of Ubiquitous Computing often celebrated the idea of seamless experiences between the various devices occupying our lives. But in reality, Mark Weiser advocated the opposite: that seamlessness was an undesirable and self-defeating attribute of such a system.

He preferred a vision of “Seamfulness, with beautiful seams”

ciot-15

The desire to present a single view of the system, with no joins, is an unrealistic aspiration in the face of the cold realities of wifi connectivity, battery life, system reliability and whether the Cloud is currently turned on.

Presenting a user with a completely monolithic system gives them no opportunity to connect with and begin to understand the constituent parts. That is not to say this information is needed by all users all of the time. But there is clearly utility to some users some of the time.

When you come home from work and the house is cold, what went wrong? Did the thermostat in the living room break and decide it was the right temperature already? Did the message from the working thermostat fail to get to the boiler? Is the boiler broken? Did you forget to cancel the entry in your calendar saying you’d be late home that day?

Without some appreciation of the moving parts in a system, how can a user feel any ownership or empowerment when something goes wrong with it? Or worse yet, how can they feel anything other than intimidated by this monolithic system that simply says “I’m sorry Dave, I’m afraid I can’t do that”?

Tom Armitage wrote up his talk from Web Directions South and published it earlier this week, just as I was writing this talk. He covers a lot of what I’m talking about here so much more eloquently than I am – go read it. One piece his post pointed me at that I hadn’t seen was Techcrunch’s recent review of August’s Smart Lock.

ciot-16

Tom picked out some choice quotes from the review which I’ll share here:

“…much of the utility of the lock was negated by the fact that I have roommates and not all of them were willing or able to download the app to test it out with me […] My dream of using Auto-Unlock was stymied basically because my roommates are luddites.”

“Every now and then it didn’t recognize my phone as I approached the door.”

“There was also one late night when a stranger opened the door and walked into the house when August should have auto-locked the door.”

This is the reason for having beautiful seams; seams help you understand the edges of a device’s sphere of interaction, but should not be so big that they trip you up. Many similar issues exist with IP-connected light bulbs. When I need to remember which app to launch on my phone depending on which room I’m walking into, and which bulbs happen to be in there, the seams have gotten too big.

In a recent blog post, Tom Coates wrote about the idea of a chatroom for the house – go read it.

Much like a conference might have a chatroom, so might a home. And it might be a space that you could duck into as you pleased to see what was going on. By turning the responses into human language you could make the actions of the objects less inscrutable and difficult to understand.

ciot-17

This echoes back to the world of Twitter accounts for Things. But rather than them being one-sided conversations presenting raw data in a more consumable form, or Wizard-of-Oz style man-behind-the-curtain accounts, a chatroom is a space where the conversation can flow both ways; both between the owner and their devices, but also between the devices themselves.

What might it take to turn such a chatroom into a reality?

ciot-18

Getting Things Talking

Getting Things connected is no easy task.

We’re still in the early days of the protocol wars.

Whilst I have to declare allegiance to the now international OASIS standard MQTT, I’m certainly not someone who thinks one protocol will rule them all. It pains me whenever I see people make those sorts of claims. But that’s a talk for a different day.

Whatever your protocol of choice, there are an emerging core set that seem to be the more commonly talked about. Each with its strengths and weaknesses. Each with its backers and detractors.

ciot-19

What (mostly) everyone agrees on is the need for more than just efficient protocols for the Things to communicate by. A protocol is like a telephone line. It’s great that you and I have agreed on the same standards so when I dial this number, you answer. But what do we say to each other once we’re connected? A common protocol does not mean I understand what you’re trying to say to me.

And thus began the IoT meta-model war.

There is certainly a lot of interesting work being done in this area.

For example, there’s HyperCat, a consortium of companies that came out of a Technology Strategy Board-funded Demonstrator project in the last year or so.

ciot-21

HyperCat is an open, lightweight JSON-based hypermedia catalogue format for exposing collections of URIs. Each HyperCat catalogue may expose any number of URIs, each with any number of RDF-like triple statements about it. HyperCat is simple to work with and allows developers to publish linked-data descriptions of resources.

URIs are great. The web is made of them and they are well understood. At least, they are well understood by machines. What we’re lacking is the human view of this world. How can this well-formed, neatly indented JSON be meaningful or helpful to the user who is trying to understand what is happening?

This is by no means a criticism of HyperCat, or any of the other efforts to create models of the IoT. They are simply trying to solve a different set of problems to the ones I’m talking about today.

ciot-23

Talking to Computers

We live in an age where talking to computers is becoming less the preserve of science fiction.

Siri, OK Google, Cortana all exist as ways to interact with the devices in your pocket. My four year old son walks up to me when I have my phone out and says: “OK Google, show me a picture of the Octonauts” and takes over my phone without even having to touch it. To him, as to me, voice control is still a novelty. But I wonder what his 6 month old sister will find to be the intuitive way of interacting with devices in a few years time.

The challenge of Natural Language Processing, NLP, is one of the big challenges in Computer Science. Correctly identifying the words being spoken is relatively well solved. But understanding what those words mean, what intent they try to convey, is still a hard thing to do.

To answer the question “Which bat is your favourite?” without any context is hard to do. Are we talking to a sportsman with their proud collection of cricket bats? Is it the zoo keeper with their colony of winged animals? Or perhaps a comic book fan being asked to choose between George Clooney and Val Kilmer?

Context is also key when you want to hold a conversation. The English language is riddled with ambiguity. Our brains are constantly filling in gaps, making theories and assertions over what the other person is saying. The spoken word also presents its own challenges over the written word.

ciot-28

“Hu was the premier of China until 2012”

When said aloud, you don’t know if I’ve asked you a question or stated a fact. When written down, it is much clearer.

ciot-29

In their emerging technology report for 2014, Gartner put the Internet of Things at the Peak of Inflated Expectations. But if you look closely at the curve, up at the peak, right next to IoT, is NLP Question Answering. If this was a different talk, I’d tell you all about how IBM Watson is solving those challenges. But this isn’t that talk.

ciot-30

A Conversational Internet of Things

To side step a lot of the challenges of NLP, one area of research we’re involved with is that of Controlled Natural Language and in particular, Controlled English.

CE is designed to be readable by a native English speaker whilst representing information in a structured and unambiguous form. It is structured by following a simple but fully defined syntax, which may be parsed by a computer system.

It is unambiguous by using only words that are defined as part of a conceptual model.

CE serves as a language that is understandable by both humans and computer systems, which allows them to communicate.

For example,

there is a thermometer named t1 that is located in the room r1

A simple sentence that establishes the fact that a thermometer exists in a given room.

the thermometer t1 can measure the environment variable temperature

Each agent in the system builds its own model of the world that can be used to define concepts such as thermometer, temperature, room and so on. As the model is itself defined in CE, the agents build their models through conversing in CE.

there is a radiator valve v1 that is located in the room r1
the radiator valve v1 can control the environment variable temperature

It is also able to use reasoning to determine new facts.

the room r1 has the environment variable temperature that can be measured and that can be controlled

As part of some research work with Cardiff University, we’ve been looking at how CE can be extended to a conversational style of interaction.

These range from exchanging facts between devices – the tell

the environment variable temperature in room r1 has value "21"

Being able to ask questions – ask-tell

for which D1 is it true that
      ( the device D1 is located in room V1 ) and
      ( the device D1 can measure the environment variable temperature ) and
      ( the value V1 == "r1")

Expanding on and explaining why certain facts are believed to be true:

the room r1 has the environment variable temperature that can be measured and that can be controlled
    because
the thermometer named t1 is located in the room r1 and can measure the environment variable temperature
    and
the radiator valve v1 is located in the room r1 and can control the environment variable temperature

The fact that the devices communicate in CE means the user can passively observe the interactions. But whilst CE is human readable, it isn’t necessarily human writeable. So some of the research is also looking at how to bridge from NL to CE using a confirm interaction:

NL: The thermometer in the living room has moved to the dining room
CE: the thermometer t1 is located in the room r2

Whilst the current research work is focused on scenarios for civic agencies – for example, managing information exchange in a policing context – I’m interested in applying this work to the IoT domain.

With these pieces, you can begin to see how you could have an interaction like this:

    User: I will be late home tonight.
    House: the house will have a state of occupied at 1900
    User: confirmed
    House: the room r1 has a temperature with minimum allowable value 20 after time 1900
           the roomba, vc1, has a clean cycle scheduled for time 1800

Of course this is still quite dry and formal. It would be much more human, more engaging, if the devices came with their own genuine people personality. Or at least, the appearance of one.

    User: I will be late home tonight.
    House: Sorry to hear that, shall I tell everyone to expect you back by 7?
    User: yes please    
    Thermometer: I'll make sure its warm when you get home
    Roomba: *grumble*

I always picture the Roomba as being a morose, reticent creature who really hates its own existence. We have one in the kitchen next to our lab at work, set to clean at 2am. If we leave the door to the lab open, it heads in and, without fail, maroons itself on a set of bar stools we have with a sloped base. Some might call that a fault in its programming, much like Marvin, but I like to think it’s just trying to find a way to end it all.

This is all some way from having a fully interactive chat room for your devices. But the building blocks are there and I’ll be exploring them some more.

Tackling Cancer with Machine Learning

For a recent Hack Day at work I spent some time working with one of my colleagues, Adrian Lee, on a little side project to see if we could detect cancer cells in a biopsy image.  We've only spent a couple of days on this so far but already the results are looking very promising with each of us working on a distinctly different part of the overall idea.

We held an open day in our department at work last month and I gave a lightning talk on the subject which you can see on YouTube:


There were a whole load of other talks given on the day that can be seen in the summary blog post over on the ETS (Emerging Technology Services) site.



Went to Designcamp, got the T-Shirt

Last month, I was fortunate enough to fly off to Austin with a group of colleagues for a week long IBM Design Thinking camp. It was an opportunity to get away from the day job, with laptops all-but banned, and have a deep-dive into what IBM Design is about and how it can be applied.

As a relatively new effort within the company, IBM Design sets out to bring a focus back to where it should be; the human-experience of our products and services. This isn’t just about making pretty user interfaces; it is the entire experience of our products.

As an engineer, the temptation is always there to create shiny new features. But no matter how shiny it is, if it isn’t what a user needs, then it’s a waste of effort. The focus has to be on what the user wants to be able to do. This is something I’ve always tried to do with Node-RED; we often get suggestions for features that, once you start picking at them, are really solutions looking for a problem. Once you work back and identify the problem, we’re often able to identify alternative solutions that are even better.

P1070048

It’s often just a matter of asking the right question. At Designcamp, the very first exercise we were asked to do was to draw a new type of vase. Everyone drew something that looked vaguely vase-like. Then (spoilers…) we were asked to draw a better way to display flowers. At this point we got lots of decidedly un-vase-like ideas that were much more imaginative. It’s the difference between asking for a feature and asking for an idea. The former presupposes a lot about the nature of the answer; the latter is focused on not just the what, but also the why.

This relentless focus on the user isn’t a new idea. GDS, who are doing incredible things with government services, have it as their very first Design Principle. But it is refreshing to see this focus being brought to bear within a transformation of how the entire company operates.

Oh, and of course being in Austin, we got to screen print our own IBM Designcamp T-Shirts to commemorate the visit.

Go to Designcamp, screen print your own t-shirt. Obvs.

Lots more photos from the week over on flickr.

Hursley 3D Printing Expo

D’oh, looks like I missed a swarm of 3d printers in Hursley recently! I wonder if anyone has printed a model of the house/site yet.

I’m still looking for even a vaguely plausible excuse to splash out on a 3d printer, but printing models or new 3d printers still isn’t quite enough to justify the money (or space these days)!


Kids should learn to code

Does a five-year-old need to learn how to code?

A couple of weeks ago I was interviewed by the BBC. In a fairly long phone call, I either rambled inanely or provided detailed and nuanced answers in context. That depends on your point of view.

Either way, obviously not a lot of it could make it into their story, as they really only needed a few quotes. So I thought I’d put more of what I said here.

The background for the story was the changes to the UK school curriculum which means that all kids are being taught to code. And the basic premise for the piece was that as we’re “entering an era when computers are actually beginning to teach themselves” that this is unnecessary and that coding itself is becoming an outdated skill.

This is a summary of what I tried to say…

Learning to “code”

It’s useful to start with some context. When we talk about teaching kids to “code” we don’t just mean teaching them how to write lines of code – it’s broader than that. Some criticisms of this initiative seem to be arguing against five-year olds needing to learn where to put semi-colons, which is missing the point.

From what I’ve seen, it’s an umbrella term that covers a range of activities such as:

Logical thinking and problem solving

Teaching kids how to understand a description of a problem, identify a solution, and describe that solution by breaking it down into a series of steps.

As kids get older this can be framed as how to write an algorithm. But it’s something that can be started even at Faith’s age (6) and without needing to touch a keyboard. That’s not new – how many developers have had to answer the interview question “describe how to make a cup of tea”?

You don’t need to learn programming language syntax to start getting your head around this, and I would argue it’s a vital skill to develop in life, even if you don’t become a coder.

Technological creativity

We need to do more than teach children how to use the tools that they have today. We need to encourage an ethos from an early age that we don’t have to be passive users of technology.

It’s about teaching kids how to think of and how to approach technology. They don’t have to think of it as a black box that must be used as-is, but as something that they can remix and tweak and modify and change and create. It’s about an attitude of looking at technology as something that they can make do what they want to do, as opposed to use the way someone tells them they should.

This is what I love about running my Code Club. Instead of kids playing a random Flash game they find online, they can make a game themselves, the way they want it to be. If they want it to be faster, slower, bigger, smaller, a different colour, move differently: they are in control. It’s not fixed, they can make it do and behave the way they want it to. And if they realise that they can do that with technology, it’s a real light-bulb moment.

We need kids to have this mindset so they will grow up able to imagine the next wave of innovations. Saying that we don’t need this because we can delegate it to the computers we have today really feels to me to be missing the point. Cognitive computing holds exciting promise and potential but it does not mean “we won’t need to be creative any more, the computers will do that for us, too”.

Coding becoming “outdated”

Leaving aside this bigger picture, is coding itself a useful skill to learn? Is coding going to become outdated?

I don’t think so.

Part of this argument seemed to be “what is the point of teaching kids <insert-name-of-programming-language-here> because by the time they grow up it will be obsolete?”

Programming languages stick around longer than people think – there are people still making a living writing C and maintaining COBOL. (We’re normally after good Prolog people, too!)

But more importantly, a lot of what you learn in one language is transferable. Every time I’ve started working in a new programming language, I’ve built on the basic concepts I already know from others. Maybe we’ll teach children a programming language that isn’t the most widely used language when they’re older. But that doesn’t mean learning the underlying ideas will have been a waste of time.

The argument also seemed to be that not just any particular language, but coding in general will become obsolete. I’m not convinced by this.

What we mean by coding may be different in twenty years to what we mean today. In fact it probably will be. Coding will evolve. It always has, and I’m sure it will continue to.

Even just looking at my personal coding history, you can see that evolution. Writing in assembler (where I was moving data in and out of registers) was different to writing in C. And writing in C (where it wasn’t just about what I wanted it to do functionally, but also doing my own memory management) was different to my coding today in Java.

A big difference is in the level of abstraction. They all involved describing to the computer something that I wanted it to do. But the level of abstraction I’m able to use to describe it has changed.

I’m sure this is a trend that will continue. New programming languages will get higher and higher level. Future programming languages will give us ways to describe what we want with higher levels of abstraction. And maybe that will look closer to natural language than what we have today (well-written Java is already closer to being readable by a lay-person than assembler). Maybe it will be something like a Controlled English language that feels more like describing what you want to another person.

But that won’t mean that coding has become obsolete, just that it will have evolved as it always has.

The need for people who can understand a problem, and describe to a computer how to solve it, will remain – whatever language they use and whether that language looks like “code” as we understand it today.


Unpacking binary data from MQTT in Javascript

While doing a trawl of Stack Overflow for questions I might be able to help out with, I came across this interesting looking question:

Receive binary with paho mqttws31.js

The question was how to unpack binary MQTT payloads into double precision floating point numbers in javascript when using the Paho MQTT over WebSockets client.

Normally I would just send floating point numbers as strings and parse them on the receiving end, but sending them as raw binary means much smaller messages, so I thought I’d see if I could help to find a solution.

A little bit of Googling turned up this link to Javascript typed arrays, which looked like it was probably in the right direction. At that point I got called away to look at something else so I stuck a quick answer in with a link and the following code snippet.

function onMessageArrived(message) {
  var payload = message.payloadByte()
  var doubleView = new Float64Array(payload);
  var number = doubleView[0];
  console.log(number);
}

Towards the end of the day I managed to have a look back and there was a comment from the original poster saying that the sample didn’t work. At that point I decided to write a simple little testcase.

First up, a quick little Java app to generate the messages.

import java.nio.ByteBuffer;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class MessageSource {

  public static void main(String[] args) {
    try {
      MqttClient client = new MqttClient("tcp://localhost:1883", "doubleSource");
      client.connect();

      MqttMessage message = new MqttMessage();
      ByteBuffer buffer = ByteBuffer.allocate(8);
      buffer.putDouble(Math.PI);
      System.err.println(buffer.position() + "/" + buffer.limit());
      message.setPayload(buffer.array());
      client.publish("doubles", message);
      try {
        Thread.sleep(1000);
      } catch (InterruptedException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
      }
      client.disconnect();
    } catch (MqttException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
    }
  }
}
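
As an aside (this wasn’t part of the original post), the same 8-byte, big-endian payload could also be produced from Node.js using the mqtt package, which might be handy for testing:

// Sketch: publish Math.PI as 8 raw big-endian bytes, matching what the
// Java ByteBuffer above produces.
var mqtt = require('mqtt');
var client = mqtt.connect('mqtt://localhost:1883');

client.on('connect', function () {
    var buffer = new Buffer(8);       // Buffer.alloc(8) on newer versions of Node
    buffer.writeDoubleBE(Math.PI, 0); // 8 bytes, rather than a ~17-character string
    client.publish('doubles', buffer, function () {
        client.end();
    });
});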

It turns out that using typed arrays is a little more complicated and requires a bit of work to populate the data structures properly. First you need to create an ArrayBuffer of the right size, then wrap it in a Uint8Array in order to populate it, before changing to the Float64Array. After a little bit of playing around I got to this:

function onMessageArrived(message) {
  var payload = message.payloadBytes
  var length = payload.length;
  var buffer = new ArrayBuffer(length);
  uint = new Uint8Array(buffer);
  for (var i=0; i<length; i++) {
	  uint[i] = payload[i];
  }
  var doubleView = new Float64Array(uint.buffer);
  var number = doubleView[0];
  console.log("onMessageArrived:"+number);
};

But this was returning 3.207375630676366e-192 instead of Pi. A little more head scratching and the idea of checking the byte order kicked in:

function onMessageArrived(message) {
  var payload = message.payloadBytes
  var length = payload.length;
  var buffer = new ArrayBuffer(length);
  uint = new Uint8Array(buffer);
  for (var i=0; i<length; i++) {
	  uint[(length-1)-i] = payload[i];
  }
  var doubleView = new Float64Array(uint.buffer);
  var number = doubleView[0];
  console.log("onMessageArrived:"+number);
};

This now gave an answer of 3.141592653589793 which looked a lot better. I still think there may be a cleaner way to do this using a DataView object, but that’s enough for a Friday night.

EDIT:

Got up this morning having slept on it and came up with this:

function onMessageArrived(message) {
  var payload = message.payloadBytes;
  var length = payload.length;

  // copy the payload bytes into an ArrayBuffer
  var buffer = new ArrayBuffer(length);
  var uint = new Uint8Array(buffer);
  for (var i = 0; i < length; i++) {
    uint[i] = payload[i];
  }

  // read each 8-byte double straight out of the buffer;
  // the false flag means read as big endian
  var dataView = new DataView(uint.buffer);
  for (var i = 0; i < length / 8; i++) {
    console.log(dataView.getFloat64(i * 8, false));
  }
};

This better fits the original question in that it will decode an arbitrary length array of doubles and since we know that Java is big endian, we can set the little endian flag to false to get the right conversion without having to re-order the array as we copy it into the buffer (which I’m pretty sure wouldn’t have worked for more than one value).