Thursday 19 September 2019

Swarm algorithms

I wrote a book about genetic algorithms and machine learning. You can buy it here.




Apart from genetic algorithms and other aspects of machine learning, it includes some swarm algorithms. Where a genetic algorithm mixes up potential solutions, merging some together and periodically mutating some values, a swarm algorithm can be regarded as individual agents collaborating, each representing a solution to a problem. They can work together in various ways, giving rise to a variety of swarm algorithms.

The so-called particle swarm algorithm can be used to find optimal solutions to problems. It's commonly referred to as particle swarm optimisation, or PSO for short. PSO is often claimed to be based on the flocking behaviour of birds. Indeed, if you get the parameters right, you might see something similar to a flock of birds. PSO is similar to colony algorithms, which are also nature inspired and also have agents collaborating to solve a problem.

Suppose you have some particles in a paper bag, say somewhere near the bottom. If they move about at random, some might get out of the bag in the end. If they follow each other, they might escape, but more likely than not they'll hang round together in a gang. Provide a fitness function to encourage them, though, and they can learn, for some definition of learn. Each particle can assess where it is and remember the better places. The whole swarm will have a global best too. To escape a paper bag, we want the particles to go up, so by inspecting the current (x, y) position, the fitness score can be the y-value: the bigger, the better. For real-world problems, there can be many more than two dimensions, and the fitness function will require some thought.

The algorithm is as follows:

    Choose n
    Initialize n particles randomly
    For a while:
        Update best global position
        Move particles
        Update each particle's best position and velocity

The particles' personal bests and the overall global best give the whole swarm a memory, of sorts. Initially, each particle's personal best is simply its starting position. In addition to its current position, each particle has a velocity, initialised with random numbers. Since we're doing this in two dimensions, the velocity has an x component and a y component. To move a particle, update each of these by adding the velocity, v, in that direction to the current position:

    x_{t+1} = x_t + v_{x,t}
    y_{t+1} = y_t + v_{y,t}

Since the velocity starts at random, the particles move in various different directions to begin with. The trick comes in when we update the velocity. There are several ways to do this. The standard way adds, for each particle, a fraction of the distance to its personal best position, a fraction of the distance to the global best position, and a proportion of the current velocity, in effect remembering where it was heading. This gives each particle a momentum along a trajectory, making it veer towards somewhere between its best spot and the global best spot. You'll need to pick the fractions. Using w for the weight, since we're doing a weighted sum, and c1 and c2 for the other proportions, we have:

    v_{x,t+1} = w v_{x,t} + c1 (p_{x,t} - x_t) + c2 (g_{x,t} - x_t)

with a matching update for the y component.

If you draw the particles moving around, you will see them swarm, in this case out of the paper bag.
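
Here's a rough Python sketch of that loop, following the equations above. The bag size, the weight w and the pull factors c1 and c2 are arbitrary numbers picked for illustration, not values from the book, and whether the swarm actually clears the top depends on the choices you make.

    import random

    # Toy particle swarm: fitness is the y-coordinate, so particles are
    # rewarded for moving up and, with luck, out of the "bag".
    W, C1, C2 = 0.7, 1.5, 1.5           # arbitrary weight and pull factors
    BAG_WIDTH, BAG_HEIGHT = 10.0, 10.0  # arbitrary bag size

    def swarm(n=20, steps=100):
        positions = [(random.uniform(0, BAG_WIDTH), random.uniform(0, 1)) for _ in range(n)]
        velocities = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
        personal_bests = list(positions)
        global_best = max(positions, key=lambda p: p[1])

        for _ in range(steps):
            # update best global position
            global_best = max(personal_bests + [global_best], key=lambda p: p[1])
            for i, ((x, y), (vx, vy)) in enumerate(zip(positions, velocities)):
                # move the particle
                x, y = x + vx, y + vy
                # update velocity: momentum plus a pull towards the bests
                px, py = personal_bests[i]
                gx, gy = global_best
                vx = W * vx + C1 * (px - x) + C2 * (gx - x)
                vy = W * vy + C1 * (py - y) + C2 * (gy - y)
                positions[i], velocities[i] = (x, y), (vx, vy)
                # update each particle's best position
                if y > py:
                    personal_bests[i] = (x, y)
        return global_best

    best = swarm()
    print(best, "out of the bag" if best[1] > BAG_HEIGHT else "still inside")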

This is one of many ways to code your way out of a paper bag covered in my book. When a particle solves the problem, here by getting out of the bag, inspecting its x and y values gives a solution. PSO can be used for a variety of numerical problems. It's usually described as a stochastic optimisation algorithm: it does something random (stochastic) to find the best (optimal) solution to a problem. You can read more here.






Tuesday 16 July 2019

Decision trees for feature selection

I asked twitter who is using decision trees and what for. Most were using them, unsurprisingly, to make decisions. It wasn't always clear how the trees themselves were built.

If you are armed with data, where each row has some features and a category - either yes/no or one of many classes - you can build a classifier from the data. There are various ways to decide how to split up the data. Nonetheless, each algorithm follows the same overall process: start with a tree root node holding all the data, and add nodes holding part of the data.

Then

  1. Pick a feature
  2. Split the data set, some to the left branch, some to the other branch (or branches) depending on the value of the feature
  3. If all the data at a node is in the same category (or almost all in the same category) form a leaf node
  4. Continue until each node is a leaf node.
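
As a rough sketch of those steps in Python - a toy splitter on categorical features that just picks features in order, rather than using any statistical measure, and not the implementation from my book:

    from collections import Counter

    def build_tree(rows, features):
        """rows are dicts of feature values plus a 'category' entry."""
        categories = [row["category"] for row in rows]
        top, count = Counter(categories).most_common(1)[0]
        # steps 3 and 4: stop when (almost) every row agrees, or nothing is left to split on
        if count == len(rows) or not features:
            return top
        # step 1: pick a feature - here simply the next one in the list
        feature, rest = features[0], features[1:]
        # step 2: split the rows by the value of that feature
        branches = {}
        for row in rows:
            branches.setdefault(row[feature], []).append(row)
        return {feature: {value: build_tree(subset, rest)
                          for value, subset in branches.items()}}

    # a tiny made-up data set: two features and a yes/no category
    rows = [
        {"colour": "red", "size": "big", "category": "yes"},
        {"colour": "red", "size": "small", "category": "yes"},
        {"colour": "blue", "size": "big", "category": "no"},
        {"colour": "blue", "size": "small", "category": "no"},
    ]
    print(build_tree(rows, ["colour", "size"]))
    # {'colour': {'red': 'yes', 'blue': 'no'}}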



This is a bit like a sorting algorithm: in quick sort, you choose a pivot value and split the data down one branch or the other, until you have single points at nodes. Here we don't choose a pivot value but a feature. The feature can be picked based on statistics, information theory or even at random. At each step, you want to know whether all the items in one category tend to have the same value, or range of values, of a feature. Once you are done you have a tree (or flow chart) you can apply to new data. Each way to split has various pros and cons. You can even build several trees. A random forest builds lots of trees, which then vote on the class of new, unseen data. You could build your own voting system, using a variety of tree induction techniques. This might avoid problems specific to one technique, such as over-fitting.

You can use decision tree induction in a variety of places, even if you don't want a full decision tree or rule set. A rule set is a tree written as a sequence of if statements. For example,

If currency is USD then data goes missing.

If you are moving a data source from one provider to another, and some data goes missing, can you spot what the missing items have in common? You could do a bit of manual investigation, say using pivot tables and data filters in Excel. However, a decision tree might find common features far more quickly than you can. This is a form of feature selection - using an algorithm to find pertinent features. Find a decision tree implementation in a language you know, or write one yourself, and have an experiment.
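
If you don't fancy writing your own, libraries such as scikit-learn will do the induction for you. Here's a rough sketch of the missing-data investigation above; the file name and column names are made up for illustration:

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier, export_text

    # hypothetical file: one row per record, plus a went_missing yes/no column
    df = pd.read_csv("migration_report.csv")
    features = pd.get_dummies(df.drop(columns=["went_missing"]))
    tree = DecisionTreeClassifier(max_depth=3)   # keep the tree small enough to read
    tree.fit(features, df["went_missing"])

    # print the tree as a rule set - nested if statements on the features
    print(export_text(tree, feature_names=list(features.columns)))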

My book, Genetic Algorithms and Machine Learning for Programmers, has a chapter explaining how to build one type of decision tree. Take a look. Lots of machine learning frameworks also have tools to help you build decision trees. Next time you want to know what certain things have in common, try a decision tree and see what you learn. Machine learning is often about humans learning, rather than the machines.








Friday 3 May 2019

Log driven development

Everyone knows attempting to figure out what's happening by resorting to print statements is desperation. Everyone knows you should use TDD, BDD or some other xDD to write code well, don't they? I am desperate. I suggest PDD: print driven development is slow and error prone, but sometimes helpful.

I recently shoe-horned a new feature into a large existing code base. It's hard to make it run a small set of work, and I wanted to check that a config change created my new class via the dependency injection framework. Now, I could put in a break point and wait a while for it to get hit, or not. Fortunately, the code writes logs. By adding a log line to the constructor, I could tell much more quickly that I'd got the config correct. Write the code, leave it running, have a look later. I am tempted to name this Log Driven Development.

On one hand, adding the moral equivalents of prints is a bad thing. Teedy Deigh mused on logging in Overload 126, saying

It is generally arbitrary and unsustainable and few people are ever clear what the exact requirements are, so it spreads like a cat meme.

And yet, if I add informative logging as things happen, I can see how the whole system hangs together. Jonathan Boccara mentioned finding logs when trying to understand an unfamiliar code base, in his talk at the ACCU Conference. Existing logs can help you understand some code. Looking at this from a different direction raises the question: are my logs useful? As a first step, seeing a log entry when I wired in my class felt like a win. It told me what I wanted to know. I must make sure the logs I write out as my class starts doing something useful are themselves useful.

This is a step in the direction of a walking skeleton style approach - get things running end to end, outside-in, then fill in the details. I'm much more comfortable with a TDD approach, but keeping an eye on the whole system and how it hangs together is important. Building up the new feature by writing useful logs means the logs will be useful when it's released. LDD - log driven development. Why not?















Wednesday 17 April 2019

ACCUConf 2019

The ACCU conference happened in Bristol again this year. For the first time ever, I was at a workshop - in fact, I ran one, with Chris Simons. We talked about evolutionary algorithms in practice. We gave a 90 minute talk later in the week, using the same Java framework (JCLEC), with slides here. High level summary: can your computer solve problems by making a few random guesses and iteratively improving, using crossover (merging together previous attempts) and mutation (nudging things up or down, or flipping bits)? Answer: yes. We managed to turn on all the bits in an array, code our way out of a paper bag (by firing virtual cannon balls, based on an old Overload article), and finally, in the 90 minute workshop, we generated code for Fizz Buzz.

The full schedule is here. Having five tracks meant I missed lots, but talks will appear on the YouTube channel over time. I'll just give brief notes on a handful of talks I attended. The opening keynote was "Delivering software that is secure and usable - who’s job is it?" by M Angela Sasse. Angela called out Stack Overflow as being functionally great but giving bad security advice, in contrast to official manuals, where the security advice is great but which are functionally worse. This was based on measuring several developers attempting to use a software product. How can you actually measure security or usability? How are you currently measuring it? Mention was made of hard-to-follow security rules, which people tend to work around. Angela called for a way to reprogram the security experts. How good are they at conflict resolution? Do they have social marketing skills? Twitter devolved into quips about social engineering at that point. My final note says,
Programmers are tribal and seek approval. Try to trust and collaborate instead.

Next I'll mention "10 Techniques to Understand Code You Don’t Know" by Jonathan Boccara. He's written a book, which I've seen several people recommend. The 10 techniques fell into three groups: explore, speed read, and detail.
Exploring covered

  • using and finding the I/O frameworks, 
  • performing local analysis - getting the hang of one or two important functions, 
  • analysing call stacks to join the dots between modules. 

Speed reading covered

  • reading the end first - don't be put off by a long function; find the output or returns and worry about the rest later
  • finding frequent words, both count and span (total occurrences and the lines containing them)
  • filtering on flow control - giving something like a table of contents for a book
  • scanning for the main action - feel free to ignore catch blocks or elses; focus on the happy path

Finally, you sometimes need to start going into detail

  • try scratch refactoring
  • practice writing functions in the code
  • team up - strive for pair understanding

There was a discussion about flame graphs at the end, and he mentioned "How to Read a Book: The Classic Guide to Intelligent Reading" by Adler and Van Doren. This points out you don't need to read a non-fiction book in order. Jump around, follow back links, jump straight in to what you want to learn. Very non-linear.

Next, I'll talk about "The anatomy of an exploit" by Patricia Aas. She started by mentioning the weird machine. You can see most programs as a finite state machine. An exploit jumps out of the finite states into other, unintended states. She looked at CWE-242, a list of potentially dangerous functions. The CWE is the Common Weakness Enumeration, available online, listing things to avoid. Her talk pulled on things that might go wrong with gets or std::cin. Surprisingly, you get more warnings from C than C++. By disabling warnings, one at a time, we ended up with code to get a prompt. Once you have a shell on another machine, you can then do a variety of nefarious things. She covered loads of things, including ASLR (address space layout randomisation), heap grooming and use after free. Security was a definite theme at this conference, and many developers understand far too little about it.

Herb Sutter gave Thursday's keynote on "De-fragmenting C++: Making exceptions more affordable and usable". He called out a divide between teams who can and who cannot use exceptions. Many libraries have a mix of exceptions and return codes. He said "Pick a lane". C++ is supposed to be zero overhead:

  • a feature only costs if it's used
  • it's better than coding it yourself

This is not true for exceptions. He considered the difference between program-recoverable and non-recoverable errors. What can you do about stack exhaustion, for example? Who do you report problems to? Humans or code? Exceptions are automatic (a good thing) and invisible (a bad thing). He sketched out ways we could make exceptions have zero overhead. Watch this space.

Anthony Williams then talked about async, executors and callbacks: "Here’s my number; call me, maybe. Callbacks in a multithreaded world". He called out a few things to be aware of. Does the order of your callbacks matter? Can you deregister them? He encouraged us to capture by value, rather than by reference, unless we have a really good excuse.

At lunchtime, there was a book signing. I sold several copies of my book, "Genetic Algorithms and Machine Learning for Programmers". Three others, Anthony Williams, Ivan Cukic and Jonathan Boccara, were also selling books, but I didn't get a chance to go and talk to them. Thanks to ACCU for the chance to do this. I put mine in paper bags, and even wrote a receipt on one. The chapters in mine show how to code your way out of a paper bag, so it seemed sensible.



I gave a session with Chris Simons about how to teach your computer to code Fizz Buzz. We plan to write this up for ACCU's Overload magazine shortly.

On Friday, Paul Grenyer gave the keynote. He reminisced about people he'd met when he was an ACCU member, and all the things he'd done, some that worked and some that didn't, in Norwich, to grow the tech network. There's now a background discussion on accu-general and accu-members email lists about how to revive some things we used to do, and find new things to do, that will be valuable to the group. I'd love to see the mentored developers reboot.

Next, I went to "Interactive C++ : Meet Jupyter / Cling - The data scientist’s geeky younger sibling" by Neil Horlock. He talked about Code Club, and teaching people. This led nicely into using Cling/Jupyter to have notebooks for C++. Cling is an interactive (JITted) version of clang. I can't do his talk justice here. It was amazing. It managed to cope with templates, and a variety of things that blew my mind. He demonstrated using RISE to make RevealJS slides from a notebook, so I think I was watching a talk in a talk in a talk.

My notes have run out at this point. At the speakers' dinner we met EchoBorg. An actor (an echoborg) voiced the words of a chatbot. People volunteered to be interviewed to become an echoborg themselves. This set off cyberpunk-style SciFi in my head. Again, I won't do it justice, but watching the conversation develop was incredibly interesting. Have a look at their websites:



I went to two talks on Saturday: "Windows Native API" by Roger Orr and
"Best practices when accessing Big Data or any other data!" by Rosemary Francis. I was too tired to make notes by that point, and we left early, since we had a three hour drive home. Roger considered several ways to return 42 from a program and showed several steps that happen before main; something people don't always consider. He touched on security too. I didn't stay to the end of Rosemary's talk, but she was talking about tooling her company has developed to watch programs using big data and trace bottlenecks. In my opinion, many data scientists make mistakes some programmers might avoid. Her first example was opening and closing a file in a loop. I wish machine learners and programmers could talk to each other more and help each other out.

I had a great conference, and the includecpp crew were there. I dipped in and out of the discord chat. It's lovely to see people supporting each other. Simple things, like chats about where to go for dinner.

Echoborg has left dystopian Sci-Fi short stories brewing in my head, and the Jupyter/cling talk left me with lots to explore. Thanks ACCUConf. Hope to be there next year.



Tuesday 12 March 2019

The ACCU's Overload magazine

ACCU is an organisation for programmers. Its original focus was C and C++, but now members use a variety of languages, talk about testing and process and how to keep learning. ACCU holds an annual conference in the UK, attended by people from around the world. There's even a YouTube channel of recorded talks from this.

As a member you get a discount for the conference, can volunteer to do book reviews, can participate in study groups, though these have been quiet lately, and get two magazines: the CVu members' magazine and Overload, which is open to anyone. There are also several local groups if you want to come along and meet us.

I've been a member for several years now. It's been a great networking opportunity and I have learnt so much from other members. I love the magazines, and by starting to write for them myself, I stepped up my game. I began by writing book reviews, then tried some of the Student Code Critiques in CVu. Eventually, I wrote an Overload article, pulling together a discussion on the accu-general mailing list about floating point numbers.

I took on the role of Overload editor in 2012. We welcome articles from non-members as well as members. They are peer reviewed, meaning the author gets feedback, questions and suggestions. A surprisingly high number of writers have gone on to write books, myself included. (I mentioned I wrote a book about genetic algorithms and machine learning, yes?)

If you have an article you'd like to get published, let me know. We do accept existing blog posts, but the review team might well ask for slight improvements. There are some submission instructions here.

We welcome established writers as well as new writers. If you've never written an article, give it a go. You can learn a lot by trying to write something up. For example, as you try to explain something you may find gaps in your knowledge and understanding. Questions and suggestions from the review team will make your article better.

I love the ACCU and am looking forward to this year's conference, in just under a month. If you can't make the conference, find a local group, or consider joining the organisation. Or submit an article for Overload. Get involved.


ACCU Home page

Tuesday 26 February 2019

Code your way out of a paper bag

I attended nor(DEV):con, a tech conference in Norwich, last week. I gave a 45 minute talk which I called "Code your way out of a paper bag". Most of my recent talks involve getting out of (and once, into) a paper bag. I've used this as a vehicle to demonstrate some machine learning, AI and other algorithms.

I interviewed a candidate for a role a while ago, and afterwards the other interviewer commented that the interviewee couldn't code their way out of a paper bag. Jeff Atwood, the co-founder of Stack Overflow, has made similar comments:

We're tired of talking to candidates who can't program their way out of a paper bag.

It's a shorthand way for saying people can't do something simple. First, programming isn't always simple. Second, how do you prove you can code your way out of a paper bag?

Use Genetic Algorithms

You could do something straightforward, like make a line zoom out of a paper bag, provided you have a way to draw things.



Alternatively, you could do something more complicated. I used genetic algorithms. If you fire a cannon ball, choosing the angle and initial velocity upfront, it may or may not end up outside a paper bag - provided you use a tiny, virtual cannon placed at the bottom of the bag. The cannon balls might just fire through the bag, but perhaps you can get them to go over the side if you choose the right numbers. There's a free excerpt from my book here talking this through.

Now, you could guess a few (angle, velocity) pairs and see what happens. I got three people in the audience to do this. You could then swap the angles and velocities of the better tries, and maybe change one or both of the numbers slightly. This covers the essence of how genetic algorithms work. The swap is called crossover, drawing on the idea of chromosomes recombining when living organisms breed. The slight tweak is called mutation, and draws on the idea of random fluctuations in DNA. Darwin's theory of evolution says these random changes give rise to new species, and the fitter ones survive. All you then need is a function to decide how fit a solution is, often called a fitness function. Start with some random pairs, select some fitter pairs, then use crossover and mutation for a while. In the end, the attempts might get better.
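
Here's a toy Python sketch of that loop for (angle, velocity) pairs. The fitness function and all the numbers are simplifications made up for illustration - the talk's version fires virtual cannon balls at the bag rather than just scoring peak height.

    import math
    import random

    BAG_HEIGHT = 10.0   # made-up bag height

    def fitness(angle, velocity):
        # toy score: the peak height the ball reaches - higher is better
        vertical = velocity * math.sin(angle)
        return (vertical * vertical) / (2.0 * 9.81)

    def crossover(a, b):
        # swap: take the angle of one attempt and the velocity of another
        return (a[0], b[1])

    def mutate(pair, amount=0.1):
        # slight tweak: nudge both numbers up or down a little
        angle, velocity = pair
        return (angle + random.uniform(-amount, amount),
                velocity + random.uniform(-amount, amount))

    def evolve(population_size=20, generations=50):
        population = [(random.uniform(0.0, math.pi / 2), random.uniform(0.0, 20.0))
                      for _ in range(population_size)]
        for _ in range(generations):
            # keep the fitter half, then breed and mutate to refill the population
            population.sort(key=lambda p: fitness(*p), reverse=True)
            parents = population[:population_size // 2]
            children = [mutate(crossover(*random.sample(parents, 2)))
                        for _ in range(population_size - len(parents))]
            population = parents + children
        return max(population, key=lambda p: fitness(*p))

    angle, velocity = evolve()
    print(angle, velocity,
          "clears the bag" if fitness(angle, velocity) > BAG_HEIGHT else "stays inside")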

Over-engineered?


You could argue my approach is somewhat over-engineered. You'd be right. However, it's a nice self-contained example to demonstrate genetic algorithms. It was filmed, so will be up online soon. I'm not sure how clear the audience coming out with their attempts will be on film. Hopefully enough for you to get the idea.

How do you code your way out of a paper bag? Tell twitter, or tell me.

Have a look at my book too.



Wednesday 6 February 2019

CppOnSea

CppOnSea 2019

Phil Nash organised a new conference, CppOnSea, this year. I was lucky enough to be accepted to speak, so attended the two conference days, but not the workshops.


There were three tracks, along with a beginners' track run by Tristan Brindle, who organises the C++ London Uni, which is a great resource for people who want to learn C++. I'll do a brief write-up of the talks I attended.


The opening keynote was by Kate Gregory, called "Oh The Humanity!". She made us think about the words we use. For example, Foo and Bar trace back to military usage, hinting at people putting their lives on the line. Perhaps we need better names for our variables and functions. We are not fighting a war. What about one letter variable names? 'k. Nuff said. What about errorMessage? If you call it helpMessage instead, how does that affect your thinking?
Kate also talked about trying to keep the code base friendly, to increase confidence:



I went to see Kevlin Henney next. I have no idea how to summarise this. He covered so many things: what does structured programming really mean? By looking back at various uses and abuses of goto, and by highlighting the structure in various code snippets, he emphasised that some styles make the structure and intent easier to see.

I saw Andreas Fertig next. Inspired by Matt Godbolt's Compiler Explorer, he has created https://cppinsights.io/. Try it out. It unwraps some of the syntactic sugar, so you can see what the compiler has created for, say, a range-based for loop. This can remind you where you might be creating temporaries, or have references instead of copies or vice versa, without dropping down into assembly. Do you know the full horror of what might be going on inside a Singleton?

This led to an aside about statics and the double-checked locking pattern. His headline point was that the spirit of C++ is "you pay only for what you use", so be clear about what you are using. The point isn't that the new language features are expensive. They are often cheaper than old skool ways of doing things. Just try to be clear about what's going on under the hood. Play with the insights tool.

Next up, I saw Barney Dellar, talking about strong types in C++. His slides are probably clear enough by themselves, since they have thorough speaker notes. My main note to myself says "MIB: mishap investigation board", which amused me. He was talking about the trouble that can happen if you use doubles, or similar, for all your quantities, like mass and force:

double CalculateForce(double mass);

It's really easy to use the wrong units or pass things in the wrong order. By creating different types, known as strong types, you can get the compiler to stop you making mistakes. Using a template with tags, you can write clear code avoiding these mistakes.

Next up was a plenary talk by Patricia Aas on Deconstructing Privilege. She's given the talk before, so you'll be able to find the slides or previous versions on YouTube. Her take is that privilege is about things that haven't happened to you. Many people get defensive if you say they have been privileged, but this way of framing the issue gives a great perspective. Loads of people turned up and listened. Maybe surprising for a serious geek C++ conference, but the presence of https://www.includecpp.org/ ensured there were many like-minded people around. If you are privileged, listen and try to help. And be careful asking intrusive questions if you meet someone different to you.

After quite a heavy but great talk, I was "in charge" of the lightning talks. Eleven people got slots. More volunteered, but there wasn't time for everyone:

Simon Brand; C++ Catastrophes: A Poem
Odin Holmes; volatile none of the things
Paul Williams; std::pmr
Heiko Frederik Bloch; the finer points of parameter packs
Barney Dellar; imposter syndrome or mob programming
Matt Godbolt; "words of power"
Kevlin Henney; list
Niels Dekker; noexcept considered harmful???
Patricia Aas; C++ is like JavaScript
Louise Brown; The Research Software Engineer - A Growing Career Path in Academia
Denis Yaroshevskiy; A good use case for if constexpr


My heartfelt thanks to Jim from http://digital-medium.co.uk and Kevlin "obi wan kenobi" Henney for helping me switch between PowerPoint, PowerPoint in presenter mode and the PDFs, and getting them to show on the main screen and my laptop. Nobody knows what was happening with the screen on the stage for the speaker. If you ever attend a conference, do volunteer to give a lightning talk. Sorry to the people we didn't have time for.

Day one done. Day two begun. First up for me, after missing my own talk pitch, was Nico Josuttis. Don't forget his Leanpub C++17 book. It's still growing. He talked about a variety of C++17 features. The standout point for me was the mess you can get into with initialisation. He's using {} everywhere, near enough. Like

for (int i{0}; i<n; ++i)
{
}

Adding an equals can end up doing horrible things.

Much as I wanted to go and see Simon Brand, Vittorio Romeo and Hana Dusikova (with slide progression by Matt Godbolt) next, I had a talk to give myself. I managed to diffuse my way out of a paper bag, while reminding us why C's rand is terrible and how useful property-based testing can be, using some very simple mathematics: adding up and multiplying. This was based on a chapter of my book, and you can download the source code from that page if you want, even if you don't buy the book. I used SFML to draw the diffusing green blobs. Sorry for not putting up a list of resources near the end.

I attempted to go to Guy Davidson's linear algebra talk next, but the room was packed and I was a bit late. I heard great things about this; in particular, how important it is to design a good interface if you are making libraries.

My final choice was Clare Macrae's "Quickly testing legacy code". This was my unexpected hidden gem of the conference. She talked about approval testing. This compares a generated file to a gold standard file and bolts straight into googletest or Catch. It's available for several other languages. It generates the file on your first run, allowing you to get almost anything under test, provided it writes out a file you can compare. Which then means you can start writing unit tests, if you need to change the code a bit. Changing legacy code to get it under test without a safety harness is dangerous. This keeps you safer. Her world involves Qt and visualisations of chemical molecules. These can be saved as PNGs, so she can check she hasn't broken anything. She showed how you can bolt in custom comparators, so it doesn't complain about different generated dates and tolerates RGB differences too small for the human eye to notice. Her code samples from the talk are here. I've not seen this as a formal framework before. Her slides were really clear and she explained what she was up to step by step. Subsequently twitter has been talking about this a fair bit, including adding support for Python 3.

Matt "Compiler Explorer" Godbolt gave the closing keynote. Apparently, the first person Phil Nash has met who;s first conference talk was a keynote. He also had some AV issues:


If you've not encountered the Compiler Explorer before, try it out. You can choose which compiler you want to point your C++ code at and see what it generates. You need a little knowledge of the "poetry" it generates: more lines doesn't mean slower code. His tl;dr message was that many people spread rumours about what's slow, for example virtual functions. Look and see what your compiler actually does, rather than stating things that were true years ago. Speculative de-virtualization is a thing: your compiler might decide there's only one virtual function you're really likely to call, so it checks the address and avoids the "overhead" it used to have years ago. He also demonstrated what happened to various bit counting algorithms - most got immediately squashed down to one instruction, no matter how clever they looked. How many times have you been asked to count bits at interview? Spin up Godbolt and explore. This really shows you need to keep your knowledge up to date. Something that was true ten years ago may no longer hold with new compiler versions. Measure, explore, think.

There was a lovely supportive atmosphere and a variety of speakers. People were brave enough to ask questions, and only a few people were showing off that they thought they knew something the speaker didn't.

I'll try to back fill links to slides as I get them. Thanks to Phil for arranging this.

Did I mention I wrote a book?


Saturday 26 January 2019

xkcd-style plots in MatPlotLib

Most programmers I know are familiar with xkcd, the webcomic of romance, sarcasm, math, and language. In order to create diagrams for my machine learning book, I wanted a way to create something I could have fun with.

I discovered that Python's MatPlotLib library has an xkcd style, which you simply wrap round a plot. This allowed me to piece together what I needed using line segments, shapes, and labels.

Given a function, f, which draws what you need on some axes ax, use the style like this, and you're done:

import matplotlib.pyplot as plt

with plt.xkcd():
    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    f(ax)
    plt.show()

For example, to explain what happens when you fire cannons at different angles:
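
A possible f(ax) for that kind of picture - with made-up velocity and angles rather than the book's actual figure - might look like this:

    import numpy as np

    def f(ax):
        # sketch a few trajectories for different launch angles
        velocity, g = 10.0, 9.81
        for degrees in (30, 45, 60, 80):
            angle = np.radians(degrees)
            flight_time = 2.0 * velocity * np.sin(angle) / g
            t = np.linspace(0.0, flight_time, 50)
            x = velocity * np.cos(angle) * t
            y = velocity * np.sin(angle) * t - 0.5 * g * t * t
            ax.plot(x, y, label=f"{degrees} degrees")
        ax.set_xlabel("distance")
        ax.set_ylabel("height")
        ax.legend()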


I gave a talk at Skillsmatter, called "Visualisation FTW", which was recorded, so you can watch it if you want. I also wrote this up for ACCU's CVu magazine. The ACCU runs an annual best article survey, and I was runner up, which was a pleasant surprise. You need to be a member to view the article, but it covers the same ground as the short talk at Skillsmatter.

Buy a copy of my book, or go play with xkcd style pictures. Have fun; I did.