Life lessons from chickens

It’s no secret I keep chickens. I find this thoroughly rewarding: not only are they lovely primordial creatures, but with some love and care they will provide a supply of tasty eggs for around the same price as supermarket eggs.

[Image: chickens2.jpg – A bird’s-eye view]

Chickens are very odd creatures, and one cold morning recently I wondered what (if any) life lessons we might be able to learn from them. Given how many millennia chickens have lived alongside humans, it is no surprise that we have many sayings involving them. Some of my favourites:

– “It is better to be the head of a chicken than the rear end of an ox” – Japanese Proverb

– “Business is never so healthy as when, like a chicken, it must do a certain amount of scratching for what it gets” – Henry Ford

– “A hen is only an egg’s way of making another egg.” – Samuel Butler

– “The key to everything is patience. You get the chicken by hatching the egg, not by smashing it.”

– “Don’t count your chickens before they hatch”

– “Don’t put all your eggs in one basket”

There is some hilarious truth in many of these statements; in a divination or I Ching sort of way, perhaps contemplating chickens can help us reflect on our own decisions. After all, why did the chicken cross the road?

[Image: img_20161011_165954 – Still scared]

Proverbs aside, chickens fear change. It takes months for them to get used to eating out of a human’s hand, and they scatter very easily (hence the playground term for scared: “are you chicken?”). In this way they seem to be very similar to people. A quick Google of “people fear change” returns 223 million pages.

A famous study on “framing” by the behavioural economists Daniel Kahneman and Amos Tversky suggests that our loss aversion (the desire to minimise perceived change) almost always alters our choices, even when the options on offer are effectively identical. David McRaney summarised the study nicely:

Imagine the apocalypse is upon you. Some terrible disease was unleashed in an attempt to cure male pattern baldness. The human population has been reduced to 600 people. Everyone is likely to die without help. As one of the last survivors you meet a scientist who believes he has found a cure, but he isn’t sure. He has two versions and can’t bear to choose between them. His scientific estimates are exact, but he leaves the choice up to you. Cure A is guaranteed to save exactly 200 people. Cure B has a 1/3 probability of saving 600, but a 2/3 probability of saving no one. The fate of hairlines and future generations is in your hands. Which do you pick? Ok, mark your answer and let’s reimagine the scenario. Same setup, everyone is going to die without a cure, but this time if you use Cure C it is certain exactly 400 people will die. Cure D has a 1/3 probability of killing no one, but a 2/3 probability of killing 600. Which one?

Most people choose Cure A in the first scenario and Cure D in the second, yet the two scenarios are identical; only the framing differs. The results showed how humans gravitate towards whichever option is framed to minimise perceived loss: the one with the least perceived change. According to Lifehacker, because we’re so opposed to inciting change, logic can go right out the window.
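(Working through the numbers in the quote: Cure A saves exactly 200; Cure B saves 1/3 × 600 = 200 on average; Cure C, with 400 of the 600 dying, also leaves 200 alive; and Cure D leaves 1/3 × 600 = 200 alive on average. The expected outcomes are identical in every case; only the framing changes, yet it flips the popular choice.)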

By being aware of this bias, perhaps we can avoid the perils of “being a chicken”.


Lean Manufacturing: Limiting innovation?

I’ve been running a “lean” business for six months now, and I’ve started to suspect that Lean Manufacturing principles applied to software development could lead to bad business. Let me explain:

Primarily, my concern centres on the lean manufacturing principle of waste reduction. The constant drive to reduce waste makes sense on an industrial production line, but does it make sense in a startup or exploratory environment?

An example

Let’s say there are two possible features you can work on. Feature 1 has a 90% chance of delivering £1 of value, and Feature 2 has a 10% chance of delivering £100 of value.

Let’s say that the features take the same time to develop. From the maths,

Feature 1 “value” = 90% of £1 = £0.90

Feature 2 “value” = 10% of £100 = £10

And yet, even with this understanding, the implicit risk and waste aversion of lean leads us to say: “there’s a 90% chance Feature 2 will be wasteful, whereas Feature 1 is almost sure not to be, therefore Feature 1 is a better idea”.
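To make that tension concrete, here is a minimal sketch (using the made-up figures above) that simulates both features many times: Feature 2 carries roughly ten times the expected value, yet it “wastes” the build effort about 90% of the time.

```python
import random

def simulate(p_success, value, trials=100_000):
    """Estimate expected value and the fraction of builds that deliver nothing."""
    outcomes = [value if random.random() < p_success else 0 for _ in range(trials)]
    wasted = sum(1 for v in outcomes if v == 0) / trials
    return sum(outcomes) / trials, wasted

random.seed(42)
for name, p, value in [("Feature 1", 0.9, 1), ("Feature 2", 0.1, 100)]:
    expected, wasted = simulate(p, value)
    print(f"{name}: expected value ~ £{expected:.2f}, wasted {wasted:.0%} of the time")

# Typical output:
#   Feature 1: expected value ~ £0.90, wasted 10% of the time
#   Feature 2: expected value ~ £10.00, wasted 90% of the time
```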

Good outcomes, but not as good as they could be

The waste reduction aspect of lean manufacturing gives us a local optimisation, much like gradient descent. Imagine a ball on a hill, which will roll downhill to find the bottom. This is ok, and it will find a bottom (of the valley), but maybe not the bottom (of the world, say the Mariana Trench). In that sense it is locally good, but not globally optimal.

The way mathematicians sometimes get round this is by repeatedly restarting the ball in different random places: think a wide variety of lat-longs. Then you record each result and take the best one. That way you are more likely to have found the global optimum.
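As a rough sketch of the idea (the landscape function, step size and number of restarts here are all invented for illustration), random restarts look something like this:

```python
import math
import random

def height(x):
    """An invented bumpy landscape with several valleys of different depths."""
    return (x - 3) ** 2 + 2 * math.sin(5 * x)

def roll_downhill(x, step=0.01, iterations=2000):
    """Crude gradient descent: repeatedly nudge x downhill using a finite-difference slope."""
    for _ in range(iterations):
        slope = (height(x + 1e-6) - height(x - 1e-6)) / 2e-6
        x -= step * slope
    return x

random.seed(0)
starts = [random.uniform(-10, 10) for _ in range(20)]   # the "variety of lat-longs"
finishes = [roll_downhill(x0) for x0 in starts]
best = min(finishes, key=height)                         # keep the deepest valley found
print(f"best x found: {best:.2f}, height there: {height(best):.2f}")
```

A single start would settle in whichever valley happened to be nearest; keeping the best of twenty restarts is what makes finding the deepest one likely.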

So I’m wondering whether this kind of random restarting makes sense in the startup world too. I guess we do see it in things like Google’s acquisitions of startups, Project Loon, etc. Perhaps we/I should be doing more off-the-wall things.

Closing commentary

Perhaps it isn’t so odd that Lean Manufacturing has “reduce waste” as a principle… In a production line environment, reduction of waste is the same as increasing value.

Still, if the optimisation problem is “maximise value”, this leads to different outcomes from “minimise waste”. I would argue we should, in almost every case, be focusing on maximising value instead.

As we’ve seen with teams that follow the rituals of agile rather than its philosophy and mindset, it is worth actually thinking about what we’re doing rather than applying practices without understanding them.

Comments below please, I know this may be a bit controversial…


Climate Change

In the past week, I’ve been to some excellent talks. The first was on Biomarkers at the Manchester Literary and Philosophical Society, and the second was on Misinformation in Climate Change at the Manchester Statistical Society. Both of these followed the IMA’s Early Career Mathematicians conference at Warwick, which had some excellent chat and food for thought, around Big Data and effective teaching in particular.

Whilst I could share my learnings about biomarkers for personalised medicine (which make a lot of sense, and I do believe they will help the world), I will instead focus on climate change. That talk was aimed at a more advanced audience and had some excellent content; thanks, Stephan Lewandowsky!

There are a few key messages I’d like to share.

Climate is different to weather

This is worth being clear on: climate is weather over a relatively long period of time. Weather stations very near to one another can have very different (temperature) readings over time. Rather than looking at absolute values, if you instead look at the changes in temperature (the anomaly relative to each station’s own baseline) you will be able to find correlations. It is these that give us graphs such as:

[Image: climate1 – Note it is variation rather than absolute]
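A tiny illustrative sketch of the idea, with invented readings: two nearby stations whose absolute temperatures sit a few degrees apart still show almost the same pattern once each is expressed as an anomaly against its own baseline.

```python
# Hypothetical monthly mean temperatures (°C) for two nearby stations.
station_a = [8.1, 8.4, 9.0, 11.2, 13.9, 16.5]
station_b = [5.2, 5.6, 6.1, 8.3, 11.0, 13.7]   # a consistently cooler site

def anomalies(readings):
    """Deviation of each reading from that station's own mean (its 'baseline')."""
    baseline = sum(readings) / len(readings)
    return [round(t - baseline, 2) for t in readings]

print(anomalies(station_a))  # [-3.08, -2.78, -2.18, 0.02, 2.72, 5.32]
print(anomalies(station_b))  # a very similar pattern, despite different absolute values
```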

Misinformation

Given any climate time series, it is possible to find local stretches where the absolute temperature trend goes down, particularly if you can pick the time window.
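Here is a quick sketch of that cherry-picking effect, again with invented numbers: a series with a clear upward trend still contains short windows whose straight-line fit slopes downwards, if you are free to choose the window.

```python
import random

random.seed(1)
# An invented "warming" series: an upward trend of 0.02 per step plus noise.
series = [0.02 * t + random.gauss(0, 0.15) for t in range(100)]

def slope(values):
    """Ordinary least-squares slope of the values against their index."""
    n = len(values)
    mean_x, mean_y = (n - 1) / 2, sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    return cov / var

print(f"whole-series trend: {slope(series):+.3f} per step")        # clearly positive
starts = range(len(series) - 10 + 1)
downward = [s for s in starts if slope(series[s:s + 10]) < 0]
print(f"10-step windows with a downward trend: {len(downward)}")   # typically plenty
```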

Interestingly, Stephan’s research has shown that belief in other conspiracy theories, such as the idea that the FBI was responsible for the assassination of Martin Luther King, Jr., is associated with being more likely to endorse climate change denial. Presumably(?) this effect is related to Confirmation Bias. If you’re interested in learning more, take a look at the Debunking Handbook.

Prediction is different to projection

According to Stephan, most climate change models are projections. That is, they use the historical data to project forward what is likely to happen. There are also some climate change models which are predictions, in that they are physics models which take the latest physical inputs and use them to predict future climate. These are often much more complex…

Climate change is hard to forecast

I also hadn’t appreciated how difficult El Niño is to forecast. El Niño is a warming of the eastern tropical Pacific Ocean, with the opposite (cooling) effect called La Niña. Reliable estimates for El Niño are only available around six months ahead, which, given the huge changes that happen as a result, I find astonishing. The immediate consequences are pretty severe:

[Image: El-Nino.jpg – Source: Welthungerhilfe]

As you can see from the above infographic, El Niño massively influences global temperatures. Scientists are trying to work out if there is a link between it and climate change (e.g. in Nature). Given how challenging this one component of the global climate is, it is no wonder that global climate change is extremely difficult to forecast. Understanding this seems key to understanding how the climate is changing.

The future

In any case, our climate matters. In as little as 30 years (2047), we could be experiencing climatically extreme weather. Unfortunately, since CO2 takes a relatively long time to be removed from the atmosphere, even if we stopped emitting CO2 today we would still face these extreme events by 2069. Basically, I think we need new tech.


Open Data

In a previous post, several months ago, we talked about Chaos and the Mandelbrot Set: an innovation brought about by the advent of computers.

In this post, we’ll talk about a present-day innovation that is promising similar levels of disruption: Open Data.

Open Data is data that is, well, open, in the sense that it is accessible and usable by anyone. More precisely, the Open Definition states:

A piece of data is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and/or share-alike

The point of this post is to share some of the cool resources I’ve found, so the reader can take a look for themselves. In a subsequent post, I’ll be sharing some of the insights I’ve found by looking at a small portion of this data. Others are doing lots of cool things too, especially visualisations such as those found on http://www.informationisbeautiful.net/ and https://www.reddit.com/r/dataisbeautiful/.

Sources

One of my go-to sources is data.gov.uk. This includes lots of government-level data of varying quality; by quality, I mean usability and usefulness. For example, a lat-long might be useful for some things, a postcode or address for other things, or an administrative boundary for yet others. This means it can be very hard to “join” datasets together, as something like “location” is stored in many different ways. I often find myself using intermediate tables that map lat-longs to postcodes and so on, which takes time and effort (and lines of code).
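For instance, a minimal sketch of the kind of glue code this produces (the columns and values here are entirely hypothetical): readings keyed by lat-long joined to demographics keyed by postcode, via an intermediate lookup table.

```python
import pandas as pd

# Invented example data: sensor readings keyed by lat-long, demographics keyed by
# postcode, and an intermediate lookup table mapping lat-longs to postcodes.
readings = pd.DataFrame({"lat": [53.48, 53.47], "long": [-2.24, -2.25], "reading": [41.0, 38.5]})
lookup = pd.DataFrame({"lat": [53.48, 53.47], "long": [-2.24, -2.25], "postcode": ["M1 1AA", "M2 3BB"]})
demographics = pd.DataFrame({"postcode": ["M1 1AA", "M2 3BB"], "population": [5200, 4100]})

# Two merges just to answer one question: readings -> postcode -> demographics.
combined = readings.merge(lookup, on=["lat", "long"], how="left").merge(
    demographics, on="postcode", how="left")
print(combined)
```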

Another nice meta-source of datasets is Reddit, especially the datasets subreddit. There is a huge variety of data there, and people happy to chat about it.

For sample datasets, I use the ones that come with R, listed here. The big advantage with these is they are neat and tidy, so they don’t have missing values etc and are nicely formatted. This makes them very easy to work with. These are ideal for trying out new techniques, and are often used in worked examples of methods which can be found online.
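If you work in Python rather than R, the same sample datasets are still easy to reach; for example (assuming statsmodels is installed), get_rdataset pulls them from the Rdatasets collection by name:

```python
import statsmodels.api as sm

# Fetches the classic "iris" table from the Rdatasets collection (needs a network connection).
iris = sm.datasets.get_rdataset("iris").data
print(iris.head())
```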

Similarly useful are the Kaggle datasets, which cover loads of things from US election polls to video game sales. If you are so inclined, they also run competitions which can help structure your exploration.

A particularly awesome resource if you’re into social data is the European Social Survey. This dataset is collected through a sampled survey across Europe and is well established: it has been conducted every two years since 2002, and contains loads of cool stuff, from TV watching habits to whether people voted. It is very wide (i.e. lots of different variables) and reasonably long (around 170,000 respondents), so great fun to play with. They also have a quick analysis tool online, so you can do some exploring without downloading the dataset (it does require signing up by email for a free login).

Why is Open Data disruptive?

Thinking back to the start of the “information age”, the bottleneck was processing. Those with fast computers had the ability to do stuff no one else could do. Technology has since made it possible for many people to get access to substantial processing power very cheaply.

Today the bottleneck is access to data. Google has built its business around mastering the world’s data. Facebook and Twitter are able to exist precisely because they (in some sense) own data. By making data open, we start to be able to do really cool stuff, joining together seemingly different things and empowering anyone interested. Not only this, but in the public sector, open data means citizens can better hold government officials to account: no bad thing. There is a more polished sales pitch on why open data matters at the Open Data Institute (which also does some cool stuff supporting Open Data businesses).

Some dodgy stuff

There are obvious concerns around sharing personal data. DeepMind, essentially a branch of Google at this point, has very suspect access to unanonymised patient data. Google also recently changed its rules, making internet browsing personally identifiable:

We may combine personal information from one service with information, including personal information, from other Google services – for example to make it easier to share things with people you know. Depending on your account settings, your activity on other sites and apps may be associated with your personal information in order to improve Google’s services and the ads delivered by Google.

Source: https://www.google.com/policies/privacy/

We’ve got to watch out and, as ever, be mindful about who and what we allow our data to be shared with. Sure, this usage of data makes life easier… but at what privacy cost?

[Image: allyourdataarebelongtous – PRISM]