Big Data, blockchain, machine learning – explaining the terms with hedgehogs

What do artificial intelligence, blockchain, Big Data, and hedgehogs have in common? Lots of things! We will explain the meaning of these complex terms using these funny animals as examples. Four minutes of reading, and you will be dropping the terms in conversation with your colleagues.
Companies increasingly use artificial intelligence and machine learning in marketing. In upcoming articles we will explain how and why they can be used in your work. To make it easier to understand how these technologies work, we have prepared a glossary with illustrative examples. We explain the terms simply enough for anyone to follow.

Artificial Intelligence

Aka AI.
There is no exact definition, just as in philosophy. It all depends on what exactly is meant by the word “intelligence”. Is it a property of humans only? Or are some animals intelligent too? In general, artificial intelligence is the ability of a system to mimic human mental processes or intelligent behavior, including the ability to make choices.

Imagine you dream of getting a hedgehog. The only thing that stops you is your allergy to these animals. Then friends suggest buying a robotic hedgehog. It should be able to reproduce the behavior of a real hedgehog: snort to express its discontent, curl up into a ball when someone tries to pet it, stomp its paws loudly at night, love its owner, and bite strangers. If your pet can do all that, you can safely say it has artificial intelligence.

Big Data

It seems immediately clear that big data is a lot of data. But it’s not so simple. For starters, how much is a lot? Three, ten, a million, a billion? And of what: megabytes, gigabytes, terabytes? There’s no consensus. Some people think data is big when it cannot be processed on a single computer; others, when the incoming flow of information exceeds 100 GB per day. It is generally accepted that big data means not only the data itself but also the tools, approaches, and methods for processing it.

Let’s say Greenpeace gives you a task: every day, count the white-bellied, long-eared, and African dwarf hedgehogs living in a reserve. The goal is to compare the numbers and see whether the population of any of the species is declining.

Each day, you collect all the hedgehogs and distribute them among three rooms, one per species.

If there are 100 hedgehogs in the reserve, the task seems easy. With 1,000 hedgehogs, it gets harder (don’t forget, they can run away). And if you have the world’s largest hedgehog reserve, you can no longer count the animals by hand. At that point, they become big data. You’ll need a big data tool: a smart automatic hedgehog sorter. It will not only sort and count the animals but also find new patterns, such as seasonal variations in the hedgehog population.
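To give a rough feel for what such a sorter might do, here is a minimal Python sketch that counts hedgehogs chunk by chunk and merges the partial counts, which is the basic idea behind processing data that does not fit on one machine. The `sightings` stream, the `count_in_chunks` helper, the species proportions, and the chunk size are all made up for this example; this is not a real big data tool.

```python
from collections import Counter
from itertools import islice


def sightings(total):
    """Hypothetical stream of sightings: one species name per hedgehog seen."""
    for i in range(total):
        position = i % 10
        if position < 5:
            yield "white-bellied"
        elif position < 8:
            yield "long-eared"
        else:
            yield "African dwarf"


def count_in_chunks(stream, chunk_size=1000):
    """Count species chunk by chunk, so the whole stream is never held in memory."""
    iterator = iter(stream)
    totals = Counter()
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        # Each chunk could be counted on a different machine;
        # in this sketch we simply merge the partial count right away.
        totals.update(Counter(chunk))
    return totals


population = count_in_chunks(sightings(10_000))
print(population.most_common())
```

Real big data tools add the parts this sketch leaves out, such as spreading the chunks over many computers and recovering when one of them fails.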

Blockchain

Blockchain is not just about cryptocurrency! It is a technology for storing information in a chain of blocks. Each block contains an identifier computed from the previous block, so it is tied to the whole chain before it: it is known what is stored in the earlier blocks and who created, moved, or changed the information, and when. All the information is duplicated on different computers, possibly in different countries. This makes it practically impossible to forge.

Imagine you have a best friend, Sonic the Hedgehog. A neighbor comes running in, threatening to call the police and screaming that he has a video from two days ago of Sonic stealing a huge diamond buried in the garden. But that can’t be: you and the hedgehog spent that whole evening watching soap operas and eating pizza.

Suppose all the neighborhood surveillance footage is stored using blockchain technology: one day’s footage from one camera is one block, and each new block contains the new day’s footage and the identifier code of the previous block. The police officer accesses all the cameras that show the neighbor’s garden. He finds the chain for the right camera, compares the identifier of the video in which Sonic steals the diamond with the identifier stored in the chain, sees that they do not match, and realizes that the video is rigged. Now Sonic can go on happily eating pizza, and the neighbor will be punished for slander!
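The check the officer performs can be sketched in a few lines of Python. This is a simplified illustration, not how any particular blockchain is implemented: we assume the identifier is a SHA-256 hash of the previous identifier plus the footage, and the names `block_id`, `build_chain`, and `is_valid`, as well as the footage strings, are invented for the example.

```python
import hashlib

GENESIS_ID = "0" * 64  # assumed placeholder "previous id" for the first block


def block_id(footage: bytes, previous_id: str) -> str:
    """Identifier of a block: a hash of the previous id together with the footage."""
    return hashlib.sha256(previous_id.encode() + footage).hexdigest()


def build_chain(daily_footage):
    """One block per day of footage, each pointing at the block before it."""
    chain = []
    previous_id = GENESIS_ID
    for footage in daily_footage:
        current_id = block_id(footage, previous_id)
        chain.append({"footage": footage, "previous_id": previous_id, "id": current_id})
        previous_id = current_id
    return chain


def is_valid(chain) -> bool:
    """Recompute every identifier and check that each block points at the previous one."""
    previous_id = GENESIS_ID
    for block in chain:
        if block["previous_id"] != previous_id:
            return False
        if block_id(block["footage"], block["previous_id"]) != block["id"]:
            return False
        previous_id = block["id"]
    return True


chain = build_chain([b"Mon: empty garden", b"Tue: empty garden", b"Wed: empty garden"])

# The neighbor swaps in a fake video for Tuesday.
forged = [dict(block) for block in chain]
forged[1]["footage"] = b"Tue: Sonic stealing a diamond"

print(is_valid(chain))   # True
print(is_valid(forged))  # False
```

Even if the forger also recomputed Tuesday’s identifier, Wednesday’s block would still point at the old one, so the check would fail one block later.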

Machine Learning
Aka ML.
These are algorithms that learn on their own or with the help of a teacher (unsupervised or supervised learning). The process looks something like this:

  1. Data is collected.
  2. The data is split roughly 80/20 into training and test sets.
  3. A model suitable for the problem is chosen.
  4. The model is trained.
  5. The results are evaluated, and the model is sent back for refinement if they are not accurate enough.
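The five steps above can be sketched in plain Python. Everything here is an assumption made for illustration: the spine lengths are invented random numbers, the model is a very simple nearest-average classifier chosen because it fits in a few lines, and the 0.9 accuracy bar is arbitrary.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch behaves the same on every run

# 1. Collect data: (spine length in mm, species). The numbers are invented.
data = [(rng.gauss(15, 1), "African dwarf") for _ in range(50)]
data += [(rng.gauss(25, 1), "long-eared") for _ in range(50)]

# 2. Split roughly 80/20 into training and test sets.
rng.shuffle(data)
split = int(len(data) * 0.8)
train, test = data[:split], data[split:]


# 3. Choose a model: here, "pick the species whose average spine length is closest".
def train_model(rows):
    """4. Train: remember the average spine length of each species."""
    lengths_by_species = {}
    for length, species in rows:
        lengths_by_species.setdefault(species, []).append(length)
    return {
        species: sum(lengths) / len(lengths)
        for species, lengths in lengths_by_species.items()
    }


def predict(model, length):
    """Return the species whose average is nearest to the given length."""
    return min(model, key=lambda species: abs(model[species] - length))


model = train_model(train)

# 5. Evaluate on data the model has never seen.
correct = sum(predict(model, length) == species for length, species in test)
accuracy = correct / len(test)
print(f"accuracy: {accuracy:.2f}")
if accuracy < 0.9:
    print("Not accurate enough: back to refinement.")
```

The test set is kept aside on purpose: a model is judged on hedgehogs it has never seen, not on the ones it learned from.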
Machine learning techniques can be used to teach computers to recognize hedgehogs or to draw them. Below we describe two different approaches with concrete examples.

  1. Gradient Boosting
    This is a way of building simple models one after another. Each new model is built to correct the errors of the previous ones.

For example, we come up with an algorithm that determines the species of a hedgehog. We start by looking at size: large, medium, or small. This is our first simple decision tree. Then we add a few more:

  • by the length of the spines;
  • by the base color;
  • by the shape of the ears.

Then we combine all the attributes into one tree and get a first draft, as if we were writing a quiz called “What kind of hedgehog are you?” This quiz will not cover every species, so we have to build another tree on the resulting error. Each new tree reduces the error and determines the hedgehog’s species more accurately.
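Here is a minimal sketch of that idea in Python, under several simplifying assumptions: the measurements are invented, the task is reduced to “long-eared or not” (label 1.0 or 0.0), every tree asks a single question, and the error is measured as the mean squared difference. The helper names `fit_stump`, `predict_stump`, and `boost` are hypothetical; real gradient boosting libraries are far more elaborate.

```python
# Invented measurements: (body length cm, spine length mm, ear length mm).
# Label 1.0 means "long-eared hedgehog", 0.0 means any other species.
rows = [
    (14.0, 15.0, 38.0), (16.0, 16.0, 40.0), (18.0, 17.0, 36.0), (20.0, 20.0, 22.0),
    (22.0, 22.0, 24.0), (15.0, 21.0, 23.0), (25.0, 16.0, 37.0), (24.0, 23.0, 25.0),
]
labels = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]


def average(values):
    return sum(values) / len(values)


def fit_stump(rows, residuals):
    """Find the one-question tree (feature, threshold) that best fits the residuals."""
    best = None
    for feature in range(len(rows[0])):
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for row, r in zip(rows, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[feature] > threshold]
            left_value, right_value = average(left), average(right)
            error = sum((r - left_value) ** 2 for r in left)
            error += sum((r - right_value) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def boost(rows, labels, rounds=5, learning_rate=0.5):
    """Each round fits a new tree to what the previous trees still get wrong."""
    base = average(labels)
    predictions = [base] * len(rows)
    stumps, errors = [], []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(labels, predictions)]
        errors.append(average([r * r for r in residuals]))
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * predict_stump(stump, row)
            for p, row in zip(predictions, rows)
        ]
    errors.append(average([(y - p) ** 2 for y, p in zip(labels, predictions)]))
    return base, stumps, errors


base, stumps, errors = boost(rows, labels)
print([round(e, 4) for e in errors])  # the error before each round and after the last one
```

The printed list shows the error shrinking as trees are added, which is exactly the “each new tree reduces the error” behavior described above.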
  2. Neural networks
    This is a loose analogy to the network of neurons in the human brain. Many small neurons each perform a very simple operation. They are interconnected and together perform complex functions.

Suppose we took many photographs and drawings of hedgehogs, showed them to a computer, and said, “Look, these are all hedgehogs.” It analyzed the pictures, superimposed them on each other, and picked out the features of a hedgehog. The result was an internal representation built from a stack of convolutions. A person who looks at it probably won’t understand why the algorithm sees hedgehogs that way; they will only see a set of pixels. A convolutional neural network trained like this can now be shown a video of a nature reserve, and it will count how many hedgehogs live there.
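To show what a single convolution does, here is a small Python sketch that slides a 2×2 filter over a tiny 4×4 “image”. The image and the filter values are hand-picked assumptions for illustration; in a real convolutional network the filter values are learned during training, and there are many filters stacked in layers.

```python
def convolve(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Keep positive responses and replace negative ones with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# A tiny image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A hand-made filter that responds where brightness increases from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(convolve(image, edge_kernel))
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the filter fires only at the edge
```

The output is already hard to read as a picture, and a real network piles hundreds of such maps on top of each other, which is why a person looking at them sees only pixels.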

Machine Learning Model

Aka ML model.
This is a particular trained algorithm. A model with its own set of features solves only the type of problem for which it was built. It is like a hedgehog that has been trained to catch a certain kind of caterpillar.

Feature

This is a slang term for the attributes a model uses as input. Remember when we were building trees in gradient boosting to determine the species of a hedgehog? Well, the shape of the ears is a feature. So is the length of the spines.
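In code, features are simply the named fields that describe one hedgehog. The sketch below is only an illustration: the `Hedgehog` class, its field names, and the sample values are assumptions based on the attributes mentioned in the gradient boosting example.

```python
from dataclasses import asdict, dataclass


@dataclass
class Hedgehog:
    """One hedgehog described by the attributes a model would look at."""
    size: str
    spine_length_mm: float
    base_color: str
    ear_shape: str


hedgehog = Hedgehog(size="small", spine_length_mm=17.0, base_color="brown", ear_shape="long")

features = asdict(hedgehog)     # feature name -> value for this hedgehog
feature_names = list(features)  # the set of features the model would see
print(feature_names)
```

Each field is one feature, and a model only ever sees the hedgehog through these values.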